[
https://issues.apache.org/jira/browse/HUDI-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17883985#comment-17883985
]
Lin Liu commented on HUDI-6909:
-------------------------------
In our previous discussion, we decided that the CDC work will not be done at
this moment. We can do that, but first we need to support CDC with the file
group reader, right?
> Handle `_hoodie_operation` field in the new HoodieFileGroupReader
> -----------------------------------------------------------------
>
> Key: HUDI-6909
> URL: https://issues.apache.org/jira/browse/HUDI-6909
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Ethan Guo (old account)
> Assignee: Lin Liu
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 1.0.0
>
>
> Goal: The meta field `_hoodie_operation` should be properly handled in the
> HoodieFileGroupReader for CDC use cases:
> # the record needs to be ignored for snapshot queries.
> # for streaming queries, we should emit it so that downstream operators
> can do retractions.
> Spark integration with NewHoodieParquetFileFormat and HoodieFileGroupReader
> should handle `_hoodie_operation` correctly.
>
> Currently, only the Flink reader fully supports `_hoodie_operation`. By
> supporting `_hoodie_operation` in HoodieFileGroupReader, all engines can
> handle `_hoodie_operation` properly.
>
> See the following for more context:
> [https://github.com/apache/hudi/pull/8721]
> [https://github.com/apache/hudi/pull/9624#discussion_r1340788640]
>
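
For illustration only, the two behaviours listed in the quoted goal could be
sketched as below. This is not the HoodieFileGroupReader code: the class
`OperationFilterSketch`, the `Op` enum, the `Row` type and the `read` method
are all hypothetical names, and the choice of which operations count as
retractions (update-before and delete) is an assumption made for the sketch.

```java
import java.util.List;
import java.util.stream.Collectors;

public class OperationFilterSketch {
    // Hypothetical stand-in for the values carried by `_hoodie_operation`.
    enum Op { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    // Hypothetical row shape: a record key, a payload, and the operation flag.
    static final class Row {
        final String key;
        final String value;
        final Op op;

        Row(String key, String value, Op op) {
            this.key = key;
            this.value = value;
            this.op = op;
        }
    }

    // Assumption: update-before and delete rows are the retraction rows.
    static boolean isRetraction(Op op) {
        return op == Op.UPDATE_BEFORE || op == Op.DELETE;
    }

    // Snapshot reads drop retraction rows; streaming reads pass every row
    // through so that downstream operators can apply the retractions.
    static List<Row> read(List<Row> rows, boolean streaming) {
        if (streaming) {
            return rows;
        }
        return rows.stream()
                .filter(r -> !isRetraction(r.op))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
                new Row("k1", "a", Op.INSERT),
                new Row("k2", "b-old", Op.UPDATE_BEFORE),
                new Row("k2", "b-new", Op.UPDATE_AFTER),
                new Row("k3", "c", Op.DELETE));
        System.out.println("snapshot rows: " + read(rows, false).size());
        System.out.println("streaming rows: " + read(rows, true).size());
    }
}
```

Keeping the decision in a single `read` method mirrors the idea in the issue
that one shared reader, rather than each engine, decides how the operation
field is treated.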