[ https://issues.apache.org/jira/browse/HUDI-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinoth Chandar updated HUDI-6909:
---------------------------------
Sprint: Sprint 2024-03-25, Sprint 2023-04-26 (was: Sprint 2024-03-25)
> Handle `_hoodie_operation` field in the new HoodieFileGroupReader
> -----------------------------------------------------------------
>
> Key: HUDI-6909
> URL: https://issues.apache.org/jira/browse/HUDI-6909
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Ethan Guo
> Assignee: Vinoth Chandar
> Priority: Blocker
> Fix For: 1.0.0
>
>
> Goal: The meta field `_hoodie_operation` should be handled properly in
> HoodieFileGroupReader for CDC use cases:
> # For snapshot queries, the record needs to be ignored.
> # For streaming queries, the record should be emitted so that downstream
> operators can perform retractions.
> The Spark integration with NewHoodieParquetFileFormat and HoodieFileGroupReader
> should handle `_hoodie_operation` correctly.
>
> Currently, only the Flink reader fully supports `_hoodie_operation`. Once
> HoodieFileGroupReader supports `_hoodie_operation`, all engines can
> handle the field properly.
>
> See the following for more context:
> [https://github.com/apache/hudi/pull/8721]
> [https://github.com/apache/hudi/pull/9624#discussion_r1340788640]
>
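The two behaviours listed in the goal above could be sketched roughly as follows. This is only an illustration, not the HoodieFileGroupReader implementation: `Op`, `Rec`, `QueryType`, and `read` are hypothetical names. The sketch also assumes that the records to ignore in a snapshot query are those whose `_hoodie_operation` marks a retraction (an update-before image or a delete), which the ticket does not spell out.

```java
import java.util.ArrayList;
import java.util.List;

public class OperationFieldSketch {

    // Hypothetical stand-in for the change type carried in `_hoodie_operation`.
    enum Op { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    // Hypothetical query modes matching the two cases in the ticket.
    enum QueryType { SNAPSHOT, STREAMING }

    // Hypothetical record: a key plus the value of its operation meta field.
    static final class Rec {
        final String key;
        final Op op;

        Rec(String key, Op op) {
            this.key = key;
            this.op = op;
        }
    }

    // Assumption: update-before images and deletes are the retraction records.
    static boolean isRetraction(Op op) {
        return op == Op.UPDATE_BEFORE || op == Op.DELETE;
    }

    // Snapshot queries skip retraction records; streaming queries emit every
    // record so that downstream operators can apply the retractions themselves.
    static List<Rec> read(List<Rec> records, QueryType type) {
        List<Rec> out = new ArrayList<>();
        for (Rec r : records) {
            if (type == QueryType.SNAPSHOT && isRetraction(r.op)) {
                continue;
            }
            out.add(r);
        }
        return out;
    }
}
```

Putting the decision in one shared read path rather than in each engine's reader mirrors the motivation given in the ticket, where supporting the field in HoodieFileGroupReader lets all engines handle it.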
--
This message was sent by Atlassian Jira
(v8.20.10#820010)