[
https://issues.apache.org/jira/browse/HUDI-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ethan Guo updated HUDI-7585:
----------------------------
Description: The table schema resolver needs to read the schema from the data
files (base or log files) to check whether the _hoodie_operation field is
present for Flink CDC use cases. This can incur the overhead of reading data
file footers multiple times. We should see if we can store or simplify the Flink CDC format
in Hudi 1.0 (thus no need of ). (was: The table schema resolver needs to read
schema from the data files (base or log files) to see whether )
> Avoid reading log files for resolving schema for _hoodie_operation field
> ------------------------------------------------------------------------
>
> Key: HUDI-7585
> URL: https://issues.apache.org/jira/browse/HUDI-7585
> Project: Apache Hudi
> Issue Type: Improvement
> Reporter: Ethan Guo
> Assignee: Jing Zhang
> Priority: Major
> Fix For: 1.0.0
>
>
> The table schema resolver needs to read the schema from the data files (base
> or log files) to check whether the _hoodie_operation field is present for
> Flink CDC use cases. This can incur the overhead of reading data file footers
> multiple times. We should see if we can store or simplify the Flink CDC format in
> Hudi 1.0 (thus no need of ).
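To illustrate the overhead described above, here is a minimal sketch of one possible mitigation: memoizing the per-file check so that each footer is read at most once. This is not Hudi's implementation. The class name OperationFieldCache, the method hasOperationField, and the footerReader function (standing in for whatever actually reads a file's schema) are all hypothetical names introduced for this example.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper, not part of Hudi: remembers, per data file, whether
// the schema read from that file contains the _hoodie_operation field.
final class OperationFieldCache {
    static final String OPERATION_FIELD = "_hoodie_operation";

    // Assumed stand-in for the expensive footer read: maps a file path to
    // the set of field names found in that file's schema.
    private final Function<String, Set<String>> footerReader;

    // File path -> result of the presence check.
    private final Map<String, Boolean> cache = new ConcurrentHashMap<>();

    OperationFieldCache(Function<String, Set<String>> footerReader) {
        this.footerReader = footerReader;
    }

    // Returns true if the file's schema has the operation field. The footer
    // reader is invoked only on the first call for a given path; later calls
    // for the same path are answered from the cache.
    boolean hasOperationField(String filePath) {
        return cache.computeIfAbsent(
            filePath,
            path -> footerReader.apply(path).contains(OPERATION_FIELD));
    }
}
```

A per-file cache only removes repeated reads of the same footer. Recording the flag once in table-level metadata, which is closer to what the issue suggests exploring, would avoid touching data files at all.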
--
This message was sent by Atlassian Jira
(v8.20.10#820010)