[
https://issues.apache.org/jira/browse/HUDI-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HUDI-8163:
---------------------------------
Labels: pull-request-available (was: )
> Add an iterator api for HoodieUnMergedLogRecordScanner
> ------------------------------------------------------
>
> Key: HUDI-8163
> URL: https://issues.apache.org/jira/browse/HUDI-8163
> Project: Apache Hudi
> Issue Type: Improvement
> Components: reader-core
> Reporter: Danny Chen
> Assignee: Danny Chen
> Priority: Critical
> Labels: pull-request-available
> Fix For: 1.1.0
>
>
> HoodieUnMergedLogRecordScanner extends
> AbstractHoodieLogRecordReader, which is used for merging multiple log
> files within one file slice. HoodieUnMergedLogRecordScanner, however, does
> no merging: all it needs to do is resolve each hoodie record and return
> it.
> The current implementation is somewhat hacky: it allows
> HoodieUnMergedLogRecordScanner to pass around a
> LogRecordScannerCallback, and each resolved record is applied to this callback.
> The only callback implementation, on the Flink side, just puts each record into
> another queue and then pops the records from that queue again.
> Let's refactor this:
> 1. add an iterator API for HoodieUnMergedLogRecordScanner; this may also
> require some refactoring of AbstractHoodieLogRecordReader to make the
> code more decoupled;
> 2. on the Flink side, use this new iterator to consume the records instead.
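
To illustrate the two styles discussed above, here is a minimal Java sketch. It is not the Hudi code: the class UnmergedScanSketch, the interface RecordCallback, and the methods scanWithCallback and recordIterator are hypothetical stand-ins, records are plain strings, and log blocks are modeled as in-memory lists. It only shows why a pull-based iterator removes the need for the intermediate queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Queue;

public class UnmergedScanSketch {

  /** Hypothetical stand-in for a per-record callback (not the Hudi interface). */
  interface RecordCallback {
    void apply(String record);
  }

  /** Callback style: the scanner pushes every resolved record into the callback. */
  static void scanWithCallback(List<List<String>> logBlocks, RecordCallback callback) {
    for (List<String> block : logBlocks) {
      for (String record : block) {
        callback.apply(record);
      }
    }
  }

  /** Iterator style: the caller pulls records lazily, one block at a time. */
  static Iterator<String> recordIterator(List<List<String>> logBlocks) {
    return new Iterator<String>() {
      private final Iterator<List<String>> blocks = logBlocks.iterator();
      private Iterator<String> current = Collections.emptyIterator();

      @Override
      public boolean hasNext() {
        // Advance past exhausted or empty blocks until a record is available.
        while (!current.hasNext() && blocks.hasNext()) {
          current = blocks.next().iterator();
        }
        return current.hasNext();
      }

      @Override
      public String next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        return current.next();
      }
    };
  }

  public static void main(String[] args) {
    List<List<String>> logBlocks = Arrays.asList(
        Arrays.asList("r1", "r2"),
        Collections.<String>emptyList(),
        Arrays.asList("r3"));

    // Callback style: records are buffered in a queue, then popped again.
    Queue<String> queue = new ArrayDeque<>();
    scanWithCallback(logBlocks, queue::add);
    List<String> viaCallback = new ArrayList<>();
    while (!queue.isEmpty()) {
      viaCallback.add(queue.poll());
    }

    // Iterator style: records are consumed directly, with no intermediate queue.
    List<String> viaIterator = new ArrayList<>();
    Iterator<String> it = recordIterator(logBlocks);
    while (it.hasNext()) {
      viaIterator.add(it.next());
    }

    // Both styles yield the same records in the same order.
    System.out.println(viaCallback.equals(viaIterator));
  }
}
```

In this sketch the iterator variant holds at most one block's iterator at a time, whereas the callback variant buffers every record before the consumer sees any of them. Whether the real scanner can be made lazy in the same way depends on how AbstractHoodieLogRecordReader is refactored.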
--
This message was sent by Atlassian Jira
(v8.20.10#820010)