Ethan Guo created HUDI-3796:
-------------------------------
Summary: Implement layout to filter out uncommitted log files
without reading the log blocks
Key: HUDI-3796
URL: https://issues.apache.org/jira/browse/HUDI-3796
Project: Apache Hudi
Issue Type: Improvement
Components: writer-core
Reporter: Ethan Guo
Fix For: 0.12.0
Related: HUDI-3637
At a high level, getLatestFileSlices() fetches the latest file slices for
committed base files and filters out any file slice whose base instant time is
uncommitted. The latest file slices may still include uncommitted log files;
these are skipped only later, during log reading and merging, by the logic in
"AbstractHoodieLogRecordReader".
We can use the log instant time instead of the base instant time in the log
file name, so that uncommitted log files can be filtered out from the file name
alone, without reading the log blocks beforehand.
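
A minimal sketch of the idea, not the actual Hudi implementation. It assumes
a simplified log file name of the form
".<fileId>_<instantTime>.log.<version>_<writeToken>", where <instantTime> is
the log instant time as proposed above, and it assumes the set of committed
instants is already available from the timeline. The class LogFileFilter and
its methods are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: filters log files by the instant time in their names.
final class LogFileFilter {

    // Assumed name layout: .<fileId>_<instantTime>.log.<version>[_<writeToken>]
    private static final Pattern LOG_FILE =
        Pattern.compile("^\\.(.+)_(\\d+)\\.log\\.(\\d+)(?:_(.+))?$");

    // Extracts the instant time from a log file name, if the name matches.
    public static Optional<String> instantTimeOf(String fileName) {
        Matcher m = LOG_FILE.matcher(fileName);
        return m.matches() ? Optional.of(m.group(2)) : Optional.empty();
    }

    // Keeps only log files whose instant time is in the committed set.
    // No log block is opened or read; only the file name is inspected.
    public static List<String> filterCommitted(List<String> logFileNames,
                                               Set<String> committedInstants) {
        return logFileNames.stream()
            .filter(name -> instantTimeOf(name)
                .map(committedInstants::contains)
                .orElse(false))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
            ".f1_20220405101500.log.1_0-1-0",   // written by a committed instant
            ".f1_20220405103000.log.1_0-2-0");  // written by an uncommitted instant
        Set<String> committed = Collections.singleton("20220405101500");
        System.out.println(filterCommitted(names, committed));
    }
}
```

With the base instant time in the name, both files above would carry the same
instant and could not be told apart without reading their blocks; with the log
instant time, a set lookup per file name is enough.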