Ethan Guo created HUDI-3796:
-------------------------------

             Summary: Implement layout to filter out uncommitted log files without reading the log blocks
                 Key: HUDI-3796
                 URL: https://issues.apache.org/jira/browse/HUDI-3796
             Project: Apache Hudi
          Issue Type: Improvement
          Components: writer-core
            Reporter: Ethan Guo
             Fix For: 0.12.0


Related: HUDI-3637

At a high level, getLatestFileSlices() fetches the latest file slices for
committed base files and filters out any file slice whose base instant time is
uncommitted.  Uncommitted log files may still be included in the latest file
slices; they are skipped only later, during log reading and merging, i.e., by
the logic in "AbstractHoodieLogRecordReader".

We can use the log instant time instead of the base instant time in the log
file name, so that uncommitted log files can be filtered out from the file
name alone, without reading the log blocks beforehand.
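
A minimal sketch of the idea, not the actual Hudi implementation.  It assumes
a hypothetical log file name layout ".<fileId>_<logInstantTime>.log.<version>_<writeToken>"
and a set of completed instant times taken from the timeline; the class and
method names below are made up for illustration only.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class LogFileFilterSketch {

    // Assumed (hypothetical) layout: .<fileId>_<logInstantTime>.log.<version>_<writeToken>
    private static final Pattern LOG_FILE =
        Pattern.compile("^\\.(.+)_(\\d+)\\.log\\.(\\d+)_(.+)$");

    // Extracts the instant time embedded in the file name, if the name matches the layout.
    static Optional<String> instantTimeOf(String logFileName) {
        Matcher m = LOG_FILE.matcher(logFileName);
        return m.matches() ? Optional.of(m.group(2)) : Optional.empty();
    }

    // Keeps only log files whose embedded instant time is a completed instant.
    // Names that do not match the layout are dropped. No log block is opened or read.
    static List<String> filterCommitted(List<String> logFileNames, Set<String> completedInstants) {
        return logFileNames.stream()
            .filter(name -> instantTimeOf(name).map(completedInstants::contains).orElse(false))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> logFiles = List.of(
            ".f1-0_20220401120000.log.1_0-1-0",   // written by a completed instant
            ".f1-0_20220401130000.log.2_0-2-0");  // written by an instant that is still inflight
        Set<String> completed = Set.of("20220401120000");
        System.out.println(filterCommitted(logFiles, completed));
    }
}
```

With the base instant time in the name, both files above would carry the same
instant and could only be told apart by reading their blocks; with the log
instant time, a lookup against the completed instants is enough.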



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
