Lokesh Lingarajan created HUDI-6724:
---------------------------------------

             Summary: Initializing prevInstance to 
HoodieTimeline.INIT_INSTANT_TS to avoid partial reading of first commit
                 Key: HUDI-6724
                 URL: https://issues.apache.org/jira/browse/HUDI-6724
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Lokesh Lingarajan


Since object-based incremental jobs now batch within a commit, the existing 
code can leave prevInstance equal to startInstance for batches within the 
first commit. 

In this scenario, when we run an incremental query for rows > prevInstance, 
we skip the first commit, because startInstance points to that same commit.

This is caused by defaulting prevInstance to startInstance in the 
generateQueryInfo API. 

The fix is to default prevInstance to HoodieTimeline.INIT_INSTANT_TS so that 
batching can continue on the first commit.
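
The effect of the two defaults can be illustrated with a minimal, standalone 
sketch. It is not the Hudi implementation: the Row class, the 
incrementalRows helper, and the sample commit time are hypothetical, and the 
local INIT_INSTANT_TS constant is only a stand-in for 
HoodieTimeline.INIT_INSTANT_TS, assumed to sort before every real commit time.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class PrevInstantSketch {
    // Stand-in for HoodieTimeline.INIT_INSTANT_TS (assumption: it sorts
    // lexicographically before any real commit timestamp).
    static final String INIT_INSTANT_TS = "00000000000000";

    // Hypothetical row: the commit that wrote it plus an object key.
    static final class Row {
        final String commitTime;
        final String key;

        Row(String commitTime, String key) {
            this.commitTime = commitTime;
            this.key = key;
        }
    }

    // Models the "rows > prevInstance" filter described above: keep rows whose
    // commit time is strictly after prevInstant and not after endInstant.
    static List<Row> incrementalRows(List<Row> rows, String prevInstant, String endInstant) {
        return rows.stream()
            .filter(r -> r.commitTime.compareTo(prevInstant) > 0)
            .filter(r -> r.commitTime.compareTo(endInstant) <= 0)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String firstCommit = "20230819101500"; // made-up commit time
        List<Row> rows = Arrays.asList(
            new Row(firstCommit, "a"),
            new Row(firstCommit, "b"),
            new Row(firstCommit, "c"));

        // Old default: prevInstant == startInstant == first commit, so the
        // strict ">" comparison drops every row of the first commit.
        System.out.println(incrementalRows(rows, firstCommit, firstCommit).size());

        // Proposed default: prevInstant starts before any real commit, so the
        // rows of the first commit remain visible to later batches.
        System.out.println(incrementalRows(rows, INIT_INSTANT_TS, firstCommit).size());
    }
}
```

With the old default the filter returns no rows for the first commit; with 
the proposed default all three sample rows are returned.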



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
