SteNicholas commented on PR #8884:
URL: https://github.com/apache/hudi/pull/8884#issuecomment-1576388603

   @zhuanshenbsj1, IMO, streaming read must skip clustering instants; otherwise 
there are many cases that produce duplicates. For example, suppose the timeline 
is commit1, replacecommit1.requested, commit2, and a job starts reading from 
commit1. The job then fails and restarts from a checkpoint, and by that time 
replacecommit1 has completed. After the restart, IncrementalInputSplits reads 
the instants commit1, replacecommit1 and commit2, which differs from the 
instants read before the failure (commit1 and commit2). Because clustering only 
rewrites existing records, reading replacecommit1 emits those records again.
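
   The scenario above can be sketched as a small simulation. This is only an 
illustration, not Hudi code: `Instant`, `instants_to_read` and the 
`skip_clustering` flag are hypothetical names, and the sketch assumes every 
replacecommit is a clustering instant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instant:
    """Hypothetical stand-in for a timeline instant (not the Hudi class)."""
    timestamp: int
    action: str       # "commit" or "replacecommit"
    completed: bool


def instants_to_read(timeline, start_ts, skip_clustering):
    """Return the completed instants at or after start_ts, in timeline order.

    Assumption: every replacecommit is treated as a clustering instant.
    """
    selected = []
    for instant in sorted(timeline, key=lambda i: i.timestamp):
        if not instant.completed or instant.timestamp < start_ts:
            continue
        if skip_clustering and instant.action == "replacecommit":
            continue
        selected.append(f"{instant.action}@{instant.timestamp}")
    return selected


# Timeline seen before the failure: replacecommit1 is still only requested.
before_failure = [
    Instant(1, "commit", True),
    Instant(2, "replacecommit", False),
    Instant(3, "commit", True),
]

# Timeline seen after the restart: replacecommit1 has completed meanwhile.
after_restart = [
    Instant(1, "commit", True),
    Instant(2, "replacecommit", True),
    Instant(3, "commit", True),
]

# Without skipping, the two runs read different instant sets.
print(instants_to_read(before_failure, 1, skip_clustering=False))
print(instants_to_read(after_restart, 1, skip_clustering=False))

# With skipping, both runs read the same instant set.
print(instants_to_read(after_restart, 1, skip_clustering=True))
```

   With `skip_clustering=True` the instant set is the same before and after the 
restart, which is the behavior argued for here.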
   cc @danny0405


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
