danny0405 commented on pull request #2433: URL: https://github.com/apache/hudi/pull/2433#issuecomment-758372866
The per-task file check is useless: even if a source task has no data for some time interval, the checkpoint can still trigger normally. So all tasks checkpointing successfully does not mean there is data. (I have solved this in my #step1 PR, though.)

There is no need to checkpoint the write status in KeyedWriteProcessOperator, because we cannot start a new instant if the last instant fails. The simpler and more proper way is to retry the commit action several times and trigger a failover if it still fails.

BTW, IMO, we should finish RFC-24 first, as fast as possible; it solves many bugs and brings many improvements. After that I would add a compatible pipeline that this PR can apply to, and I can help review.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: [email protected]
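
For illustration, the retry-then-failover idea from the comment could look roughly like the sketch below. This is only a minimal sketch under assumptions: `CommitRetry`, `retry`, and `maxAttempts` are hypothetical names, not part of Hudi or Flink, and the commit action is modeled as a plain `Callable`. The assumption is that an exception escaping the operator fails the task and lets the job's restart strategy handle the failover.

```java
import java.util.concurrent.Callable;

// Hypothetical helper for illustration only; not part of Hudi or Flink.
final class CommitRetry {

    // Runs the commit action up to maxAttempts times and returns its result
    // on the first success. If every attempt fails, the last error is wrapped
    // and rethrown so that the caller (assumed here to be the operator) fails
    // and the job can fail over.
    static <T> T retry(Callable<T> action, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                // Remember the failure and try again.
                last = e;
            }
        }
        throw new IllegalStateException(
            "Commit failed after " + maxAttempts + " attempts", last);
    }
}
```

With this shape no write status has to be stored in checkpoint state: either the commit of the current instant eventually succeeds, or the job fails over before a new instant is started.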
