yihua commented on pull request #4821:
URL: https://github.com/apache/hudi/pull/4821#issuecomment-1063758898


   > > @danny0405 : here is the scenario. Let's say multi-writer is enabled and 
hence rollbacks are lazy. There is a commit C5 which got committed to the MDT but 
crashed before committing to the data table, and the user restarts the pipeline. 
Due to multi-writer, more commits are added, but the rollback will be 
triggered lazily by the cleaner. Let's say the cleaner configs are very lenient (say 100 
commits). So, by this time, archival could have cleaned up the partially failed commit.
   > 
   > Can we disable the lazy cleaner for restarts/bootstrap then? Lazy 
cleaning makes sense for normal commits, but it just makes things complex for 
bootstrap/restarts and does not even gain much.
   
   For the multi-writer scenario, we must have lazy cleaning, since the job cannot 
tell whether an inflight commit is due to a failed write or is an actual inflight commit 
from another writer. So the job relies on the heartbeat timeout to 
determine which writes have failed, and lazily cleans up the failed commits later. This 
is the whole point of having the guard we are discussing.
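
   To make the heartbeat-based decision concrete, here is a minimal sketch. 
All names in it (`InflightCommit`, `is_heartbeat_expired`, 
`commits_to_roll_back`) are hypothetical and are not Hudi's actual classes or 
APIs; it only illustrates the idea that an inflight commit is treated as failed 
once its writer's heartbeat is older than the timeout, and is left alone otherwise.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InflightCommit:
    """Hypothetical view of an inflight commit on the timeline."""
    instant: str            # commit instant, e.g. "C5"
    last_heartbeat_ms: int  # last time the owning writer refreshed its heartbeat


def is_heartbeat_expired(commit: InflightCommit, now_ms: int, timeout_ms: int) -> bool:
    # A writer that is still alive keeps refreshing its heartbeat, so only a
    # heartbeat older than the timeout is taken as evidence of a failed write.
    return now_ms - commit.last_heartbeat_ms > timeout_ms


def commits_to_roll_back(inflight: List[InflightCommit], now_ms: int, timeout_ms: int) -> List[str]:
    # Lazy cleaning: only commits whose heartbeat has expired are rolled back;
    # commits with a fresh heartbeat may belong to another active writer.
    return [c.instant for c in inflight if is_heartbeat_expired(c, now_ms, timeout_ms)]


if __name__ == "__main__":
    inflight = [InflightCommit("C5", 1_000), InflightCommit("C7", 58_000)]
    # C5's heartbeat is 59 s old (expired); C7's is 2 s old (still active).
    print(commits_to_roll_back(inflight, now_ms=60_000, timeout_ms=10_000))
```

   With a 10 s timeout in this sketch, only C5 is selected for rollback; C7 is 
kept because its writer may still be running.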



