kbuci commented on PR #10965:
URL: https://github.com/apache/hudi/pull/10965#issuecomment-2062622033

   > @kbuci Please include heartbeat for clustering commit as well. Also, treat 
clustering and logcompaction as removable plans so rollback for them can happen 
in the ingestion itself. Considering that how do we create heart beats?
   
   @suryaprasanna Do you recall why logcompaction execution/rollback is 
different from compaction? Specifically, unlike compaction execution:
   - log compaction won't retry a failed/inflight log compaction plan and will 
instead roll it back completely
   - lazy clean rollback of failed writes is allowed to roll back log 
compaction instants
   
   I assume this is because we want to avoid a "stuck" log compaction plan 
preventing compaction from being scheduled, but I wanted to confirm, since 
@nsivabalan had the same question as well.
   
   The reason I ask is that (as I understand it) this behavior makes it 
trickier for us to schedule log compaction plans and reliably defer their 
execution to an async job. If clean's failedWritesRollback can roll back log 
compaction instants, then it can roll back a log compaction plan (.requested 
file) before an async job has had the chance to pick it up and execute it. We 
can handle this by adding heartbeating (the same way as for compaction) and 
updating failedWritesRollback to roll back a log compaction instant only if it 
is inflight (ignoring it if it is still in the .requested state), though that 
still has the following consequence we should keep in mind:
   - If the async job is disabled or delayed (due to a configuration or 
orchestration issue), the log compaction plan (.requested file) will remain in 
the timeline.
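   
   To make the proposed filter concrete, here is a rough, language-neutral 
sketch in Python. It is not Hudi code: `Instant`, `State`, 
`rollback_candidates`, and the heartbeat map are hypothetical names used only 
for illustration, and the real failedWritesRollback logic may differ.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    REQUESTED = "requested"
    INFLIGHT = "inflight"
    COMPLETED = "completed"


@dataclass(frozen=True)
class Instant:
    # Hypothetical stand-in for a timeline instant; not a Hudi class.
    timestamp: str
    action: str
    state: State


def rollback_candidates(instants, now_ms, last_heartbeat_ms, timeout_ms):
    """Return the pending instants a lazy failed-writes rollback may roll back.

    Assumption: an instant counts as failed when it has no heartbeat or its
    last heartbeat is older than timeout_ms.
    """
    candidates = []
    for instant in instants:
        if instant.state is State.COMPLETED:
            continue
        if instant.action == "logcompaction" and instant.state is State.REQUESTED:
            # Plan only: leave it for the async job to pick up and execute.
            continue
        last = last_heartbeat_ms.get(instant.timestamp)
        if last is None or now_ms - last > timeout_ms:
            candidates.append(instant)
    return candidates
```

   With this filter, a log compaction plan that is only in the .requested 
state is never selected for rollback, while an inflight one is selected only 
once its heartbeat has expired.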
   
   @suryaprasanna @nsivabalan @danny0405 (tagging all commenters) I was 
wondering if you had any opinions/suggestions on this?
   

