kbuci commented on PR #10965: URL: https://github.com/apache/hudi/pull/10965#issuecomment-2062622033
> @kbuci Please include heartbeat for clustering commit as well. Also, treat clustering and logcompaction as removable plans so rollback for them can happen in the ingestion itself. Considering that how do we create heart beats?

@suryaprasanna Do you recall why log compaction execution/rollback differs from compaction, in the sense that:
- unlike compaction execution, log compaction won't retry a failed/inflight log compact plan and will instead completely roll it back
- lazy clean rollback of failed writes is allowed to roll back log compact instants

I assume this is because we want to avoid a "stuck" log compact plan preventing compaction from being scheduled, but I just wanted to confirm, since @nsivabalan had the same question as well.

The reason I ask is that (as per my understanding) this behavior of log compact will make it trickier for us to schedule log compact plans and reliably defer execution to an async job. If clean's failedWritesRollback can roll back log compact instants, then it can roll back a log compact plan (.requested file) before it has the chance to be "picked up" and executed by an async job.

We can handle this situation by adding heartbeating (the same way as compact) and updating failedWritesRollback to only try rolling back a log compact instant if it is inflight (and ignore it if it's just in .requested state), though that still has the following consequence we should keep in mind:
- If the async job is disabled or delayed (due to a configuration or orchestration issue), the log compact plan (.requested file) will remain in the timeline

@suryaprasanna @nsivabalan @danny0405 (tagging all commenters) I was wondering if you had any opinions/suggestions on this?
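To make the proposed rule concrete, here is a minimal sketch of the filter described in this comment: failedWritesRollback would skip log compact plans that are only in .requested state and would roll back an inflight one only once its heartbeat has expired. This is an illustration, not Hudi code. `Instant`, `State`, `log_compactions_to_roll_back`, the heartbeat map, and the timeout parameter are all hypothetical names, and treating a missing heartbeat as expired is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    REQUESTED = "requested"
    INFLIGHT = "inflight"
    COMPLETED = "completed"


@dataclass(frozen=True)
class Instant:
    # Hypothetical stand-in for a timeline instant; not the Hudi class.
    timestamp: str
    action: str  # e.g. "logcompaction" or "commit"
    state: State


def log_compactions_to_roll_back(instants, last_heartbeat, now, timeout_secs):
    """Return the log compaction instants the sketched rule would roll back.

    last_heartbeat maps an instant timestamp to the time (in seconds) of its
    most recent heartbeat. Assumption: a missing entry counts as expired.
    """
    eligible = []
    for instant in instants:
        if instant.action != "logcompaction":
            continue
        if instant.state is not State.INFLIGHT:
            # A plan still in .requested state is left for the async job.
            continue
        beat = last_heartbeat.get(instant.timestamp)
        if beat is None or now - beat > timeout_secs:
            eligible.append(instant)
    return eligible
```

Under this rule a .requested log compact plan is never rolled back by clean, which is exactly why it would remain in the timeline if the async job is disabled or delayed.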
