prashantwason commented on PR #18089:
URL: https://github.com/apache/hudi/pull/18089#issuecomment-4030073724

   @nsivabalan Good question. Here are the scenarios where a rollback can start 
for an ongoing commit:
   
   1. **Heartbeat expiry**: Writer A starts a commit but gets slow (e.g., long 
GC pause, network issues, slow I/O). Its heartbeat expires. Writer B (or a 
table service) sees the stale inflight commit and initiates a rollback to clean 
it up. Meanwhile, Writer A recovers and tries to complete its commit.
   
   2. **Multi-writer setups**: In OCC-based multi-writer scenarios, one writer 
may decide to roll back another writer's inflight commit (e.g., via lazy 
rollback of failed instants). If the original writer is still actively working 
on that commit, the rollback and the commit can proceed concurrently.
   
   3. **Manual intervention**: An operator manually triggers a rollback of what 
appears to be a stuck commit, but the writer is actually still making progress.
   
   In all these cases, without this PR, the rollback and the commit proceed 
without detecting the conflict, which can lead to data inconsistency (e.g., the 
commit completes writing data that the rollback is simultaneously cleaning up). 
This PR adds conflict detection so that when the writer tries to complete its 
commit, it detects the concurrent rollback targeting its own commit and fails 
fast with a `HoodieWriteConflictException`.
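
   To illustrate the check described above, here is a minimal, self-contained 
Java sketch. It is a toy model, not the code in this PR: `ToyTimeline`, 
`RollbackInstant`, and `WriteConflictException` are hypothetical names (the last 
one only stands in for `HoodieWriteConflictException`), and `synchronized` 
stands in for whatever lock the real commit path holds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for HoodieWriteConflictException; not the real Hudi class.
class WriteConflictException extends RuntimeException {
    WriteConflictException(String message) {
        super(message);
    }
}

// Toy timeline entry: a rollback instant and the commit instant it targets.
final class RollbackInstant {
    final String rollbackTime;
    final String targetCommitTime;

    RollbackInstant(String rollbackTime, String targetCommitTime) {
        this.rollbackTime = rollbackTime;
        this.targetCommitTime = targetCommitTime;
    }
}

// Toy in-memory timeline (assumption: the real timeline lives on storage).
final class ToyTimeline {
    private final List<RollbackInstant> rollbacks = new ArrayList<>();
    private final List<String> completedCommits = new ArrayList<>();

    // Writer B, a table service, or an operator records a rollback of a commit.
    synchronized void scheduleRollback(String rollbackTime, String targetCommitTime) {
        rollbacks.add(new RollbackInstant(rollbackTime, targetCommitTime));
    }

    // Writer A calls this as its final step. Before marking the commit
    // complete, it checks whether any rollback targets this same commit.
    synchronized void completeCommit(String commitTime) {
        for (RollbackInstant rollback : rollbacks) {
            if (rollback.targetCommitTime.equals(commitTime)) {
                // Fail fast instead of completing a commit that is being rolled back.
                throw new WriteConflictException(
                    "Commit " + commitTime + " is targeted by rollback " + rollback.rollbackTime);
            }
        }
        completedCommits.add(commitTime);
    }

    synchronized boolean isCompleted(String commitTime) {
        return completedCommits.contains(commitTime);
    }
}
```

   In this sketch, a writer that calls `completeCommit("002")` after another 
actor has called `scheduleRollback("003", "002")` gets a 
`WriteConflictException` and the commit is never marked complete, while a 
commit with no rollback targeting it completes normally.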

