nsivabalan opened a new issue, #18827:
URL: https://github.com/apache/hudi/issues/18827

   ## Problem
   
   When a rollback is retried on a Merge-on-Read (MOR) table at table version 6, 
either after a failure or because the rollback was re-driven, the rollback 
log files can collide: they inherit the existing log file's write token 
(often `UNKNOWN_WRITE_TOKEN` = `1-0-1`) instead of using a per-task write token. 
Separate rollback attempts then produce files with the same name, causing 
overwrites and metadata-table inconsistencies.
   
   ## Root cause
   
   In `RollbackHelperV1#maybeDeleteAndCollectStats`, the pre-computed log 
version map (`(latestVersion, existingWriteToken)`) was being applied to the 
new rollback log writer like this:
   
   ```java
   writerBuilder.withLogVersion(preComputedVersion.getLeft())
       .withLogWriteToken(preComputedVersion.getRight()); // <-- overrides per-task token
   ```
   
   This overrode the per-task write token that 
`CommonClientUtils.generateWriteToken(taskContextSupplier)` had just set, so on 
rollover (`HoodieLogFile.rollOver(rolloverLogWriteToken)`) the new rollback log 
inherited the existing log's token. Retried rollbacks ran with the same token 
and ended up writing files with identical names.
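
To make the collision concrete, here is a minimal, self-contained sketch. It does not use Hudi classes. The `logFileName` helper, the file id `fg-1`, the instant `001`, the shared version `2`, and the per-task token values are all hypothetical, and the name layout `.<fileId>_<instantTime>.log.<version>_<writeToken>` is an assumption made for illustration.

```java
public class RollbackLogNameCollision {
    // Assumed layout, for illustration only: .<fileId>_<instantTime>.log.<version>_<writeToken>
    public static String logFileName(String fileId, String instantTime, int version, String writeToken) {
        return "." + fileId + "_" + instantTime + ".log." + version + "_" + writeToken;
    }

    public static void main(String[] args) {
        String inherited = "1-0-1"; // UNKNOWN_WRITE_TOKEN, per the issue text

        // Both attempts inherit the existing log's token and resolve to the same version.
        String firstAttempt = logFileName("fg-1", "001", 2, inherited);
        String retry = logFileName("fg-1", "001", 2, inherited);
        System.out.println(firstAttempt.equals(retry)); // true: the names collide

        // With distinct per-task tokens (hypothetical values), the names differ.
        String firstWithTaskToken = logFileName("fg-1", "001", 2, "0-12-34");
        String retryWithTaskToken = logFileName("fg-1", "001", 2, "0-15-41");
        System.out.println(firstWithTaskToken.equals(retryWithTaskToken)); // false
    }
}
```

The only point of the sketch is that once the token is fixed, the version is the sole remaining differentiator in the name, so any two attempts that resolve to the same version write to the same path.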
   
   ## Fix
   
   When an existing log file is found for the file group being rolled back:
   
   - Keep the per-task write token from `CommonClientUtils.generateWriteToken` 
(don't override it).
   - Explicitly bump the writer's log version to `latest + 1` (so the new file 
lands at a fresh version even when the per-task token alone wouldn't 
differentiate it).
   - Only apply the bump in `doDelete=true` paths; in `doDelete=false` paths 
(stats-only) we let `WriterBuilder.build()` rediscover the existing version so 
the downstream `storage.getPathInfo` lookup still resolves.
   
   Also: change the "no existing log file" sentinel in `preComputeLogVersions` 
from `(LOGFILE_BASE_VERSION, UNKNOWN_WRITE_TOKEN)` to `(LOGFILE_BASE_VERSION, 
null)` so the call site can distinguish "no log file" from "a real log file 
whose token happens to equal UNKNOWN_WRITE_TOKEN."
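
The decision described above could be sketched roughly as follows. This is not the actual patch: `RollbackWriterPlan`, `Plan`, and `plan(...)` are hypothetical names, and a `null` version here stands for "do not set a version explicitly and let the writer discover it". How the no-existing-log case is handled is also an assumption; the sketch simply leaves the version unset there.

```java
public class RollbackWriterPlan {
    /** Hypothetical holder for what the rollback log writer would be configured with. */
    public static final class Plan {
        public final Integer explicitVersion; // null: leave version discovery to the writer
        public final String writeToken;

        Plan(Integer explicitVersion, String writeToken) {
            this.explicitVersion = explicitVersion;
            this.writeToken = writeToken;
        }
    }

    /**
     * existingToken == null is the "no existing log file" sentinel, so a real log file
     * whose token equals UNKNOWN_WRITE_TOKEN ("1-0-1") is still treated as existing.
     */
    public static Plan plan(int latestVersion, String existingToken, String perTaskToken, boolean doDelete) {
        boolean hasExistingLog = existingToken != null;
        // Bump to latest + 1 only when a log file exists and this is a doDelete=true path.
        Integer version = (hasExistingLog && doDelete) ? Integer.valueOf(latestVersion + 1) : null;
        // The per-task token is always kept; the existing token is never copied over.
        return new Plan(version, perTaskToken);
    }

    public static void main(String[] args) {
        Plan p = plan(3, "1-0-1", "0-12-34", true);
        System.out.println(p.explicitVersion + " " + p.writeToken); // 4 0-12-34
    }
}
```

Using `null` rather than `UNKNOWN_WRITE_TOKEN` as the sentinel is what lets the first branch tell the two cases apart without comparing token strings.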
   
   ## Impact
   
   - Repeated rollback attempts no longer create colliding log files.
   - Metadata table no longer sees conflicting write tokens for rollback log 
files.
   - File-slice ordering remains consistent across retries.
   
   ## Reproduction
   
   A test that exercises this on master: force table version 6 on a MOR table, 
write a base commit followed by an updates commit (leaving the second commit in 
inflight state), execute a rollback via `MergeOnReadRollbackActionExecutor`, 
then replay the rollback after restoring the inflight commit's timeline files and 
marker directory. With the bug, the second rollback's log files collide with 
the first attempt's; with the fix, each rollback attempt produces 
uniquely-named log files.
   

