vamshipasunuru1 opened a new pull request, #18887:
URL: https://github.com/apache/hudi/pull/18887

   ## Change Description
   
   When `MARKERS.type` is absent, 
`MarkerBasedRollbackUtils.getAllMarkerPaths()` first tries `DIRECT` markers and 
catches `IOException | IllegalArgumentException` to fall back to 
`TIMELINE_SERVER_BASED`. This catch is too broad.
   
   **The problem:** A transient HDFS error (e.g., "Server too busy" / 
`RetriableException`) is also an `IOException`. When it is caught, the code 
falls back to the timeline server marker path, which looks in a different 
location and finds **0 markers**. The rollback then skips deleting data 
files and leaves **orphan files** in the table.
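
The failure mode can be reproduced in isolation. The following is a minimal sketch, not the Hudi code: `BroadCatchSketch`, `MarkerSource`, and the two-argument `getAllMarkerPaths` are hypothetical stand-ins that only mimic the control flow described above.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for illustration; not Hudi's actual classes or signatures.
class BroadCatchSketch {

  /** Hypothetical source of marker paths (direct or timeline-server-based). */
  interface MarkerSource {
    List<String> allMarkerPaths() throws IOException;
  }

  /** Mimics the broad catch: any IOException or IllegalArgumentException triggers the fallback. */
  static List<String> getAllMarkerPaths(MarkerSource direct, MarkerSource timelineServer)
      throws IOException {
    try {
      return direct.allMarkerPaths();
    } catch (IOException | IllegalArgumentException e) {
      // A transient "Server too busy" error lands here too and is silently swallowed.
      return timelineServer.allMarkerPaths();
    }
  }

  public static void main(String[] args) throws IOException {
    // Direct markers exist, but the read fails transiently.
    MarkerSource busyDirect = () -> {
      throw new IOException("Server too busy");
    };
    // The timeline-server location holds no markers for this instant.
    MarkerSource emptyTimelineServer = () -> Collections.<String>emptyList();

    List<String> markers = getAllMarkerPaths(busyDirect, emptyTimelineServer);
    System.out.println("markers found: " + markers.size()); // prints "markers found: 0"
  }
}
```

The caller cannot distinguish "no markers exist" from "the read failed", so a rollback built on this result would treat the empty list as authoritative.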
   
   ## Fix
   
   Split the exception handling:
   - **`IOException`** → propagate, let rollback fail and retry (transient HDFS 
failures should not silently produce an incorrect rollback)
   - **`IllegalArgumentException`** → keep the fallback (this indicates a 
marker path format mismatch, the existing intended behavior)
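
The split described above can be sketched as follows. This is an illustration under assumptions, not the patch itself: `SplitCatchSketch`, `MarkerSource`, and the marker names are hypothetical, and the real method has a different signature.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; not Hudi's actual classes or signatures.
class SplitCatchSketch {

  /** Hypothetical source of marker paths (direct or timeline-server-based). */
  interface MarkerSource {
    List<String> allMarkerPaths() throws IOException;
  }

  static List<String> getAllMarkerPaths(MarkerSource direct, MarkerSource timelineServer)
      throws IOException {
    try {
      return direct.allMarkerPaths();
    } catch (IllegalArgumentException e) {
      // Marker path format mismatch: keep the fallback to timeline-server markers.
      return timelineServer.allMarkerPaths();
    }
    // IOException is deliberately not caught, so it propagates to the caller,
    // which can fail the rollback and retry it.
  }

  public static void main(String[] args) throws IOException {
    MarkerSource timelineServer = () -> Arrays.asList("marker-a", "marker-b");

    MarkerSource mismatched = () -> {
      throw new IllegalArgumentException("unexpected marker path format");
    };
    System.out.println(getAllMarkerPaths(mismatched, timelineServer)); // prints "[marker-a, marker-b]"

    MarkerSource busy = () -> {
      throw new IOException("Server too busy");
    };
    try {
      getAllMarkerPaths(busy, timelineServer);
    } catch (IOException e) {
      System.out.println("propagated: " + e.getMessage()); // prints "propagated: Server too busy"
    }
  }
}
```

Narrowing the catch rather than inspecting the exception message keeps the decision in the type system: only the exception type that signals a format mismatch can reach the fallback.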
   
   ## Testing
   
   Added `TestMarkerBasedRollbackUtils` with unit tests covering:
   - `IOException` is propagated (no fallback)
   - `IllegalArgumentException` still falls back to timeline server markers
   
   ## Risk
   
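   Low. The change only affects the code path taken when `MARKERS.type` is 
absent, and only when reading `DIRECT` markers throws an `IOException`, which 
now fails the rollback instead of triggering the fallback. The 
`IllegalArgumentException` fallback behavior is preserved.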

