prashantwason opened a new pull request, #18027:
URL: https://github.com/apache/hudi/pull/18027

   ### Describe the issue this Pull Request addresses
   
   Closes HUDI-2619
   
   When a rollback plan is built from markers, data files that were already 
deleted during `finalizeWrite()` can still be included in the rollback 
requests. As a result, the metadata table receives delete operations for files 
that do not exist.
   
   ### Summary and Changelog
   
   This PR adds a filter to check if data files actually exist before including 
them in the marker-based rollback plan.
   
   **Changes:**
   - Added file existence check in 
`MarkerBasedRollbackStrategy.getRollbackRequests()` after collecting rollback 
requests from markers
   - Requests with empty `filesToBeDeleted` (e.g., APPEND operations) are 
passed through without the check
   - Only requests where the file actually exists are included in the final 
rollback plan
   - Updated test expectations in `TestMarkerBasedRollbackStrategy` to reflect 
the new filtering behavior
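
The filtering rule in the list above can be sketched as follows. This is a minimal, self-contained illustration and not the code in this PR: the `Request` class, the `keepRequestsForExistingFiles` helper, and the `fileExists` predicate are hypothetical stand-ins for Hudi's rollback request type and storage layer. It also assumes that a request listing several files is kept only if all of them exist, which the description above does not specify.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch; names below are illustrative, not Hudi's actual API.
class MarkerRollbackFilterSketch {

  // Simplified stand-in for a rollback request derived from a marker.
  static final class Request {
    final String partitionPath;
    final List<String> filesToBeDeleted;

    Request(String partitionPath, List<String> filesToBeDeleted) {
      this.partitionPath = partitionPath;
      this.filesToBeDeleted = filesToBeDeleted;
    }
  }

  // Keeps a request if it lists no files to delete (e.g. an APPEND),
  // or if every listed file is reported as existing by the predicate.
  static List<Request> keepRequestsForExistingFiles(
      List<Request> requests, Predicate<String> fileExists) {
    return requests.stream()
        .filter(r -> r.filesToBeDeleted.isEmpty()
            || r.filesToBeDeleted.stream().allMatch(fileExists))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Request> requests = Arrays.asList(
        new Request("p1", Collections.singletonList("p1/f1.parquet")), // file still present
        new Request("p1", Collections.singletonList("p1/f2.parquet")), // already deleted
        new Request("p2", Collections.<String>emptyList()));           // APPEND-style request

    // Pretend only f1 exists; a real caller would query the storage layer here.
    List<Request> kept =
        keepRequestsForExistingFiles(requests, path -> path.equals("p1/f1.parquet"));

    System.out.println(kept.size()); // prints 2
  }
}
```

Passing the existence check in as a predicate keeps the sketch independent of any particular file system; on a local file system it could be something like `path -> java.nio.file.Files.exists(java.nio.file.Paths.get(path))`.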
   
   ### Impact
   
   - No public API changes
   - Improves correctness of rollback operations by preventing the metadata 
table from receiving deletes for non-existent files
   - Minor performance cost from the additional file existence checks during 
rollback planning, accepted in exchange for a consistent metadata table
   
   ### Risk Level
   
   low - The change adds a defensive check that filters out invalid rollback 
requests. Existing tests validate the behavior.
   
   ### Documentation Update
   
   none
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Enough context is provided in the sections above
   - [x] Adequate tests were added if applicable

