prashantwason opened a new pull request, #18035: URL: https://github.com/apache/hudi/pull/18035
### Describe the issue this Pull Request addresses

When the metadata table (MDT) was updated from cleanMetadata, files that failed to delete were still included in the update. As a result, subsequent clean runs did not pick up these files for deletion. For a tier-1 table receiving lots of updates, this resulted in a partition with more than 1M files.

This fix addresses JIRA issue: [HUDI-3766](https://issues.apache.org/jira/browse/HUDI-3766)

### Summary and Changelog

**Summary:** Exclude failed delete files from MDT updates so they can be retried in subsequent clean runs.

**Changes:**

1. **CleanActionExecutor.java**: When a `FileNotFoundException` is caught during file deletion, return `true` instead of `false`. If a file to be deleted is not found, treat it as a success, since there is nothing left to clean up on the FileSystem. Because the delete is reported as a success, the entry is removed from MDT.
2. **HoodieTableMetadataUtil.java**: In `convertMetadataToFilesPartitionRecords()`, filter out files that are in the `failedDeleteFiles` list before creating the MDT update record. This keeps failed deletes out of the update so they can be retried.
3. **TestHoodieTableMetadataUtil.java**: Added test `testFailedDeletesAreExcludedFromCleanMetadataRecords()` to verify that failed deletes are excluded from MDT updates.

### Impact

This change affects the behavior of clean operations when files fail to delete. Previously, failed deletes were still recorded in MDT as deleted, so they were never retried. With this fix, failed deletes are excluded from the MDT update, allowing subsequent clean runs to pick them up.

### Risk Level

low

The change is localized to the clean action executor and the metadata table utility. The fix keeps what is recorded in MDT consistent with what was actually deleted.
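
For illustration only, the two behavioral changes listed under Changes can be sketched in plain Java. This is not the code in the PR: `CleanMetadataSketch`, `FileDeleter`, `deleteTreatingMissingAsSuccess`, and `filesToRecordAsDeleted` are hypothetical names, and file paths are modeled as plain strings rather than Hudi's storage types.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch; none of these names exist in Hudi.
class CleanMetadataSketch {

  /** Hypothetical stand-in for the storage call that deletes a single file. */
  interface FileDeleter {
    boolean delete(String path) throws IOException;
  }

  /**
   * Change 1 (sketch): a file that is already gone counts as a successful
   * delete, because nothing is left to clean up for that path.
   */
  static boolean deleteTreatingMissingAsSuccess(FileDeleter deleter, String path) {
    try {
      return deleter.delete(path);
    } catch (FileNotFoundException e) {
      // Previously this case would have been reported as a failure (false).
      return true;
    } catch (IOException e) {
      // Assumption for this sketch: other I/O errors are rethrown unchecked.
      throw new UncheckedIOException(e);
    }
  }

  /**
   * Change 2 (sketch): only files whose delete did not fail are reported as
   * deleted in the metadata update, so failed ones remain listed and a later
   * clean run can retry them.
   */
  static List<String> filesToRecordAsDeleted(List<String> attemptedDeletes,
                                             List<String> failedDeleteFiles) {
    Set<String> failed = new HashSet<>(failedDeleteFiles);
    return attemptedDeletes.stream()
        .filter(file -> !failed.contains(file))
        .collect(Collectors.toList());
  }
}
```

In this sketch, a clean that attempted `f1`, `f2`, and `f3` and failed on `f2` would record only `f1` and `f3` as deleted, leaving `f2` visible to the next clean run.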
### Documentation Update

none

### Contributor's checklist

- [x] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [x] Enough context is provided in the sections above
- [x] Adequate tests were added if applicable
