bvaradar opened a new pull request #942: [WIP] [HUDI-137] Fix state transitions for Hudi cleaning action URL: https://github.com/apache/incubator-hudi/pull/942 Before this change, Cleaner performs cleaning of old file versions and then stores the deleted files in .clean files. With this setup, we will not be able to track file deletions if a cleaner fails after deleting files but before writing .clean metadata. This is fine for regular file-system view generation but Incremental timeline syncing relies on clean/commit/compaction metadata to keep a consistent file-system view. Cleaner state transitions is now similar to that of compaction. 1. Requested : HoodieWriteClient.scheduleClean() selects the list of files that needs to be deleted and stores them in metadata 2. Inflight : HoodieWriteClient marks the state to be inflight before it starts deleting 3. Completed : HoodieWriteClient marks the state after completing the deletion according to the cleaner plan There will be followup PRs after this : 1. HUDI-294 for making cleaner stats use relative paths. 2. HUDI-137 for similar handling for Rollback 3. HUDI-80 for incrementalize cleaning
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
