bvaradar opened a new pull request #942: [WIP] [HUDI-137] Fix state transitions 
for Hudi cleaning action
URL: https://github.com/apache/incubator-hudi/pull/942
 
 
   
   Before this change, the Cleaner deleted old file versions first and only 
then recorded the deleted files in .clean files.
   With this setup, file deletions cannot be tracked if the cleaner fails 
after deleting files but before writing the .clean metadata.
   This is fine for regular file-system view generation, but incremental 
timeline syncing relies on clean/commit/compaction metadata to keep a 
consistent file-system view.
   
   Cleaner state transitions are now similar to those of compaction.
   
   1. Requested : HoodieWriteClient.scheduleClean() selects the list of files 
that need to be deleted and stores them in metadata
   2. Inflight : HoodieWriteClient marks the state as inflight before it 
starts deleting
   3. Completed : HoodieWriteClient marks the state as completed after 
finishing the deletion according to the cleaner plan
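
   The three transitions above can be sketched roughly as follows. This is 
an illustrative model only, not the code in this PR: CleanStateSketch, 
CleanAction and all of their methods are hypothetical names, and an in-memory 
list stands in for real file-system deletes and timeline metadata. The point 
it tries to show is that the plan is fixed before any delete happens, so a 
cleaner that fails midway can be resumed from the stored plan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; none of these names are actual Hudi classes.
public class CleanStateSketch {

    // The three states named in the PR description.
    enum State { REQUESTED, INFLIGHT, COMPLETED }

    static final class CleanAction {
        private final String instantTime;
        private final List<String> filesToDelete; // stand-in for the stored cleaner plan
        private final List<String> deletedFiles = new ArrayList<>();
        private State state = State.REQUESTED;

        // Scheduling: the list of files is fixed before anything is deleted.
        CleanAction(String instantTime, List<String> filesToDelete) {
            this.instantTime = instantTime;
            this.filesToDelete =
                Collections.unmodifiableList(new ArrayList<>(filesToDelete));
        }

        State state() { return state; }
        List<String> deleted() { return Collections.unmodifiableList(deletedFiles); }

        // REQUESTED -> INFLIGHT, recorded before the first delete.
        void markInflight() {
            require(State.REQUESTED);
            state = State.INFLIGHT;
        }

        // Deletes according to the plan. Re-running after a failure is safe,
        // because the plan decides what to delete and done files are skipped.
        void deleteAccordingToPlan() {
            require(State.INFLIGHT);
            for (String file : filesToDelete) {
                if (!deletedFiles.contains(file)) {
                    deletedFiles.add(file); // stand-in for a real file-system delete
                }
            }
        }

        // INFLIGHT -> COMPLETED, allowed only once every planned file is deleted.
        void markCompleted() {
            require(State.INFLIGHT);
            if (!deletedFiles.containsAll(filesToDelete)) {
                throw new IllegalStateException("cleaner plan not fully executed");
            }
            state = State.COMPLETED;
        }

        private void require(State expected) {
            if (state != expected) {
                throw new IllegalStateException(
                    "clean " + instantTime + " is " + state + ", expected " + expected);
            }
        }
    }

    public static void main(String[] args) {
        CleanAction clean =
            new CleanAction("001", Arrays.asList("f1_v1.parquet", "f2_v1.parquet"));
        clean.markInflight();
        clean.deleteAccordingToPlan();
        clean.markCompleted();
        System.out.println(clean.state() + " " + clean.deleted());
    }
}
```

   In this sketch an out-of-order transition (for example, completing a 
clean that was never marked inflight) throws IllegalStateException, which 
mirrors the ordering the description lays out.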
   
   There will be follow-up PRs after this:
   1. HUDI-294 for making cleaner stats use relative paths.
   2. HUDI-137 for similar handling for Rollback
   3. HUDI-80  for making cleaning incremental
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
