bvaradar commented on issue #1062: [HUDI-294] Delete Paths written in Cleaner 
plan needs to be relative to partition-path
URL: https://github.com/apache/incubator-hudi/pull/1062#issuecomment-561057985
 
 
   > > Thanks @leesf : This is a good starting point. You would need to make 
further changes in
   > > 
   > > 1. HoodieCleanHelper.getDeletePaths need to return only relative paths.
   > > 2. HoodieCopyOnWriteTable.scheduleClean to return only relative paths.
   > > 3. (1) and (2) will ensure HoodieCleanClient.scheduleClean will store 
cleaner plans with relative paths.
   > > 4. IncrementalTimelineSyncFileSystemView.addCleanInstant needs to handle 
relative partition paths
   > 
   > Hi @bvaradar . Currently the `HoodieCleanHelper.getDeletePaths` (point 1) 
and `HoodieCopyOnWriteTable.scheduleClean` (point 2) return relative path 
already. Also the currently 
`IncrementalTimelineSyncFileSystemView.addCleanInstant` only handles relative 
path. Please correct me if I am wrong.
   > 
   > And I addressed other comments.
   
   @leesf : Sorry, my bad regarding scheduleClean comments. I confused 
HoodieCleanerPlan with HoodieCleanMetadata. HoodieCleanerPlan is newly added as 
part of 0.5.1 and stores relative paths and do not require migration. 
HoodieCleanMetadata though requires migration. I looked at  
IncrementalTimelineSyncFileSystemView.addCleanInstant and believe this would 
need more handling. This method calls a common method : 
IncrementalTimelineSyncFileSystemView.removeFileSlicesForPartition which is 
shared by rollback/restore code and assumes full path. 
   
   ```
    FileStatus[] statuses = paths.stream().map(p -> {
           FileStatus status = new FileStatus();
           status.setPath(new Path(p));
           return status;
         }).toArray(FileStatus[]::new);
   ```
   To make it easy, you might need to create absolute paths in addCleanInstant 
before calling removeFileSlicesForPartition. Can you check this part of the 
code to see if it makes sense.
   
   Thanks for doing this change.
   
   Balaji.V

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

Reply via email to