[GitHub] [hudi] prashantwason commented on a change in pull request #2064: WIP - [HUDI-842] Implementation of HUDI RFC-15.

GitBox Fri, 18 Sep 2020 15:23:57 -0700


prashantwason commented on a change in pull request #2064:
URL: https://github.com/apache/hudi/pull/2064#discussion_r491219275




##########
File path: 
hudi-client/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java
##########
@@ -180,14 +181,14 @@ public CleanPlanner(HoodieTable<T> hoodieTable, HoodieWriteConfig config) {
   }
 
   /**
-   * Scan and list all paritions for cleaning.
+   * Scan and list all partitions for cleaning.
    * @return all partitions paths for the dataset.
    * @throws IOException
    */
   private List<String> getPartitionPathsForFullCleaning() throws IOException {
     // Go to brute force mode of scanning all partitions
-    return FSUtils.getAllPartitionPaths(hoodieTable.getMetaClient().getFs(), hoodieTable.getMetaClient().getBasePath(),
-        config.shouldAssumeDatePartitioning());
+    return HoodieMetadata.getAllPartitionPaths(hoodieTable.getMetaClient().getFs(),

Review comment:
       With flags for various operations there is a greater chance of eventual 
inconsistency: async operations may have created or deleted files that the 
metadata does not know about yet. 
   
   If certain operations really need to skip metadata, it would be cleaner to 
change the API to reflect that. Example:
      HoodieMetadata.getAllPartitionPaths(...., boolean shouldValidate);     
When shouldValidate is true, metadata validation is forced, so file 
listing is used to return the results.
   
   This way we force all listing operations to use a single code path, which 
can be optimized later on.
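
   A minimal sketch of the suggested API shape, under stated assumptions: 
the class name `PartitionListingSketch` and the two `Supplier` parameters are 
hypothetical stand-ins for the metadata-backed listing and the brute-force 
file listing, not the actual `HoodieMetadata` signature in this PR. It only 
illustrates how a `shouldValidate` flag could keep every caller on one entry 
point:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical illustration only; not the HoodieMetadata class from the PR.
class PartitionListingSketch {

    /**
     * Single entry point for listing partition paths.
     *
     * @param metadataListing   stand-in for the listing served from metadata (assumption)
     * @param fileSystemListing stand-in for the brute-force file listing (assumption)
     * @param shouldValidate    when true, bypass metadata and return the file listing
     */
    static List<String> getAllPartitionPaths(
            Supplier<List<String>> metadataListing,
            Supplier<List<String>> fileSystemListing,
            boolean shouldValidate) {
        if (shouldValidate) {
            // The caller cannot tolerate metadata that lags behind async
            // operations, so the result comes from the file listing.
            return new ArrayList<>(fileSystemListing.get());
        }
        // Default path: trust the metadata and avoid the file listing.
        return new ArrayList<>(metadataListing.get());
    }
}
```

   With this shape, a caller that must see the latest files passes `true`, 
and everything else passes `false`; both go through the same method, so the 
fallback logic lives in one place.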




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
