YuweiXiao commented on code in PR #6680:
URL: https://github.com/apache/hudi/pull/6680#discussion_r990605731


##########
hudi-common/src/main/java/org/apache/hudi/BaseHoodieTableFileIndex.java:
##########
@@ -138,7 +143,20 @@ public BaseHoodieTableFileIndex(HoodieEngineContext engineContext,
     this.engineContext = engineContext;
     this.fileStatusCache = fileStatusCache;
 
-    doRefresh();
+    /**
+     * The `shouldRefresh` variable controls how we initialize the TableFileIndex:

Review Comment:
   I removed `isAllInputFileSlicesCached` and now use the following logic to check whether all file slices are cached:
   
   ```
   if (cachedAllPartitionPaths == null) {
     return false;
   }
   return cachedAllPartitionPaths.stream().allMatch(p -> cachedAllInputFileSlices.containsKey(p));
   ```
   
   Basically, we first check that the partition paths have been loaded, then check that every partition is present in `cachedAllInputFileSlices`. This should be cleaner than maintaining a separate flag variable.
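
For illustration, here is a minimal, self-contained sketch of that derived check. The class `FileSliceCacheSketch`, its helper methods, and the use of `String` in place of the real partition-path and file-slice types are assumptions made for this example only. Only the two field names and the check itself come from the snippet above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the file index; it models only the two cache fields.
class FileSliceCacheSketch {
  // null means the partition list has not been loaded yet.
  private List<String> cachedAllPartitionPaths;
  // Partition path -> file slices cached for that partition (String is a placeholder type).
  private final Map<String, List<String>> cachedAllInputFileSlices = new HashMap<>();

  // Hypothetical helper: records the full list of partitions for the query.
  void loadPartitions(List<String> partitions) {
    this.cachedAllPartitionPaths = partitions;
  }

  // Hypothetical helper: records the file slices of a single partition.
  void cacheFileSlices(String partition, List<String> slices) {
    cachedAllInputFileSlices.put(partition, slices);
  }

  // Derived check: no separate boolean flag has to be kept in sync with the caches.
  boolean areAllFileSlicesCached() {
    if (cachedAllPartitionPaths == null) {
      return false;
    }
    return cachedAllPartitionPaths.stream().allMatch(cachedAllInputFileSlices::containsKey);
  }
}
```

In this sketch the check returns `false` until every loaded partition has an entry in the map, so caching one partition at a time can never report completeness early.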



##########
hudi-common/src/main/java/org/apache/hudi/BaseHoodieTableFileIndex.java:
##########
@@ -179,15 +197,125 @@ public void close() throws Exception {
   }
 
   protected List<PartitionPath> getAllQueryPartitionPaths() {
+    if (cachedAllPartitionPaths != null) {
+      return cachedAllPartitionPaths;
+    }
+
+    loadAllQueryPartitionPaths();
+    return cachedAllPartitionPaths;
+  }
+
+  private void loadAllQueryPartitionPaths() {
     List<String> queryRelativePartitionPaths = queryPaths.stream()
         .map(path -> FSUtils.getRelativePartitionPath(basePath, path))
         .collect(Collectors.toList());
 
-    // Load all the partition path from the basePath, and filter by the query partition path.
-    // TODO load files from the queryRelativePartitionPaths directly.
-    List<String> matchedPartitionPaths = getAllPartitionPathsUnchecked()
-        .stream()
-        .filter(path -> queryRelativePartitionPaths.stream().anyMatch(path::startsWith))
+    this.cachedAllPartitionPaths = listQueryPartitionPaths(queryRelativePartitionPaths);
+
+    // If the partition value contains InternalRow.empty, we query it as a non-partitioned table.
+    this.queryAsNonePartitionedTable = this.cachedAllPartitionPaths.stream().anyMatch(p -> p.values.length == 0);

Review Comment:
   Fixed.


