zhangyue19921010 commented on code in PR #12601:
URL: https://github.com/apache/hudi/pull/12601#discussion_r1915928949


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/cluster/strategy/BaseConsistentHashingBucketClusteringPlanStrategy.java:
##########
@@ -128,7 +129,7 @@ protected Stream<HoodieClusteringGroup> buildClusteringGroupsForPartition(String
             .build();
       }).collect(Collectors.toList()));
     }
-    return ret.stream();
+    return Pair.of(ret.stream(), true);

Review Comment:
   This boolean indicates whether all candidate file slices under the current 
partition have been processed. The caller uses it to decide whether the current 
partition is added to `missingPartitions`.
   
   `BaseConsistentHashingBucketClusteringPlanStrategy` always returns true.
   
   Other `PartitionAwareClusteringPlanStrategy` implementations, such as 
`SparkSizeBasedClusteringPlanStrategy`, may return false. For example, if 100 
file slices are passed in but only 10 of them are processed because of the 
`writeConfig.getClusteringMaxNumGroups()` limit, false should be returned.
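   
   To make the contract concrete, here is a minimal, self-contained sketch. It is not the Hudi code: `ClusteringPlanSketch`, `buildGroupsForPartition`, and `collectMissingPartitions` are hypothetical names, `Map.Entry` stands in for `Pair`, and each file slice becomes its own group for simplicity. It also assumes the caller records a partition as missing exactly when the flag is false.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the planning contract; not Hudi's actual classes.
final class ClusteringPlanSketch {

  // Builds at most maxNumGroups groups (one slice per group, a simplification)
  // and reports whether every candidate file slice was consumed.
  static Map.Entry<List<String>, Boolean> buildGroupsForPartition(
      List<String> fileSlices, int maxNumGroups) {
    List<String> groups = new ArrayList<>();
    for (String slice : fileSlices) {
      if (groups.size() >= maxNumGroups) {
        break; // group limit reached; the remaining slices stay unprocessed
      }
      groups.add("group-" + slice);
    }
    boolean allProcessed = groups.size() == fileSlices.size();
    return new SimpleImmutableEntry<>(groups, allProcessed);
  }

  // Caller side (assumption): a partition whose flag is false is treated as missing.
  static List<String> collectMissingPartitions(
      Map<String, List<String>> slicesByPartition, int maxNumGroups) {
    List<String> missingPartitions = new ArrayList<>();
    for (Map.Entry<String, List<String>> e : slicesByPartition.entrySet()) {
      boolean allProcessed =
          buildGroupsForPartition(e.getValue(), maxNumGroups).getValue();
      if (!allProcessed) {
        missingPartitions.add(e.getKey());
      }
    }
    return missingPartitions;
  }
}
```

   With 100 slices and a limit of 10, the sketch returns 10 groups and false; with a limit at or above the slice count it returns true, which is what the consistent hashing strategy always reports.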
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
