alexeykudinkin commented on a change in pull request #4178:
URL: https://github.com/apache/hudi/pull/4178#discussion_r766290137



##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/clustering/run/strategy/MultipleSparkJobExecutionStrategy.java
##########
@@ -91,13 +92,11 @@ public MultipleSparkJobExecutionStrategy(HoodieTable table, HoodieEngineContext
     // execute clustering for each group async and collect WriteStatus
     JavaSparkContext engineContext = HoodieSparkEngineContext.getSparkContext(getEngineContext());
     // execute clustering for each group async and collect WriteStatus
-    Stream<JavaRDD<WriteStatus>> writeStatusRDDStream = clusteringPlan.getInputGroups().stream()
+    Stream<JavaRDD<WriteStatus>> writeStatusRDDStream = allOf(clusteringPlan.getInputGroups().stream()
         .map(inputGroup -> runClusteringForGroupAsync(inputGroup,
             clusteringPlan.getStrategy().getStrategyParams(),
             Option.ofNullable(clusteringPlan.getPreserveHoodieMetadata()).orElse(false),
-            instantTime))
-        .map(CompletableFuture::join);
-
+            instantTime)).collect(Collectors.toList())).join().stream();

Review comment:
       Can you please re-format this snippet so that each call in the chain sits on its own line? That makes it easy to see which method is invoked on which expression.
   
   Something like the following:
   
   ```
   allOf(
     clusteringPlan.getInputGroups().stream()
       .map(inputGroup -> runClusteringForGroupAsync(inputGroup,
               clusteringPlan.getStrategy().getStrategyParams(),
               Option.ofNullable(clusteringPlan.getPreserveHoodieMetadata()).orElse(false),
               instantTime)
       )
       .collect(Collectors.toList())
   )
     .join()
     .stream()
   ```
   
   It's very hard to understand what is going on there right now.
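
   For illustration only, here is a minimal, self-contained sketch of the same call shape with every call stacked on its own line. The `allOf` helper and `runAsync` below are hypothetical stand-ins (assumptions, not the actual Hudi helper or `runClusteringForGroupAsync`); the sketch assumes the helper takes a list of futures and returns a single future holding the list of results.

   ```java
   import java.util.List;
   import java.util.concurrent.CompletableFuture;
   import java.util.stream.Collectors;
   import java.util.stream.Stream;

   public class AllOfSketch {

     // Hypothetical stand-in for the allOf helper in the diff (assumption):
     // combines a list of futures into one future holding all results in order.
     static <T> CompletableFuture<List<T>> allOf(List<CompletableFuture<T>> futures) {
       return CompletableFuture
           .allOf(futures.toArray(new CompletableFuture[0]))
           .thenApply(ignored ->
               futures.stream()
                   // every future is already complete here, so join() does not block
                   .map(CompletableFuture::join)
                   .collect(Collectors.toList()));
     }

     // Hypothetical stand-in for runClusteringForGroupAsync (assumption).
     static CompletableFuture<Integer> runAsync(int group) {
       return CompletableFuture.supplyAsync(() -> group * 2);
     }

     static List<Integer> demo() {
       return allOf(
           Stream.of(1, 2, 3)
               .map(AllOfSketch::runAsync)
               // collecting first submits all futures before anything is joined
               .collect(Collectors.toList())
       )
           .join();
     }

     public static void main(String[] args) {
       System.out.println(demo());
     }
   }
   ```

   With the closing parenthesis of `allOf(` on its own line, it is visible at a glance that `.collect(...)` applies to the stream while `.join()` applies to the combined future.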




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
