noorall commented on code in PR #25551:
URL: https://github.com/apache/flink/pull/25551#discussion_r1896526945
##########
flink-runtime/src/main/java/org/apache/flink/runtime/deployment/ConsumedSubpartitionContext.java:
##########
@@ -118,11 +141,23 @@ public static ConsumedSubpartitionContext
buildConsumedSubpartitionContext(
partitions[partitionRange.getStartIndex()]),
partitionIdToShuffleDescriptorIndexMap.get(
partitions[partitionRange.getEndIndex()]));
- checkState(partitionRange.size() == shuffleDescriptorRange.size());
- numConsumedShuffleDescriptors += shuffleDescriptorRange.size();
+            checkState(
+                    partitionRange.size() == shuffleDescriptorRange.size()
+                            && !consumedShuffleDescriptorToSubpartitionRangeMap.containsKey(
+                                    shuffleDescriptorRange));
consumedShuffleDescriptorToSubpartitionRangeMap.put(
shuffleDescriptorRange, subpartitionRange);
}
+        // For ALL_TO_ALL, there might be overlaps in shuffle descriptor to subpartition range map:
+        // [0,10] -> [2,2], [0,5] -> [3,3], so we need to count consumed shuffle descriptors after
+        // merging.
+        int numConsumedShuffleDescriptors = 0;
+        List<IndexRange> mergedConsumedShuffleDescriptor =
+                IndexRangeUtil.mergeIndexRanges(
+                        consumedShuffleDescriptorToSubpartitionRangeMap.keySet());
Review Comment:
> Will this result list contain only one range?

It may contain multiple ranges. For example, with the skewed join
optimization, a task may map partition ranges to subpartition ranges as
[9,10] -> [1,1], [1,2] -> [2,2], which means the task processes exactly a
portion of the data for subpartition group 1 and a portion for
subpartition group 2.
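To illustrate why counting must happen after merging: overlapping keys such as [0,10] and [0,5] cover 11 distinct shuffle descriptors, not 17, while disjoint keys like the skewed-join example stay as separate ranges. Below is a minimal, self-contained sketch of that merge step; `IndexRange` here is a hypothetical stand-in with the same start/end/size shape, not Flink's actual class, and the merge logic only approximates what `IndexRangeUtil.mergeIndexRanges` is expected to do (sort by start, then coalesce overlapping or adjacent ranges).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

public class MergeRangesSketch {
    // Inclusive [start, end] range, mirroring the [a,b] notation in the comment above.
    record IndexRange(int start, int end) {
        int size() {
            return end - start + 1;
        }
    }

    // Sort by start index, then coalesce ranges that overlap or touch.
    static List<IndexRange> mergeIndexRanges(Collection<IndexRange> ranges) {
        List<IndexRange> sorted = new ArrayList<>(ranges);
        sorted.sort(Comparator.comparingInt(IndexRange::start));
        List<IndexRange> merged = new ArrayList<>();
        for (IndexRange r : sorted) {
            if (!merged.isEmpty() && r.start() <= merged.get(merged.size() - 1).end() + 1) {
                IndexRange last = merged.remove(merged.size() - 1);
                merged.add(new IndexRange(last.start(), Math.max(last.end(), r.end())));
            } else {
                merged.add(r);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Overlapping keys as in the ALL_TO_ALL comment: [0,10] and [0,5] merge to [0,10].
        List<IndexRange> overlapping =
                mergeIndexRanges(Arrays.asList(new IndexRange(0, 10), new IndexRange(0, 5)));
        int numConsumedShuffleDescriptors =
                overlapping.stream().mapToInt(IndexRange::size).sum();
        System.out.println(overlapping + " -> " + numConsumedShuffleDescriptors); // 11, not 17

        // Disjoint keys as in the skewed-join example: [9,10] and [1,2] stay separate.
        List<IndexRange> disjoint =
                mergeIndexRanges(Arrays.asList(new IndexRange(9, 10), new IndexRange(1, 2)));
        System.out.println(disjoint.size() + " ranges remain after merging"); // 2
    }
}
```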
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]