noorall commented on code in PR #25551:
URL: https://github.com/apache/flink/pull/25551#discussion_r1896526945
##########
flink-runtime/src/main/java/org/apache/flink/runtime/deployment/ConsumedSubpartitionContext.java:
##########
@@ -118,11 +141,23 @@ public static ConsumedSubpartitionContext
buildConsumedSubpartitionContext(
partitions[partitionRange.getStartIndex()]),
partitionIdToShuffleDescriptorIndexMap.get(
partitions[partitionRange.getEndIndex()]));
- checkState(partitionRange.size() == shuffleDescriptorRange.size());
- numConsumedShuffleDescriptors += shuffleDescriptorRange.size();
+            checkState(
+                    partitionRange.size() == shuffleDescriptorRange.size()
+                            && !consumedShuffleDescriptorToSubpartitionRangeMap.containsKey(
+                                    shuffleDescriptorRange));
consumedShuffleDescriptorToSubpartitionRangeMap.put(
shuffleDescriptorRange, subpartitionRange);
}
+        // For ALL_TO_ALL, there might be overlaps in shuffle descriptor to subpartition range map:
+        // [0,10] -> [2,2], [0,5] -> [3,3], so we need to count consumed shuffle descriptors after
+        // merging.
+        int numConsumedShuffleDescriptors = 0;
+        List<IndexRange> mergedConsumedShuffleDescriptor =
+                IndexRangeUtil.mergeIndexRanges(
+                        consumedShuffleDescriptorToSubpartitionRangeMap.keySet());
Review Comment:
> Will this result list contain only one range?

It may contain multiple ranges. For example, with the skewed join
optimization, a task may map partition ranges to subpartition ranges as
[9,10] -> [1,1], [1,2] -> [2,2], which means the task processes exactly a
portion of the data for subpartition group 1 and a portion for
subpartition group 2.
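To illustrate why counting must happen after merging: overlapping keys such as [0,10] and [0,5] cover 11 distinct shuffle descriptors, not 17, while disjoint keys like the skewed-join example stay as separate ranges. Below is a minimal, self-contained sketch of that merge step; `IndexRange` here is a hypothetical stand-in with the same start/end/size shape, not Flink's actual class, and the merge logic only approximates what `IndexRangeUtil.mergeIndexRanges` is expected to do (sort by start, then coalesce overlapping or adjacent ranges).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

public class MergeRangesSketch {
    // Inclusive [start, end] range, mirroring the [a,b] notation in the comment above.
    record IndexRange(int start, int end) {
        int size() {
            return end - start + 1;
        }
    }

    // Sort by start index, then coalesce ranges that overlap or touch.
    static List<IndexRange> mergeIndexRanges(Collection<IndexRange> ranges) {
        List<IndexRange> sorted = new ArrayList<>(ranges);
        sorted.sort(Comparator.comparingInt(IndexRange::start));
        List<IndexRange> merged = new ArrayList<>();
        for (IndexRange r : sorted) {
            if (!merged.isEmpty() && r.start() <= merged.get(merged.size() - 1).end() + 1) {
                IndexRange last = merged.remove(merged.size() - 1);
                merged.add(new IndexRange(last.start(), Math.max(last.end(), r.end())));
            } else {
                merged.add(r);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Overlapping keys as in the ALL_TO_ALL comment: [0,10] and [0,5] merge to [0,10].
        List<IndexRange> overlapping =
                mergeIndexRanges(Arrays.asList(new IndexRange(0, 10), new IndexRange(0, 5)));
        int numConsumedShuffleDescriptors =
                overlapping.stream().mapToInt(IndexRange::size).sum();
        System.out.println(overlapping + " -> " + numConsumedShuffleDescriptors); // 11, not 17

        // Disjoint keys as in the skewed-join example: [9,10] and [1,2] stay separate.
        List<IndexRange> disjoint =
                mergeIndexRanges(Arrays.asList(new IndexRange(9, 10), new IndexRange(1, 2)));
        System.out.println(disjoint.size() + " ranges remain after merging"); // 2
    }
}
```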
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]