Ngone51 commented on a change in pull request #33034:
URL: https://github.com/apache/spark/pull/33034#discussion_r677161044
##########
File path:
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java
##########
@@ -648,14 +809,24 @@ public void onData(String streamId, ByteBuffer buf) throws IOException {
 // to disk as well. This way, we avoid having to buffer the entirety of every blocks in
 // memory, while still providing the necessary guarantee.
 synchronized (partitionInfo) {
- Map<Integer, AppShufflePartitionInfo> shufflePartitions =
+ Map<Integer, Map<Integer, AppShufflePartitionInfo>> shuffleMergePartitions =
 appShuffleInfo.partitions.get(partitionInfo.shuffleId);
- // If the partitionInfo corresponding to (appId, shuffleId, reduceId) is no longer present
- // then it means that the shuffle merge has already been finalized. We should thus ignore
- // the data and just drain the remaining bytes of this message. This check should be
- // placed inside the synchronized block to make sure that checking the key is still
- // present and processing the data is atomic.
- if (shufflePartitions == null || !shufflePartitions.containsKey(partitionInfo.reduceId)) {
+ // Older shuffleMergePartitions gets cleaned up when newer shuffleMergeId gets created
Review comment:
> 3. Lets say now shuffleMergeId moves to 3 then shuffleMergePartitions(2) = STALE_SHUFFLE_PARTITIONS and shuffleMergePartitions(1) will be removed.

We don't clean up `shuffleMergePartitions` itself, only its entries (e.g., `shuffleMergePartitions(1)`), right?
My point is that `shuffleMergePartitions` should never be null, whereas `shuffleMergePartitions(shuffleMergeId)` can be null.
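
To make the distinction concrete, here is a minimal standalone sketch of the lifecycle as I read it from the quoted step. It is not the code in this PR: the class name, `advance`, `classify`, the `STALE` marker, and the use of `String` in place of `AppShufflePartitionInfo` are all hypothetical, and the cleanup rule (mark the previous merge id stale, drop anything older) is only my assumption based on the quoted example.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; names and cleanup policy are assumptions, not the PR's code.
final class MergeIdLookupSketch {
  // Stand-in for the STALE_SHUFFLE_PARTITIONS marker mentioned in the thread.
  static final Map<Integer, String> STALE = Collections.emptyMap();

  enum Lookup { ACTIVE, STALE_MARKED, REMOVED }

  // Assumed policy: when newMergeId is created, entries older than the
  // previous id are removed and the previous id is replaced by the marker.
  static void advance(Map<Integer, Map<Integer, String>> shuffleMergePartitions,
                      int newMergeId) {
    shuffleMergePartitions.keySet().removeIf(id -> id < newMergeId - 1);
    if (shuffleMergePartitions.containsKey(newMergeId - 1)) {
      shuffleMergePartitions.put(newMergeId - 1, STALE);
    }
    shuffleMergePartitions.put(newMergeId, new ConcurrentHashMap<>());
  }

  // The outer map is assumed non-null; only the per-merge-id entry may be missing.
  static Lookup classify(Map<Integer, Map<Integer, String>> shuffleMergePartitions,
                         int shuffleMergeId) {
    Map<Integer, String> partitions = shuffleMergePartitions.get(shuffleMergeId);
    if (partitions == null) {
      return Lookup.REMOVED;       // entry was cleaned up, outer map still exists
    }
    if (partitions == STALE) {
      return Lookup.STALE_MARKED;  // identity check against the shared marker
    }
    return Lookup.ACTIVE;
  }

  public static void main(String[] args) {
    Map<Integer, Map<Integer, String>> shuffleMergePartitions = new ConcurrentHashMap<>();
    for (int mergeId = 1; mergeId <= 3; mergeId++) {
      advance(shuffleMergePartitions, mergeId);
    }
    for (int mergeId = 1; mergeId <= 3; mergeId++) {
      System.out.println(mergeId + " -> " + classify(shuffleMergePartitions, mergeId));
    }
  }
}
```

Under these assumptions, the null check that matters is on the inner lookup by `shuffleMergeId`, not on the outer map returned for the `shuffleId`.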
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]