mridulm commented on a change in pull request #33034:
URL: https://github.com/apache/spark/pull/33034#discussion_r680622186
##########
File path:
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java
##########
@@ -188,49 +228,70 @@ private AppShufflePartitionInfo getOrCreateAppShufflePartitionInfo(
AppShufflePartitionInfo newAppShufflePartitionInfo(
String appId,
int shuffleId,
+ int shuffleMergeId,
int reduceId,
File dataFile,
File indexFile,
File metaFile) throws IOException {
- return new AppShufflePartitionInfo(appId, shuffleId, reduceId, dataFile,
+ return new AppShufflePartitionInfo(appId, shuffleId, shuffleMergeId, reduceId, dataFile,
new MergeShuffleFile(indexFile), new MergeShuffleFile(metaFile));
}
@Override
- public MergedBlockMeta getMergedBlockMeta(String appId, int shuffleId, int reduceId) {
+ public MergedBlockMeta getMergedBlockMeta(
+ String appId,
+ int shuffleId,
+ int shuffleMergeId,
+ int reduceId) {
AppShuffleInfo appShuffleInfo = validateAndGetAppShuffleInfo(appId);
+ AppShuffleMergePartitionsInfo partitionsInfo = appShuffleInfo.shuffles.get(shuffleId);
+ if (null != partitionsInfo && partitionsInfo.shuffleMergeId > shuffleMergeId) {
Review comment:
Note: the `partitionsInfo.shuffleMergeId < shuffleMergeId` case is only
handled implicitly here (the index file won't exist in the check below).
Instead, let us change the `>` check to `!=` so that any mismatch is rejected explicitly.
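
To illustrate the difference, here is a rough standalone sketch, not the actual resolver code. `MergeIdCheckSketch`, `PartitionsInfo`, and both helper methods are hypothetical names made up for this example; only the field name `shuffleMergeId` and the two comparisons come from the diff above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of the check under review (not Spark code).
class MergeIdCheckSketch {
  // Stand-in for AppShuffleMergePartitionsInfo; carries only the field used here.
  static final class PartitionsInfo {
    final int shuffleMergeId;

    PartitionsInfo(int shuffleMergeId) {
      this.shuffleMergeId = shuffleMergeId;
    }
  }

  // Stand-in for appShuffleInfo.shuffles, keyed by shuffleId.
  static final Map<Integer, PartitionsInfo> shuffles = new ConcurrentHashMap<>();

  // Check as written in the diff: only a request older than the tracked merge is rejected.
  static boolean rejectedByGreaterThan(int shuffleId, int shuffleMergeId) {
    PartitionsInfo info = shuffles.get(shuffleId);
    return null != info && info.shuffleMergeId > shuffleMergeId;
  }

  // Stricter check suggested here: any mismatch, older or newer, is rejected.
  static boolean rejectedByNotEquals(int shuffleId, int shuffleMergeId) {
    PartitionsInfo info = shuffles.get(shuffleId);
    return null != info && info.shuffleMergeId != shuffleMergeId;
  }
}
```

With a tracked `shuffleMergeId` of 2, both variants reject a request for 1 and accept a request for 2, but only the `!=` variant rejects a request for 3 up front; the `>` variant lets it fall through to the later index file check.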
+CC @Ngone51 I am reviewing all use of `msg.shuffleMergeId` based on the
issue you noticed below.
PTAL
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]