otterc commented on a change in pull request #33034:
URL: https://github.com/apache/spark/pull/33034#discussion_r657550779



##########
File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java
##########
@@ -222,7 +223,7 @@ public void sendMergedBlockMetaReq(
     handler.addRpcRequest(requestId, callback);
     RpcChannelListener listener = new RpcChannelListener(requestId, callback);
     channel.writeAndFlush(
-      new MergedBlockMetaRequest(requestId, appId, shuffleId, reduceId)).addListener(listener);
+      new MergedBlockMetaRequest(requestId, appId, shuffleId, shuffleSequenceId, reduceId)).addListener(listener);

Review comment:
       The issue is not the name of this field, nor is it just a matter of
adding a comment. You are proposing to add a field to these messages, which
suggests that they will be used to fetch shuffle data for an older
`shuffleSequenceId`, and that is never the case. If the reducers always fetch
the latest `shuffleSequenceId` for a given `shuffleId`, then the server can
keep a small amount of metadata and figure this out itself. As mentioned
offline, this metadata will be less than a few KB.
   
   > Isn't it similar like the mapId set in the fetch message for regular shuffle?
   
   I don't see any similarity between this and the `mapId` in the regular
fetch message.
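
To illustrate the server-side bookkeeping suggested in the first paragraph, here is a minimal sketch. It assumes the server learns the `shuffleSequenceId` from incoming pushes and that reducers only ever need the highest one per `shuffleId`. The class and method names (`LatestShuffleSequenceTracker`, `recordPush`, `latestFor`) are hypothetical and are not part of Spark.

```java
import java.util.OptionalInt;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch, not Spark code: tracks the latest shuffleSequenceId
// per (appId, shuffleId) so a fetch request would not need to carry it.
final class LatestShuffleSequenceTracker {
  // One small entry per (appId, shuffleId) pair.
  private final ConcurrentMap<String, Integer> latest = new ConcurrentHashMap<>();

  private static String key(String appId, int shuffleId) {
    return appId + "_" + shuffleId;
  }

  // Called when a push arrives; keeps only the highest sequence id seen so far.
  void recordPush(String appId, int shuffleId, int shuffleSequenceId) {
    latest.merge(key(appId, shuffleId), shuffleSequenceId, Integer::max);
  }

  // Called when a reducer asks for merged block metadata; returns the
  // sequence id the server would resolve on its own, or empty if unknown.
  OptionalInt latestFor(String appId, int shuffleId) {
    Integer id = latest.get(key(appId, shuffleId));
    return id == null ? OptionalInt.empty() : OptionalInt.of(id);
  }
}
```

With this in place, a request that carries only `appId`, `shuffleId`, and `reduceId` is enough for the server to pick the latest sequence. The cost is one integer per active shuffle, which is in line with the "less than a few KB" estimate above.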






