[GitHub] [spark] otterc commented on a change in pull request #33034: WIP: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

GitBox Thu, 24 Jun 2021 22:43:49 -0700


otterc commented on a change in pull request #33034:
URL: https://github.com/apache/spark/pull/33034#discussion_r658159431




##########
File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java
##########
@@ -222,7 +223,7 @@ public void sendMergedBlockMetaReq(
     handler.addRpcRequest(requestId, callback);
     RpcChannelListener listener = new RpcChannelListener(requestId, callback);
     channel.writeAndFlush(
-      new MergedBlockMetaRequest(requestId, appId, shuffleId, reduceId)).addListener(listener);
+      new MergedBlockMetaRequest(requestId, appId, shuffleId, shuffleSequenceId, reduceId)).addListener(listener);

Review comment:
   > I still prefer passing shuffleSequenceId in the fetch protocol as it feels simpler and straightforward to me. If there are other strong reasons for not doing so other than the ones mentioned earlier, 
   
   As I mentioned in my earlier 
[comment](https://github.com/apache/spark/pull/33034#discussion_r657623667), 
IMO these are strong enough reasons not to modify the existing protocol:
   - It adds a field that is not really needed. The change also trickles down 
to various other classes and APIs such as `BlockStoreClient`, 
`ExternalBlockStoreClient`, `OneForOneBlocker`, `ExternalBlockHandler`, etc. 
The alternative is a simple, self-contained change that tracks this 
metadata in `RemoteBlockPushResolver`.
   
   - Adding this field to the message suggests that a fetch request could be 
for an older `shuffleSequenceId`. You said we can add a clarifying comment, but 
if there is no need to make such a change, what is the point of adding a 
comment to explain all of this?
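
To illustrate the alternative described above, here is a minimal sketch of tracking the sequence id on the server side so that the fetch message needs no extra field. It is not the actual `RemoteBlockPushResolver` code: the class `ShuffleSequenceTracker`, its methods, the string key, and the assumption that a newer stage attempt always carries a larger `shuffleSequenceId` are all hypothetical and exist only for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only; none of these names come from Spark.
final class ShuffleSequenceTracker {
  // Latest sequence id seen per (appId, shuffleId), keyed as "appId:shuffleId".
  private final Map<String, Integer> latest = new ConcurrentHashMap<>();

  private static String key(String appId, int shuffleId) {
    return appId + ":" + shuffleId;
  }

  // Called on the push path, which is assumed to carry the sequence id.
  // Assumption: larger ids are newer, so stale pushes never lower the value.
  void recordPush(String appId, int shuffleId, int shuffleSequenceId) {
    latest.merge(key(appId, shuffleId), shuffleSequenceId, Integer::max);
  }

  // Called on the fetch path. The request carries no sequence id; the
  // server resolves it from its own metadata. Returns -1 if nothing is tracked.
  int resolveForFetch(String appId, int shuffleId) {
    return latest.getOrDefault(key(appId, shuffleId), -1);
  }
}
```

With something along these lines, a fetch is always answered from the newest tracked sequence, so the request cannot refer to an older `shuffleSequenceId` and the fetch-side classes stay unchanged.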
   




