otterc commented on a change in pull request #30062:
URL: https://github.com/apache/spark/pull/30062#discussion_r510537792



##########
File path: common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java
##########
@@ -172,7 +178,9 @@ protected void serviceInit(Configuration conf) throws Exception {
       }
 
       TransportConf transportConf = new TransportConf("shuffle", new HadoopConfigProvider(conf));
-      blockHandler = new ExternalBlockHandler(transportConf, registeredExecutorFile);
+      shuffleMergeManager = new RemoteBlockPushResolver(transportConf, APP_BASE_RELATIVE_PATH);

Review comment:
       On the client side, we have added a configuration `spark.shuffle.push.enabled`. It is per application and defaults to `false`, so each application decides whether it wants to run with push-based shuffle enabled.
   If we make this configurable on the server side, how would we enforce that an application doesn't attempt push-based shuffle when the servers don't support it?
   Also, I think changing server-side configurations is a bit of a hassle. We run Spark on YARN, and every server-side change requires restarting all the NodeManagers.
   Do you have any concerns that the `shuffleMergeManager` interferes with regular shuffle?
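
   For reference, here is a minimal sketch of what the per-application opt-in looks like from the client side, assuming only the `spark.shuffle.push.enabled` flag mentioned above; the app name and session setup are illustrative, not taken from this PR:

   ```java
   // Minimal sketch: per-application opt-in to push-based shuffle on the client
   // side. Assumes the client-side flag `spark.shuffle.push.enabled` discussed
   // above; it defaults to false, so apps that don't set it keep regular shuffle.
   import org.apache.spark.SparkConf;
   import org.apache.spark.sql.SparkSession;

   public class PushShuffleOptIn {
     public static void main(String[] args) {
       SparkConf conf = new SparkConf()
           .setAppName("push-shuffle-opt-in")          // illustrative app name
           .set("spark.shuffle.push.enabled", "true"); // opt this app in

       SparkSession spark = SparkSession.builder().config(conf).getOrCreate();
       // ... run shuffle-heavy stages here; apps that leave the flag at its
       // default continue to use regular shuffle ...
       spark.stop();
     }
   }
   ```

   Because the opt-in lives in the application's own SparkConf, toggling it does not require any NodeManager restart, which is the point about server-side configuration being a hassle.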




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


