[GitHub] [spark] Victsm commented on a change in pull request #30062: [SPARK-32916][SHUFFLE] Implementation of shuffle service that leverages push-based shuffle in YARN deployment mode

GitBox Tue, 27 Oct 2020 10:28:25 -0700


Victsm commented on a change in pull request #30062:
URL: https://github.com/apache/spark/pull/30062#discussion_r512886244




##########
File path: 
common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java
##########
@@ -94,6 +95,9 @@
   static final String STOP_ON_FAILURE_KEY = "spark.yarn.shuffle.stopOnFailure";
   private static final boolean DEFAULT_STOP_ON_FAILURE = false;
 
+  // Used by shuffle merge manager to create merged shuffle files.
+  protected static final String APP_BASE_RELATIVE_PATH = 
"usercache/%s/appcache/%s/";

Review comment:
       This goes back to the conversation we had in #29855 
(https://github.com/apache/spark/pull/29855/files#r497077157)
   
   The assumption right now is that we can use appId and user Id to properly 
distinguish the local dirs given different cluster schedulers.
   In the case of YARN, this information is provided through the following API 
during application registration:
   
https://github.com/apache/spark/blob/f284218dae23bf91e72e221943188cdb85e13dac/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/MergedShuffleFileManager.java#L58-L66
   
   During executor registration, the `blockManagerDir` local dir paths 
information is further provided to `RemoteBlockPushResolver` through the 
following API:
   
https://github.com/apache/spark/blob/f284218dae23bf91e72e221943188cdb85e13dac/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/MergedShuffleFileManager.java#L68-L77
   Since block merge dirs are per-application instead of per-executor, we only 
pick the local dir information for the first executor of a given application 
registered with the shuffle service.
   
   Having the common dir path pattern provided by `YarnShuffleService` to 
`RemoteBlockPushResolver` would then allow `RemoteBlockPushResolver` to 
construct the local dirs used for storing merged shuffle files from the 
`blockManagerDir` local dirs provided during executor registration.
   
   As @otterc mentioned, we don't really need the common dir path pattern if we 
just store the full local dir paths in the hashmap.
   This would also simplify the application registration API in 
`MergeShuffleFileManager` to remove the user ID field, which could make it more 
applicable to other cluster schedulers.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [spark] Victsm commented on a change in pull request #30062: [SPARK-32916][SHUFFLE] Implementation of shuffle service that leverages push-based shuffle in YARN deployment mode

Reply via email to