Victsm commented on a change in pull request #29855:
URL: https://github.com/apache/spark/pull/29855#discussion_r497077157



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockHandler.java
##########
@@ -373,6 +427,54 @@ public ManagedBuffer next() {
     }
   }
 
+  /**
+   * Dummy implementation of merged shuffle file manager. Suitable for when push-based shuffle
+   * is not enabled.
+   */
+  private static class NoOpMergedShuffleFileManager implements MergedShuffleFileManager {
+
+    @Override
+    public StreamCallbackWithID receiveBlockDataAsStream(PushBlockStream msg) {
+      throw new UnsupportedOperationException("Cannot handle shuffle block merge");
+    }
+
+    @Override
+    public MergeStatuses finalizeShuffleMerge(FinalizeShuffleMerge msg) throws IOException {
+      throw new UnsupportedOperationException("Cannot handle shuffle block merge");
+    }
+
+    @Override
+    public void registerApplication(String appId, String user) {

Review comment:
   This is still somewhat unclear at the moment, especially what each scheduler needs.
   Our current approach for determining the merged shuffle file directory path is the following:
   
   1. The implementation of MergedShuffleFileManager (the RPC handler for block push requests) is initialized with a directory path pattern that is relative to the executor local dirs (a concept common to all schedulers).
   2. The actual path for storing the merged shuffle files for a given application on a given host is then derived from the local dirs and from the relative path pattern, materialized with the app ID and user ID.
   
   The assumption is that once we know the local dirs for a given app, the rest of the directory path to the merged shuffle files is mostly the same across applications, except for the app ID and user ID portions of the path.
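
To make the two steps above concrete, here is a rough sketch of how the path could be materialized. This is not the code in this PR: the class name `MergedShuffleDirResolver`, the method `resolve`, and the `{user}` / `{appId}` placeholder syntax are all hypothetical and chosen only for illustration. The real pattern format and the way local dirs are obtained may differ per scheduler.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical helper, for illustration only; not part of Spark.
final class MergedShuffleDirResolver {
  // Pattern relative to each executor local dir, e.g. "merged_shuffle/{user}/{appId}".
  // The placeholder syntax is an assumption made for this sketch.
  private final String relativePathPattern;

  MergedShuffleDirResolver(String relativePathPattern) {
    this.relativePathPattern = Objects.requireNonNull(relativePathPattern);
  }

  // Step 2: materialize the pattern with the app ID and user ID, then
  // resolve it against every local dir known for the application.
  List<Path> resolve(List<String> localDirs, String appId, String user) {
    String relative = relativePathPattern
        .replace("{user}", user)
        .replace("{appId}", appId);
    List<Path> mergedDirs = new ArrayList<>();
    for (String localDir : localDirs) {
      mergedDirs.add(Paths.get(localDir).resolve(relative).normalize());
    }
    return mergedDirs;
  }
}
```

With the pattern `merged_shuffle/{user}/{appId}`, local dirs `/data1/local` and `/data2/local`, user `alice`, and app ID `app-123`, this would give one merged shuffle directory per local dir, such as `/data1/local/merged_shuffle/alice/app-123`. Only the local dirs vary by scheduler and host; the rest of the path comes from the shared pattern.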






