zhouyejoe commented on a change in pull request #32007:
URL: https://github.com/apache/spark/pull/32007#discussion_r644436170



##########
File path: core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala
##########
@@ -153,6 +189,59 @@ private[spark] class DiskBlockManager(conf: SparkConf, deleteFilesOnStop: Boolea
     }
   }
 
+  /**
+   * Get the list of configured local dirs storing merged shuffle blocks created by executors
+   * if push based shuffle is enabled. Note that the files in this directory will be created
+   * by the external shuffle services. We only create the merge_manager directories and
+   * subdirectories here because currently the shuffle service doesn't have permission to
+   * create directories under application local directories.
+   */
+  private def createLocalDirsForMergedShuffleBlocks(conf: SparkConf): Option[Array[File]] = {
+    if (Utils.isPushBasedShuffleEnabled(conf)) {
+      // Will create the merge_manager directory only if it doesn't exist under any local dir.
+      val localDirs = Utils.getConfiguredLocalDirs(conf)
+      var mergeDirCreated = false;
+      for (rootDir <- localDirs) {
+        val mergeDir = new File(rootDir, MERGE_MANAGER_DIR)
+        if (mergeDir.exists()) {
+          logDebug(s"Not creating $mergeDir as it already exists")
+          mergeDirCreated = true
+        }
+      }

Review comment:
   The original logic:
   Loop over each local dir. If a merge_dir created by another executor already
   exists under any of them, this executor does not create a merge_dir under any
   of its local dirs.
   Potential issue:
   Suppose Executor1 created the merge_dirs under /tmp/[a-c]. Executor2 launches
   slightly later and gets local dirs /tmp/[b-d]. Executor2 does not create
   merge_dirs in any of them, because it finds that another executor has already
   created the merge_dir in /tmp/b. But if the executor registration message from
   Executor2 is handled before Executor1's, the shuffle service will use
   /tmp/[b-d] as the merge dirs, even though the merge_dir under /tmp/d was never
   created.
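
To make the scenario concrete, here is a minimal standalone sketch (not code from this PR) that replays it against temp directories standing in for /tmp/[a-d]. `MergeDirName`, `createIfNoneExists`, and `missingMergeDirs` are hypothetical names chosen for illustration, and the "skip if any merge dir exists" rule is my paraphrase of the original logic described above.

```scala
import java.io.File
import java.nio.file.Files

object MergeDirRaceSketch {
  // Hypothetical stand-in for the MERGE_MANAGER_DIR constant used in the diff.
  val MergeDirName = "merge_manager"

  // Paraphrase of the original logic: if any local dir already contains a
  // merge dir, assume another executor handled it and create nothing.
  def createIfNoneExists(localDirs: Seq[File]): Unit = {
    val alreadyCreated = localDirs.exists(d => new File(d, MergeDirName).exists())
    if (!alreadyCreated) {
      localDirs.foreach(d => new File(d, MergeDirName).mkdirs())
    }
  }

  // Local dirs that still have no merge dir underneath them.
  def missingMergeDirs(localDirs: Seq[File]): Seq[File] =
    localDirs.filterNot(d => new File(d, MergeDirName).isDirectory)

  // Create one of the stand-in local dirs under a temp root.
  private def mkLocalDir(root: File, name: String): File = {
    val dir = new File(root, name)
    dir.mkdirs()
    dir
  }

  def main(args: Array[String]): Unit = {
    val root = Files.createTempDirectory("merge-dir-race").toFile
    val a = mkLocalDir(root, "a")
    val b = mkLocalDir(root, "b")
    val c = mkLocalDir(root, "c")
    val d = mkLocalDir(root, "d")

    createIfNoneExists(Seq(a, b, c)) // Executor1: creates merge dirs in a, b, c
    createIfNoneExists(Seq(b, c, d)) // Executor2: sees b's merge dir, creates nothing

    // If the shuffle service picks Executor2's dirs, d has no merge dir.
    println(missingMergeDirs(Seq(b, c, d)).map(_.getName))
  }
}
```

Running `main` reports `d` as the only local dir of Executor2 without a merge dir, which is the gap the shuffle service would hit.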
   
   The updated logic:
   Every executor should try to create the merge_dir under each of its local
   dirs. This guarantees that, no matter which ExecutorRegister message the
   shuffle service receives first, the merge_dir exists with permission 770.
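
A rough sketch of what "every executor tries to create it" could look like, assuming a POSIX file system and that all executors of the application run as the same user. This is an illustration only, not the implementation in this PR: `MergeDirName` and `ensureMergeDirs` are hypothetical names, and it uses `java.nio.file` directly rather than any Spark helper.

```scala
import java.io.File
import java.nio.file.{Files, Path}
import java.nio.file.attribute.PosixFilePermissions

object MergeDirCreateSketch {
  // Hypothetical stand-in for the MERGE_MANAGER_DIR constant used in the diff.
  val MergeDirName = "merge_manager"

  // Try to create the merge dir under every local dir, regardless of what
  // other executors may already have created elsewhere.
  def ensureMergeDirs(localDirs: Seq[File]): Seq[Path] = {
    // 770: read/write/execute for owner and group, nothing for others.
    val perms = PosixFilePermissions.fromString("rwxrwx---")
    localDirs.map { rootDir =>
      val mergeDir = rootDir.toPath.resolve(MergeDirName)
      // createDirectories does not fail if the directory already exists,
      // so a second executor calling this on a shared local dir is harmless.
      Files.createDirectories(mergeDir)
      // Set the mode explicitly, since creation-time permissions are
      // filtered by the process umask. Assumes a POSIX file system.
      Files.setPosixFilePermissions(mergeDir, perms)
      mergeDir
    }
  }
}
```

With this shape, Executor2 in the scenario above would still create the merge_dir under /tmp/d even though /tmp/b already has one, so the order in which registrations are handled no longer matters.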
   @otterc
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.



