GitHub user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2002#discussion_r16373718
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -451,10 +451,56 @@ private[spark] object Utils extends Logging {
       /**
        * Get a temporary directory using Spark's spark.local.dir property, if set. This will always
        * return a single directory, even though the spark.local.dir property might be a list of
    -   * multiple paths.
    +   * multiple paths.  If the SPARK_LOCAL_DIRS environment variable is set, then this will return
    +   * a directory from that variable.
        */
       def getLocalDir(conf: SparkConf): String = {
    -    conf.get("spark.local.dir", System.getProperty("java.io.tmpdir")).split(',')(0)
    +    getOrCreateLocalRootDirs(conf)(0)
    +  }
    +
    +  /**
    +   * Gets or creates the directories listed in spark.local.dir or SPARK_LOCAL_DIRS,
    +   * and returns only the directories that exist / could be created.
    +   */
    +  private[spark] def getOrCreateLocalRootDirs(conf: SparkConf): Array[String] = {
    +    val isYarn = java.lang.Boolean.valueOf(
    +      System.getProperty("SPARK_YARN_MODE", conf.getenv("SPARK_YARN_MODE")))
    --- End diff --
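    As an aside on the `isYarn` check above: `java.lang.Boolean.valueOf` maps `null` (and anything other than a case-insensitive `"true"`) to `false`, so the chained lookup quietly defaults to "not YARN" when neither the system property nor the environment variable is set. A minimal standalone sketch of that semantics (the helper name and parameters are mine, standing in for `System.getProperty` / `conf.getenv`):

    ```scala
    // Sketch of the null-tolerant YARN-mode check from the diff above.
    // `prop` stands in for System.getProperty("SPARK_YARN_MODE", env), which
    // falls back to `env` when the property is unset; either may be null.
    object YarnModeCheck {
      def isYarn(prop: String, env: String): Boolean =
        // Boolean.valueOf(null) == false, so "unset everywhere" means non-YARN mode
        java.lang.Boolean.valueOf(if (prop != null) prop else env)
    }
    ```

    For example, `YarnModeCheck.isYarn(null, null)` is `false`, while setting either source to `"true"` flips it.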
    
    There's also code in the YARN ApplicationMaster and ExecutorLauncher classes that sets the `spark.local.dir` property:
    
    ```
        // Setup the directories so things go to yarn approved directories rather
        // than user specified and /tmp.
        System.setProperty("spark.local.dir", getLocalDirs())
    ```
    
    The `getLocalDirs()` code is now duplicated in several places, too:
    
    ```
    [joshrosen Spark (master)]$ git grep -A4 "def getLocalDirs"
    yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:  private def getLocalDirs(): String = {
    yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala-    // Hadoop 0.23 and 2.x have different Environment variable names for the
    yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala-    // local dirs, so lets check both. We assume one of the 2 is set.
    yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala-    // LOCAL_DIRS => 2.X, YARN_LOCAL_DIRS => 0.23.X
    yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala-    val localDirs = Option(System.getenv("YARN_LOCAL_DIRS"))
    --
    yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala:  private def getLocalDirs(): String = {
    yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala-    // Hadoop 0.23 and 2.x have different Environment variable names for the
    yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala-    // local dirs, so lets check both. We assume one of the 2 is set.
    yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala-    // LOCAL_DIRS => 2.X, YARN_LOCAL_DIRS => 0.23.X
    yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala-    val localDirs = Option(System.getenv("YARN_LOCAL_DIRS"))
    --
    yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala:  private def getLocalDirs(): String = {
    yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala-    // Hadoop 0.23 and 2.x have different Environment variable names for the
    yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala-    // local dirs, so lets check both. We assume one of the 2 is set.
    yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala-    // LOCAL_DIRS => 2.X, YARN_LOCAL_DIRS => 0.23.X
    yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala-    val localDirs = Option(System.getenv("YARN_LOCAL_DIRS"))
    --
    yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala:  private def getLocalDirs(): String = {
    yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala-    // Hadoop 0.23 and 2.x have different Environment variable names for the
    yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala-    // local dirs, so lets check both. We assume one of the 2 is set.
    yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala-    // LOCAL_DIRS => 2.X, YARN_LOCAL_DIRS => 0.23.X
    yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala-    val localDirs = Option(System.getenv("YARN_LOCAL_DIRS"))
    ```
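    One way to de-duplicate these four copies would be a single shared helper. Purely a sketch with hypothetical names (not the PR's actual refactoring); it takes the environment as a `Map` so it can be exercised without real environment variables, and checks `YARN_LOCAL_DIRS` (Hadoop 0.23.x) before `LOCAL_DIRS` (Hadoop 2.x), mirroring the duplicated code:

    ```scala
    // Hypothetical shared helper consolidating the four getLocalDirs() copies.
    object YarnLocalDirs {
      def getLocalDirs(env: Map[String, String]): String =
        // Hadoop 0.23.x and 2.x use different env var names; check both.
        env.get("YARN_LOCAL_DIRS")               // 0.23.X
          .orElse(env.get("LOCAL_DIRS"))         // 2.X
          .getOrElse(throw new Exception("YARN local dirs not set"))
    }
    ```

    For example, `YarnLocalDirs.getLocalDirs(Map("LOCAL_DIRS" -> "/data1,/data2"))` returns `"/data1,/data2"`. In real code the call sites would pass `sys.env`.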
    
    I'm not sure whether this code actually affects anything, since it sets the `spark.local.dir` system property _after_ the SparkConf has already been created, and `spark.local.dir` is only consumed through SparkConf, never by reading the system property directly.
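    To illustrate the ordering problem with a toy model (`MiniConf` is hypothetical, not Spark's `SparkConf`, but it mimics the relevant behavior of snapshotting `spark.*` system properties once, at construction time):

    ```scala
    // Toy model of a conf that copies spark.* system properties at construction.
    // A System.setProperty call made AFTER the conf is built is never observed
    // through the conf, which is why the late setProperty above may be a no-op.
    class MiniConf {
      private val settings: Map[String, String] =
        sys.props.toMap.filter { case (k, _) => k.startsWith("spark.") }
      def get(key: String, default: String): String =
        settings.getOrElse(key, default)
    }
    ```

    Building a `MiniConf` first and then calling `System.setProperty("spark.local.dir", ...)` leaves the conf's view unchanged.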
