Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/20799#discussion_r174273641
--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnable.scala ---
@@ -247,6 +241,18 @@ private[yarn] class ExecutorRunnable(
     System.getenv().asScala.filterKeys(_.startsWith("SPARK"))
       .foreach { case (k, v) => env(k) = v }
+
+    sparkConf.getExecutorEnv.foreach { case (key, value) =>
+      if (key == Environment.CLASSPATH.name()) {
+        // If the key of env variable is CLASSPATH, we assume it is a path and append it.
+        // This is kept for backward compatibility and consistency with hadoop.
+        YarnSparkHadoopUtil.addPathToEnvironment(env, key, value)
+      } else {
+        // For other env variables, simply overwrite the value.
+        env(key) = value
+      }
+    }
--- End diff --
@jerryshao I think there is a potential issue with this change: it allows
users to (incorrectly) specify SPARK_LOG_URL_STDERR and SPARK_LOG_URL_STDOUT,
which should be generated by the driver in the "// Add log urls" section
above this code snippet.
Note that there is an existing bug in the code along the same lines: if the
same variables had been present in the driver's env, they would have
overridden the generated values.
It would be good to fix this issue as well as part of this change. The
solution would be to move the '// Add log urls' block below this current
block, as in the sketch that follows.
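Concretely, something like the ordering below. The '// Add log urls' part is
paraphrased from the surrounding code, which is not shown in this diff, so
names such as `container` and `httpScheme` and the exact URL format are
approximate:

    // Rough sketch of the proposed ordering inside prepareEnvironment():

    // 1. Copy SPARK_* variables from the system environment (existing code).
    System.getenv().asScala.filterKeys(_.startsWith("SPARK"))
      .foreach { case (k, v) => env(k) = v }

    // 2. Apply user-specified executor env vars (the block this PR adds).
    sparkConf.getExecutorEnv.foreach { case (key, value) =>
      if (key == Environment.CLASSPATH.name()) {
        YarnSparkHadoopUtil.addPathToEnvironment(env, key, value)
      } else {
        env(key) = value
      }
    }

    // 3. Only now add the log urls, so the driver-generated values cannot
    //    be clobbered by user-specified env (step 2) or inherited env
    //    (step 1).
    container.foreach { c =>
      sys.env.get("SPARK_USER").foreach { user =>
        val containerId = ConverterUtils.toString(c.getId)
        val address = c.getNodeHttpAddress
        val baseUrl = s"$httpScheme$address/node/containerlogs/$containerId/$user"
        env("SPARK_LOG_URL_STDERR") = s"$baseUrl/stderr?start=-4096"
        env("SPARK_LOG_URL_STDOUT") = s"$baseUrl/stdout?start=-4096"
      }
    }

With this ordering, a user-supplied SPARK_LOG_URL_STDOUT or
SPARK_LOG_URL_STDERR would simply be overwritten in step 3.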
---