Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/20799#discussion_r174273641
--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnable.scala ---
@@ -247,6 +241,18 @@ private[yarn] class ExecutorRunnable(
     System.getenv().asScala.filterKeys(_.startsWith("SPARK"))
       .foreach { case (k, v) => env(k) = v }
+
+    sparkConf.getExecutorEnv.foreach { case (key, value) =>
+      if (key == Environment.CLASSPATH.name()) {
+        // If the key of env variable is CLASSPATH, we assume it is a path and append it.
+        // This is kept for backward compatibility and consistency with hadoop.
+        YarnSparkHadoopUtil.addPathToEnvironment(env, key, value)
+      } else {
+        // For other env variables, simply overwrite the value.
+        env(key) = value
+      }
+    }
--- End diff --
@jerryshao I think there is a potential issue with this change: it allows
users to (incorrectly) specify SPARK_LOG_URL_STDERR and SPARK_LOG_URL_STDOUT,
which should be generated by the driver in the "// Add log urls" section
above this code snippet.
Note that there is an existing bug in the code along the same lines: if the
same variables had been present in the driver's env, they would have
overridden the generated values.
It would be good to fix this issue as well as part of this change. The
solution would be to move the '// Add log urls' block below this current
block, as in the sketch that follows.
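Concretely, something like the ordering below. The '// Add log urls' part is
paraphrased from the surrounding code, which is not shown in this diff, so
names such as `container` and `httpScheme` and the exact URL format are
approximate:

    // Rough sketch of the proposed ordering inside prepareEnvironment():

    // 1. Copy SPARK_* variables from the system environment (existing code).
    System.getenv().asScala.filterKeys(_.startsWith("SPARK"))
      .foreach { case (k, v) => env(k) = v }

    // 2. Apply user-specified executor env vars (the block this PR adds).
    sparkConf.getExecutorEnv.foreach { case (key, value) =>
      if (key == Environment.CLASSPATH.name()) {
        YarnSparkHadoopUtil.addPathToEnvironment(env, key, value)
      } else {
        env(key) = value
      }
    }

    // 3. Only now add the log urls, so the driver-generated values cannot
    //    be clobbered by user-specified env (step 2) or inherited env
    //    (step 1).
    container.foreach { c =>
      sys.env.get("SPARK_USER").foreach { user =>
        val containerId = ConverterUtils.toString(c.getId)
        val address = c.getNodeHttpAddress
        val baseUrl = s"$httpScheme$address/node/containerlogs/$containerId/$user"
        env("SPARK_LOG_URL_STDERR") = s"$baseUrl/stderr?start=-4096"
        env("SPARK_LOG_URL_STDOUT") = s"$baseUrl/stdout?start=-4096"
      }
    }

With this ordering, a user-supplied SPARK_LOG_URL_STDOUT or
SPARK_LOG_URL_STDERR would simply be overwritten in step 3.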
---