Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21468#discussion_r197971721
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -811,10 +811,18 @@ private[spark] class Client(
     
         // Finally, update the Spark config to propagate PYTHONPATH to the AM and executors.
         if (pythonPath.nonEmpty) {
    -      val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
    +      val pythonPathStr = (sys.env.get("PYTHONPATH") ++=: pythonPath)
             .mkString(ApplicationConstants.CLASS_PATH_SEPARATOR)
    -      env("PYTHONPATH") = pythonPathStr
    -      sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +      val newValue =
    +        if (env.contains("PYTHONPATH")) {
    +          env("PYTHONPATH") + ApplicationConstants.CLASS_PATH_SEPARATOR  + pythonPathStr
    +        } else {
    +          pythonPathStr
    +        }
    +      env("PYTHONPATH") = newValue
    +      if (!sparkConf.getExecutorEnv.toMap.contains("PYTHONPATH")) {
    --- End diff --
    
    I see that the previous code was overriding this in the executor env, but
perhaps the right thing here is to concatenate them; otherwise the executor
might be missing the py4j/pyspark entries this class adds.
    
    So, basically, what you want is (with pp = "PYTHONPATH"):
    
    - driver: env.get(pp) ++ sys.env.get(pp) ++ pythonPath
    - executor: pythonPath ++ sparkConf.getExecutorEnv(pp)
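
    As a rough sketch of the two concatenations above (not the actual patch):
the object and method names are hypothetical, plain Maps stand in for the
launch env and sys.env, a Seq of pairs stands in for what
sparkConf.getExecutorEnv returns, and Sep is an assumed stand-in for
ApplicationConstants.CLASS_PATH_SEPARATOR.

```scala
object PythonPathSketch {
  // Assumed stand-in for ApplicationConstants.CLASS_PATH_SEPARATOR.
  val Sep = "<CPS>"
  val PP = "PYTHONPATH"

  // driver: env.get(pp) ++ sys.env.get(pp) ++ pythonPath
  // `env` models the AM launch environment, `sysEnv` models sys.env.
  def driverPythonPath(
      env: Map[String, String],
      sysEnv: Map[String, String],
      pythonPath: Seq[String]): String =
    (env.get(PP).toSeq ++ sysEnv.get(PP).toSeq ++ pythonPath).mkString(Sep)

  // executor: pythonPath ++ sparkConf.getExecutorEnv(pp)
  // `executorEnv` models the key/value pairs already set for executors.
  def executorPythonPath(
      pythonPath: Seq[String],
      executorEnv: Seq[(String, String)]): String =
    (pythonPath ++ executorEnv.toMap.get(PP).toSeq).mkString(Sep)
}
```

    With this shape, a user-supplied executor PYTHONPATH is appended after the
py4j/pyspark entries instead of replacing them, and a missing value on any
side simply contributes nothing.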



---
