Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/3233#discussion_r24356487
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -318,6 +314,13 @@ private[spark] class Client(
env("SPARK_YARN_STAGING_DIR") = stagingDir
env("SPARK_USER") =
UserGroupInformation.getCurrentUser().getShortUserName()
+ // Propagate SPARK_HOME to the containers. This is needed for pyspark
to
+ // work, since the executor's PYTHONPATH is built based on the location
+ // of SPARK_HOME.
+
sparkConf.getOption("spark.home").orElse(sys.env.get("SPARK_HOME")).foreach {
path =>
+ env("SPARK_HOME") = path
+ }
--- End diff ---
It's not entirely related to the patch, but without this I just could not
run pyspark apps on YARN as part of testing. I couldn't for the life of me figure
out how SPARK_HOME was propagated to executors before.
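
For context, here's a minimal sketch (not Spark's actual code; the object and
helper names and the py4j archive name are placeholders) of the kind of
executor-side PYTHONPATH derivation that breaks when SPARK_HOME isn't present
in the container environment:

```scala
import java.io.File

object PythonPathSketch {
  // Hypothetical helper: derive an executor PYTHONPATH from SPARK_HOME.
  // Returns None when SPARK_HOME is missing, which is why pyspark on YARN
  // fails without the env propagation added in this diff.
  def pythonPath(env: Map[String, String]): Option[String] =
    env.get("SPARK_HOME").map { home =>
      Seq(
        new File(home, "python").getPath,
        // Placeholder archive name; the real py4j version varies by release.
        new File(home, "python/lib/py4j-0.8.2.1-src.zip").getPath
      ).mkString(File.pathSeparator)
    }

  def main(args: Array[String]): Unit = {
    // With SPARK_HOME=/opt/spark this prints (on Unix):
    // Some(/opt/spark/python:/opt/spark/python/lib/py4j-0.8.2.1-src.zip)
    println(pythonPath(Map("SPARK_HOME" -> "/opt/spark")))
  }
}
```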