Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/6360#discussion_r31770788
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -899,8 +938,21 @@ object Client extends Logging {
// Distribution-defined classpath to add to processes
val ENV_DIST_CLASSPATH = "SPARK_DIST_CLASSPATH"
- // Subdirectory where the user's hadoop config files will be placed.
- val LOCALIZED_HADOOP_CONF_DIR = "__hadoop_conf__"
+ // Subdirectory where the user's Spark and Hadoop config files will be placed.
+ val LOCALIZED_CONF_DIR = "__spark_conf__"
+
+ // Name of the file in the conf archive containing Spark configuration.
+ val SPARK_CONF_FILE = "__spark_conf__.properties"
+
+ // Subdirectory where the user's python files (not archives) will be placed.
+ val LOCALIZED_PYTHON_DIR = "__pyfiles__"
+
+ // Key in SparkConf where to find the executors' PYTHONPATH. This cannot be set using
+ // `SparkConf.setExecutorEnv`, because that would cause it to propagate to the python
+ // code via the configuration, and then override the process's own environment when
+ // launching workers. Since it contains variables that are expanded by YARN, that cannot
+ // happen.
--- End diff ---
this is a little long-winded. Did you mean to say that there may be
variables in this python path that are substituted on the executor side?
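For context, a minimal sketch of the mechanism under discussion (this is not Spark code; the map, key name, and `yarnExpand` helper are stand-ins): the PYTHONPATH value may contain expressions such as `{{PWD}}` that YARN substitutes per container at launch time, so the raw string must survive in configuration rather than being pushed into the process environment early via `SparkConf.setExecutorEnv`.

```scala
// Stand-in for a SparkConf: a simple key -> value map.
val conf = scala.collection.mutable.Map[String, String]()

// A PYTHONPATH whose entries are YARN expressions, expanded per container.
val pythonPath = Seq("{{PWD}}/__pyfiles__", "{{PWD}}/pyspark.zip").mkString(":")

// Stored as plain configuration, the unexpanded expression survives until
// YARN sees it. The key name here is purely illustrative.
conf("yarn.executor.pythonpath") = pythonPath

// Roughly what happens on the executor side: YARN substitutes {{PWD}}
// with the container's working directory before the process starts.
def yarnExpand(value: String, pwd: String): String =
  value.replace("{{PWD}}", pwd)

println(yarnExpand(conf("yarn.executor.pythonpath"), "/container/dir"))
// prints /container/dir/__pyfiles__:/container/dir/pyspark.zip
```

Had the same value been placed in the executor's environment up front, the Python workers would inherit the literal `{{PWD}}` string with no chance for YARN to expand it.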