Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/6752#discussion_r32441036
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -1183,6 +1187,29 @@ object Client extends Logging {
YarnSparkHadoopUtil.addPathToEnvironment(env,
Environment.CLASSPATH.name, path)
/**
+ * Returns the path to be sent to the NM for a local path string.
+ *
+ * This method uses two configuration values:
+ *
+ * - spark.yarn.config.localPath: a string that identifies a portion of the input path that may
+ *   only be valid in the local process.
+ * - spark.yarn.config.clusterPath: a string with which to replace the local path. This may
+ *   contain, for example, env variable references, which will be expanded by the NMs when
+ *   starting containers.
--- End diff --
I find "local" to be very confusing -- local from the viewpoint of which
node? It's really "local" from the viewpoint of the gateway node. (Maybe it
ends up being the same thing, since this is always run on the gateway, but just
looking at this in isolation it's not clear.)
I'm not crazy about these names either, but how about something like
`spark.yarn.config.gatewayPath` and
`spark.yarn.config.gatewayClusterReplacementPath`?
Also, can you expand the initial line of the doc slightly to include more
from the overall PR description? E.g., something like "Returns the path to be
sent to the NM for building the command line to launch spark containers. The
NM will perform variable substitution of the expanded path". I know this is in
your description of `clusterPath`, but I would like it a little more prominent ...
also just a suggestion ...
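
To make the behavior under discussion concrete, here is a minimal sketch of the substitution the doc comment describes. It is not the code from the PR: the object name `PathSketch`, the helper `clusterPathFor`, and the use of a plain `Map` in place of `SparkConf` are all assumptions made for illustration. Only the two config keys come from the diff.

```scala
// Hypothetical sketch, not the PR's implementation.
object PathSketch {
  // Config keys as named in the diff under review.
  val LocalPathKey = "spark.yarn.config.localPath"
  val ClusterPathKey = "spark.yarn.config.clusterPath"

  /**
   * Returns the path to send to the NM for a path that is valid on the
   * gateway node. A plain Map stands in for SparkConf (an assumption).
   */
  def clusterPathFor(conf: Map[String, String], path: String): String = {
    (conf.get(LocalPathKey), conf.get(ClusterPathKey)) match {
      // Replace the gateway-only portion only when both values are set.
      case (Some(local), Some(cluster)) if local.nonEmpty =>
        path.replace(local, cluster)
      // Otherwise pass the path through unchanged.
      case _ => path
    }
  }
}
```

With `localPath` set to `/opt/hadoop` and `clusterPath` set to `$HADOOP_HOME` (both values made up), the input `/opt/hadoop/lib/native` would become `$HADOOP_HOME/lib/native`, leaving the variable reference for the NM to expand when it starts the container.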