[
https://issues.apache.org/jira/browse/SPARK-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Christophe PRÉAUD updated SPARK-6469:
-------------------------------------
Description:
According to the [Spark YARN doc
page|http://spark.apache.org/docs/latest/running-on-yarn.html#important-notes],
Spark executors use the local directories configured for YARN, not
{{spark.local.dir}}, which is ignored.
Note, however, that in yarn-client mode this applies only to the executors.
The driver does not run on the YARN cluster, so it uses the local directories
defined in {{spark.local.dir}} instead.
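The behaviour described above can be modelled with a minimal Scala sketch. It is not Spark's actual code: the object {{LocalDirsSketch}}, its two helpers, and the example paths and container ID are hypothetical. It assumes that a process inside a YARN container sees the {{CONTAINER_ID}} environment variable and a comma-separated {{LOCAL_DIRS}} variable, and that any other process (such as the yarn-client driver) falls back to the value of {{spark.local.dir}}.

```scala
// Hypothetical model of the behaviour described in this issue; not Spark's code.
object LocalDirsSketch {

  /** Assumption: YARN sets CONTAINER_ID for every process it launches in a container. */
  def inYarnContainer(env: Map[String, String]): Boolean =
    env.contains("CONTAINER_ID")

  /**
   * Returns the scratch directories a process would use.
   * Inside a container: the comma-separated LOCAL_DIRS list (assumed variable name).
   * Outside a container: the configured spark.local.dir value.
   */
  def resolveLocalDirs(env: Map[String, String], sparkLocalDir: String): Seq[String] = {
    val raw =
      if (inYarnContainer(env)) env.getOrElse("LOCAL_DIRS", "")
      else sparkLocalDir
    // Split the list and drop empty entries.
    raw.split(",").map(_.trim).filter(_.nonEmpty).toSeq
  }

  def main(args: Array[String]): Unit = {
    // Executor: runs in a YARN container, so the YARN directories win.
    val executorEnv = Map(
      "CONTAINER_ID" -> "container_0000000000000_0001_01_000002", // made-up ID
      "LOCAL_DIRS"   -> "/yarn/nm/local1,/yarn/nm/local2"         // made-up paths
    )
    // yarn-client driver: runs outside the cluster, so no container variables.
    val driverEnv = Map.empty[String, String]

    println(resolveLocalDirs(executorEnv, "/tmp/spark-scratch"))
    println(resolveLocalDirs(driverEnv, "/tmp/spark-scratch"))
  }
}
```

Taking the environment as a plain {{Map}} rather than reading {{sys.env}} directly keeps the sketch easy to exercise for both the executor and the driver case on a machine without YARN.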
was:
According to the [Spark YARN doc
page|http://spark.apache.org/docs/latest/running-on-yarn.html#important-notes],
Spark executors will use the local directories configured for YARN, not
spark.local.dir which should be ignored.
While this works correctly in yarn-cluster mode, I've found that it does not
in yarn-client mode.
The problem seems to originate in the method
[isRunningInYarnContainer|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/Utils.scala#L686].
Indeed, I've checked with a simple application that the {{CONTAINER_ID}}
environment variable is correctly set in yarn-cluster mode (to something like
{{container_1426666761810_0151_01_000001}}), but not in yarn-client mode.
> Local directories configured for YARN are not used in yarn-client mode
> ----------------------------------------------------------------------
>
> Key: SPARK-6469
> URL: https://issues.apache.org/jira/browse/SPARK-6469
> Project: Spark
> Issue Type: Documentation
> Components: Spark Core
> Reporter: Christophe PRÉAUD
> Priority: Minor
> Attachments: TestYarnVars.scala