[ 
https://issues.apache.org/jira/browse/SPARK-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weizhong updated SPARK-5663:
----------------------------
    Description: 
In YARN mode, the Client creates appStagingDir on the file system, and the 
AppMaster deletes appStagingDir when it exits. If the file system is HDFS, 
this works fine.

Running Spark on Tachyon requires creating a core-site.xml in 
${SPARK_HOME}/conf, so loading core-site.xml reads 
${SPARK_HOME}/conf/core-site.xml. That file does not set fs.defaultFS, so the 
default file system resolves to the local file system. As a result, in YARN 
mode the Client creates appStagingDir on its own local file system. If the 
Client and the AppMaster are not on the same node, the AppMaster looks on its 
own node's local file system, and the appStagingDir on the Client's node is 
never deleted.
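
The fallback described above can be illustrated with a small model. This is a 
sketch in plain Java, not Hadoop code: the class and method names are 
hypothetical, the property values in main are assumed examples, and the only 
fact taken from Hadoop is that fs.defaultFS defaults to file:/// when no 
loaded configuration file sets it.

```java
import java.net.URI;
import java.util.Map;

// Simplified model (not Hadoop code) of how the default file system is chosen
// from the properties found in the core-site.xml that was loaded.
final class DefaultFsSketch {
    // Hadoop's built-in default for fs.defaultFS.
    static final String BUILT_IN_DEFAULT = "file:///";

    // Return the URI of the file system a client would get, given the
    // properties read from the loaded core-site.xml.
    static URI resolveDefaultFs(Map<String, String> coreSite) {
        return URI.create(coreSite.getOrDefault("fs.defaultFS", BUILT_IN_DEFAULT));
    }

    // True when the resolved file system is the node-local one.
    static boolean isLocal(URI fs) {
        return "file".equals(fs.getScheme());
    }

    public static void main(String[] args) {
        // Assumed contents of a core-site.xml written only for Tachyon.
        Map<String, String> tachyonOnly = Map.of("fs.tachyon.impl", "tachyon.hadoop.TFS");
        // Same file with fs.defaultFS added (host and port are placeholders).
        Map<String, String> withHdfs = Map.of(
                "fs.tachyon.impl", "tachyon.hadoop.TFS",
                "fs.defaultFS", "hdfs://namenode:8020");

        System.out.println(isLocal(resolveDefaultFs(tachyonOnly)));
        System.out.println(isLocal(resolveDefaultFs(withHdfs)));
    }
}
```

With the Tachyon-only properties the model resolves to the local file system, 
which is the situation in which the staging directory ends up on the Client's 
node only.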

To solve this issue, we can either:
1. add an fs.defaultFS setting to ${SPARK_HOME}/conf/core-site.xml so that 
getting the file system returns HDFS, or
2. clean up appStagingDir when the Client exits or stops.
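
Option 2 could look roughly like the following. This is a minimal sketch, not 
the Spark Client code: it uses java.nio.file on a temporary directory so that 
it runs on its own, and the class name, method name, and directory prefix are 
hypothetical. In Spark the deletion would presumably go through Hadoop's 
FileSystem.delete(path, true) instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Sketch of client-side cleanup: the side that created the staging directory
// also removes it when it exits, instead of relying on the AppMaster.
final class StagingDirCleanupSketch {
    // Recursively delete dir if it exists; returns true if it was removed.
    static boolean cleanupStagingDir(Path dir) {
        if (!Files.exists(dir)) {
            return false; // already gone, e.g. the AppMaster deleted it
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order visits children before their parent directories.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the appStagingDir created by the Client.
        Path staging = Files.createTempDirectory("sparkStaging-sketch");
        Files.createFile(staging.resolve("app.jar"));
        try {
            // ... submit and monitor the application here ...
        } finally {
            // Runs whether the Client finishes normally or fails.
            cleanupStagingDir(staging);
        }
        System.out.println("staging dir still exists: " + Files.exists(staging));
    }
}
```

Because the cleanup is a no-op when the directory is already gone, it does not 
conflict with the AppMaster deleting the directory first on a shared file 
system such as HDFS.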

  was:
In YARN mode, the Client creates appStagingDir on the file system, and the 
AppMaster deletes appStagingDir when it exits. If the file system is HDFS, 
this works fine.

But if HADOOP_CONF_DIR is not added to the classpath, the default file system 
is the local file system (FileSystem.get(conf) is used to get the fs). So in 
YARN mode the Client creates appStagingDir on the local file system, and if 
the Client and the AppMaster are not on the same node, the appStagingDir is 
not deleted.


> Delete appStagingDir on local file system
> -----------------------------------------
>
>                 Key: SPARK-5663
>                 URL: https://issues.apache.org/jira/browse/SPARK-5663
>             Project: Spark
>          Issue Type: Improvement
>          Components: YARN
>            Reporter: Weizhong
>            Priority: Minor


