[ 
https://issues.apache.org/jira/browse/SPARK-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14725657#comment-14725657
 ] 

Iulian Dragos commented on SPARK-9708:
--------------------------------------

I'm not sure this is the entire story. Remember that shuffle files need to 
outlive the executor when dynamic allocation is enabled. With the proposed 
change, if the scheduler decides to kill an executor, its shuffle files will be 
gone with the sandbox, and the external shuffle service won't be able to find 
them anymore. At the very least, shuffle files need to go in another directory, 
not under the sandbox.

Also, Spark lets users configure `spark.local.dir`, and that setting should 
take precedence over the sandbox. In the Hadoop world, it is commonly used to 
list several directories on different physical disks (to allow fast parallel 
writes).
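
To make the two concerns above concrete, here is a minimal sketch of one possible resolution order. It is not Spark's actual implementation: `resolve_local_dirs` is a hypothetical helper, the configuration is modelled as a plain dict, and the sketch assumes the sandbox path is exposed through the `MESOS_SANDBOX` environment variable.

```python
import tempfile


def resolve_local_dirs(conf, env):
    """Hypothetical helper: choose local scratch directories for an executor."""
    # 1. An explicit spark.local.dir always wins. It may name several
    #    comma-separated directories, e.g. one per physical disk.
    configured = conf.get("spark.local.dir", "")
    dirs = [d.strip() for d in configured.split(",") if d.strip()]
    if dirs:
        return dirs

    # 2. Use the Mesos sandbox only when no external shuffle service is in
    #    play. Otherwise shuffle files would vanish with the sandbox when
    #    the executor is killed.
    shuffle_service = (
        conf.get("spark.shuffle.service.enabled", "false").lower() == "true"
    )
    sandbox = env.get("MESOS_SANDBOX")  # assumption: set by Mesos for the task
    if sandbox and not shuffle_service:
        return [sandbox]

    # 3. Otherwise fall back to a temporary directory on the host.
    return [tempfile.gettempdir()]
```

With this ordering, a user who sets `spark.local.dir` to several disks keeps that layout, and the sandbox is only used when nothing needs the files after the executor exits.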

> Spark should create local temporary directories in Mesos sandbox when 
> launched with Mesos
> -----------------------------------------------------------------------------------------
>
>                 Key: SPARK-9708
>                 URL: https://issues.apache.org/jira/browse/SPARK-9708
>             Project: Spark
>          Issue Type: Bug
>          Components: Mesos
>            Reporter: Timothy Chen
>
> Currently Spark creates temporary directories with 
> Utils.getConfiguredLocalDirs, which writes to the YARN directories if YARN is 
> detected and otherwise writes to a temporary directory on the host.
> However, Mesos creates a directory per task, and ideally Spark should create 
> its local temporary directories there, since they can then be cleaned up 
> when the task is gone instead of being left on the host until reboot.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

