[
https://issues.apache.org/jira/browse/SPARK-17233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
carlmartin updated SPARK-17233:
-------------------------------
Summary: Shuffle file will be left over the capacity of disk when dynamic
schedule is enabled in a long running case. (was: Shuffle file will be left
over the capacity when dynamic schedule is enabled in a long running case.)
> Shuffle file will be left over the capacity of disk when dynamic schedule is
> enabled in a long running case.
> ------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-17233
> URL: https://issues.apache.org/jira/browse/SPARK-17233
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.5.2, 1.6.2, 2.0.0
> Reporter: carlmartin
>
> When I executed some SQL statements periodically on a long-running
> thriftserver, I found the disk device became full after about one week.
> Checking the files on Linux, I found many shuffle files left in the
> block-mgr dir whose shuffle stages had finished long ago.
> Finally I found that when shuffle files need to be cleaned, the driver
> notifies each executor to do the ShuffleClean. But when dynamic schedule is
> enabled, an executor may already have shut itself down, so it cannot clean
> its shuffle files, and the files are left behind.
> I tested this on Spark 1.5, but the master branch must have this issue too.
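
For reference, the scenario described above corresponds to a dynamic-allocation setup roughly like the following. This is a minimal sketch, not taken from the report: the property names are real Spark settings, but the specific timeout value is illustrative.

```properties
# Allow executors to be added and removed based on load.
spark.dynamicAllocation.enabled=true

# Required with dynamic allocation: the external shuffle service keeps
# serving an executor's shuffle files after that executor exits, so the
# files remain on disk for the lifetime of the application.
spark.shuffle.service.enabled=true

# Idle executors are removed after this timeout (value is illustrative);
# in a long-running thriftserver the application never ends, so shuffle
# files from removed executors can accumulate as described above.
spark.dynamicAllocation.executorIdleTimeout=60s
```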
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)