[
https://issues.apache.org/jira/browse/SPARK-17233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
carlmartin updated SPARK-17233:
-------------------------------
Summary: Shuffle file will be left over the capacity of disk when dynamic
schedule is enabled in a long running case. (was: Shuffle file will be left
over the capacity when dynamic schedule is enabled in a long running case.)
> Shuffle file will be left over the capacity of disk when dynamic schedule is
> enabled in a long running case.
> ------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-17233
> URL: https://issues.apache.org/jira/browse/SPARK-17233
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.5.2, 1.6.2, 2.0.0
> Reporter: carlmartin
>
> When I executed some SQL statements periodically on a long-running
> thriftserver, I found the disk device became full after about one week.
> Checking the files on Linux, I found many shuffle files left in the
> block-mgr dir whose shuffle stages had finished long ago.
> Finally I found that when shuffle files need to be cleaned, the driver
> notifies each executor to do the ShuffleClean. But when dynamic schedule is
> enabled, an executor may already have shut itself down, so it cannot clean
> its shuffle files, and the files are left behind.
> I tested this on Spark 1.5, but the master branch must have this issue too.
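
For reference, the scenario described above corresponds to a dynamic-allocation setup roughly like the following. This is a minimal sketch, not taken from the report: the property names are real Spark settings, but the specific timeout value is illustrative.

```properties
# Allow executors to be added and removed based on load.
spark.dynamicAllocation.enabled=true

# Required with dynamic allocation: the external shuffle service keeps
# serving an executor's shuffle files after that executor exits, so the
# files remain on disk for the lifetime of the application.
spark.shuffle.service.enabled=true

# Idle executors are removed after this timeout (value is illustrative);
# in a long-running thriftserver the application never ends, so shuffle
# files from removed executors can accumulate as described above.
spark.dynamicAllocation.executorIdleTimeout=60s
```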
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)