[ 
https://issues.apache.org/jira/browse/SPARK-17233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

carlmartin updated SPARK-17233:
-------------------------------
    Description: 
When I execute some SQL statements periodically on a long-running 
Thrift server, the disk device fills up after about one week.
Checking the files on Linux, I found many shuffle files left in the 
block-mgr directory whose shuffle stages had finished long ago.
Eventually I found that when shuffle files need to be cleaned, the driver asks 
each executor to do the ShuffleClean. But when dynamic allocation is enabled, 
an executor may already have shut itself down, so it cannot clean its shuffle 
files, and they are left behind.

I tested this on Spark 1.5, but the master branch most likely has the same issue.
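
For reference, a sketch of the kind of setup under which the leak shows up; the
flag values below are illustrative assumptions, not taken from this report:

```shell
# Start the long-running Thrift server with dynamic allocation enabled.
# Illustrative values only; executors that idle out are removed, which is
# the point at which their shuffle files can be left behind.
./sbin/start-thriftserver.sh \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=1 \
  --conf spark.dynamicAllocation.executorIdleTimeout=60s
```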



  was:
When I execute some sql statement periodically in the long running 
thriftserver, I found the disk device will be full after about one week later.
After check the file on linux, I found so many shuffle files left on the 
block-mgr dir whose shuffle stage had finished long time ago.
Finally I find when it's need to clean shuffle file, driver will total each 
executor to do the ShuffleClean. But when dynamic schedule is enabled, executor 
will be down itself and executor can't clean its shuffle file, then file was 
left.




> Shuffle files are left behind, filling the disk, when dynamic allocation is 
> enabled in a long-running case.
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-17233
>                 URL: https://issues.apache.org/jira/browse/SPARK-17233
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.5.2, 1.6.2, 2.0.0
>            Reporter: carlmartin
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
