[ 
https://issues.apache.org/jira/browse/SPARK-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomasz Dudziak updated SPARK-5836:
----------------------------------
    Summary: Highlight in Spark documentation that by default Spark does not 
delete its temporary files  (was: Highlight in Spark documentation that by 
default it does not delete its temporary files)

> Highlight in Spark documentation that by default Spark does not delete its 
> temporary files
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-5836
>                 URL: https://issues.apache.org/jira/browse/SPARK-5836
>             Project: Spark
>          Issue Type: Improvement
>          Components: Documentation
>            Reporter: Tomasz Dudziak
>
> We recently learnt the hard way (in a prod system) that Spark by default does 
> not delete its temporary files until it is stopped. Within a relatively short 
> period of heavy Spark use, the disk of our prod machine filled up 
> completely because of the many shuffle files written to it. We think the 
> documentation should make it clearer that a finished job leaves a lot of 
> temporary files behind, so that this does not come as a surprise.
> A good place to highlight this would probably be the documentation of the 
> {{spark.local.dir}} property, which controls where Spark's temporary files are 
> written.
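
To see how much space is accumulating under the directory that {{spark.local.dir}} points to, something like the following could be run periodically. This is only a minimal sketch using the Python standard library: the function names are hypothetical, it does not call any Spark API, and it only reports sizes without deleting anything.

```python
import os


def dir_size_bytes(root):
    """Return the total size in bytes of all files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are not followed
                total += os.lstat(path).st_size
            except OSError:
                # the file disappeared while we were scanning; skip it
                pass
    return total


def local_dir_usage(local_dir):
    """Return (name, size_in_bytes) for each top-level subdirectory of
    local_dir, largest first."""
    sizes = {}
    with os.scandir(local_dir) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes[entry.name] = dir_size_bytes(entry.path)
    return sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)
```

For example, local_dir_usage("/tmp") lists the subdirectories of /tmp (the documented default of {{spark.local.dir}}) by size, which should make it easier to spot directories that keep growing while an application is running.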



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
