It cleans the work dir, and SPARK_LOCAL_DIRS should be cleaned automatically. From the source code comments:

// SPARK_LOCAL_DIRS environment variable, and deleted by the Worker when the
// application finishes.
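For reference, the worker cleanup settings mentioned below can be put in conf/spark-env.sh roughly like this. The TTL and interval values here are illustrative choices, not taken from the thread; pick values that match your retention needs:

```shell
# conf/spark-env.sh
# Enable periodic cleanup of finished applications' directories under
# $SPARK_HOME/work/. appDataTtl is in seconds (604800 s = 7 days,
# an illustrative value); interval controls how often the cleaner runs.
export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true \
  -Dspark.worker.cleanup.interval=1800 \
  -Dspark.worker.cleanup.appDataTtl=604800"
```

Note this covers the work dir; per the comment above, the dirs under SPARK_LOCAL_DIRS are deleted by the Worker itself when the application finishes.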
> On 13.04.2015, at 11:26, Guillaume Pitel <guillaume.pi...@exensa.com> wrote:
>
> Does it also clean up Spark local dirs? I thought it was only cleaning
> $SPARK_HOME/work/
>
> Guillaume
>> I have set SPARK_WORKER_OPTS in spark-env.sh for that. For example:
>>
>> export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true
>> -Dspark.worker.cleanup.appDataTtl=<seconds>"
>>
>>> On 11.04.2015, at 00:01, Wang, Ningjun (LNG-NPV)
>>> <ningjun.w...@lexisnexis.com> wrote:
>>>
>>> Does anybody have an answer for this?
>>>
>>> Thanks
>>> Ningjun
>>>
>>> From: Wang, Ningjun (LNG-NPV)
>>> Sent: Thursday, April 02, 2015 12:14 PM
>>> To: user@spark.apache.org
>>> Subject: Is the disk space in SPARK_LOCAL_DIRS cleaned up?
>>>
>>> I set SPARK_LOCAL_DIRS to C:\temp\spark-temp. When RDDs are shuffled,
>>> Spark writes to this folder. I found that the disk space of this folder
>>> keeps increasing quickly, and at a certain point I will run out of disk space.
>>>
>>> I wonder: does Spark clean up the disk space in this folder once the shuffle
>>> operation is done? If not, I need to write a job to clean it up myself. But
>>> how do I know which subfolders can be removed?
>>>
>>> Ningjun
>
> --
> Guillaume PITEL, Président
> +33(0)626 222 431
>
> eXenSa S.A.S. <http://www.exensa.com/>
> 41, rue Périer - 92120 Montrouge - FRANCE
> Tel +33(0)184 163 677 / Fax +33(0)972 283 705