Hi Dan,

Spark will clean up the temp files after a run (IIRC), so you won’t see the 
drive out of space once the run completes. In any case, by default, Spark puts 
shuffle files in /tmp/ (this is controlled by the spark.local.dir parameter). 
I assume you’re running on EC2? You’ll probably want to override 
spark.local.dir to point at one of the /mnt*/ drives, which have much more 
free space than the default shuffle directory.
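For example (a sketch — the /mnt/spark and /mnt2/spark directories and the app/jar names are just placeholders based on your df output; adjust to your layout), you can set this either in conf/spark-defaults.conf or at submit time:

```shell
# In conf/spark-defaults.conf — a comma-separated list spreads shuffle
# files across multiple local disks:
#   spark.local.dir  /mnt/spark,/mnt2/spark

# Or equivalently on the spark-submit command line
# (YourApp and your-app.jar are hypothetical):
spark-submit \
  --conf spark.local.dir=/mnt/spark,/mnt2/spark \
  --class com.example.YourApp \
  your-app.jar
```

Note that executors read this setting at startup, so it won’t take effect on a running cluster, and IIRC a SPARK_LOCAL_DIRS environment variable set on the workers takes precedence over the per-application setting.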

Regards,

Frank Austin Nothaft
fnoth...@berkeley.edu
fnoth...@eecs.berkeley.edu
202-340-0466

On Aug 27, 2014, at 4:12 PM, Daniil Osipov <daniil.osi...@shazam.com> wrote:

> Hello,
> 
> I've been seeing the following errors when trying to save to S3:
> 
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due 
> to stage failure: Task 4058 in stage 2.1 failed 4 times, most recent 
> failure: Lost task 4058.3 in stage 2.1 (TID 12572, 
> ip-10-81-151-40.ec2.internal): java.io.FileNotFoundException: 
> /mnt/spark/spark-local-20140827191008-05ae/0c/shuffle_1_7570_5768 (No space 
> left on device)
>         java.io.FileOutputStream.open(Native Method)
>         java.io.FileOutputStream.<init>(FileOutputStream.java:221)
>         org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:107)
>         org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:175)
>         org.apache.spark.shuffle.hash.HashShuffleWriter$$anonfun$write$1.apply(HashShuffleWriter.scala:67)
>         org.apache.spark.shuffle.hash.HashShuffleWriter$$anonfun$write$1.apply(HashShuffleWriter.scala:65)
>         scala.collection.Iterator$class.foreach(Iterator.scala:727)
>         scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>         org.apache.spark.shuffle.hash.HashShuffleWriter.write(HashShuffleWriter.scala:65)
>         org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>         org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         org.apache.spark.scheduler.Task.run(Task.scala:54)
>         org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
>         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         java.lang.Thread.run(Thread.java:745)
> 
> df tells me there is plenty of space left on the worker node:
> [root@ip-10-81-151-40 ~]$ df -h
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/xvda1            7.9G  4.6G  3.3G  59% /
> tmpfs                 7.4G     0  7.4G   0% /dev/shm
> /dev/xvdb              37G   11G   25G  30% /mnt
> /dev/xvdf              37G  9.5G   26G  27% /mnt2
> 
> Any suggestions?
> Dan
