Hello,

I've been seeing the following errors when trying to save to S3:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 4058 in stage 2.1 failed 4 times, most recent failure: Lost task 4058.3 in stage 2.1 (TID 12572, ip-10-81-151-40.ec2.internal): java.io.FileNotFoundException: /mnt/spark/spark-local-20140827191008-05ae/0c/shuffle_1_7570_5768 (No space left on device)
        java.io.FileOutputStream.open(Native Method)
        java.io.FileOutputStream.<init>(FileOutputStream.java:221)
        org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:107)
        org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:175)
        org.apache.spark.shuffle.hash.HashShuffleWriter$$anonfun$write$1.apply(HashShuffleWriter.scala:67)
        org.apache.spark.shuffle.hash.HashShuffleWriter$$anonfun$write$1.apply(HashShuffleWriter.scala:65)
        scala.collection.Iterator$class.foreach(Iterator.scala:727)
        scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        org.apache.spark.shuffle.hash.HashShuffleWriter.write(HashShuffleWriter.scala:65)
        org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        org.apache.spark.scheduler.Task.run(Task.scala:54)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:745)
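
For context, the failing stage is the shuffle that feeds the S3 save. A minimal sketch of that kind of job is below (the bucket, paths, and field handling are placeholders, not my actual code):

import org.apache.spark.{SparkConf, SparkContext}

object SaveToS3 {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("save-to-s3"))

    // The wide transformation (reduceByKey) forces a shuffle; the
    // "No space left on device" error is thrown while the map side
    // writes its shuffle files to the local spark dir.
    val counts = sc.textFile("s3n://my-bucket/input/")       // placeholder input path
      .map(line => (line.split(",")(0), 1L))
      .reduceByKey(_ + _)

    counts.saveAsTextFile("s3n://my-bucket/output/")          // placeholder output path
    sc.stop()
  }
}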

df tells me there is plenty of space left on the worker node:
[root@ip-10-81-151-40 ~]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/xvda1            7.9G  4.6G  3.3G  59% /
tmpfs                 7.4G     0  7.4G   0% /dev/shm
/dev/xvdb              37G   11G   25G  30% /mnt
/dev/xvdf              37G  9.5G   26G  27% /mnt2

Any suggestions?
Dan
