Bump
On Mon, Jun 9, 2014 at 3:22 PM, Michael Chang <m...@tellapart.com> wrote:

> Hi all,
>
> I'm seeing exceptions that look like the below in Spark 0.9.1. It looks
> like I'm running out of inodes on my machines (I have around 300k each in a
> 12 machine cluster). I took a quick look and I'm seeing some shuffle spill
> files that are still around even 12 minutes after they are created. Can
> someone help me understand when these shuffle spill files should be cleaned
> up (is it as soon as they are used)?
>
> Thanks,
> Michael
>
>
> java.io.FileNotFoundException:
> /mnt/var/hadoop/1/yarn/local/usercache/ubuntu/appcache/application_1399886706975_13107/spark-local-20140609210947-19e1/1c/shuffle_41637_3_0
> (No space left on device)
>     at java.io.FileOutputStream.open(Native Method)
>     at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
>     at org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:118)
>     at org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:179)
>     at org.apache.spark.scheduler.ShuffleMapTask$$anonfun$runTask$1.apply(ShuffleMapTask.scala:164)
>     at org.apache.spark.scheduler.ShuffleMapTask$$anonfun$runTask$1.apply(ShuffleMapTask.scala:161)
>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
>     at org.apache.spark.scheduler.Task.run(Task.scala:53)
>     at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:211)
>     at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:42)
>     at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:41)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>     at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:41)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> 14/06/09 22:07:36 WARN TaskSetManager: Lost TID 667432 (task 86909.0:7)
> 14/06/09 22:07:36 WARN TaskSetManager: Loss was due to java.io.FileNotFoundException
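
For anyone trying to reproduce the observation in the quoted message, here is a rough sketch (not part of Spark, and not from the original report) of how one might check inode usage and list shuffle files that have lingered past some age. The function names `inode_usage` and `stale_shuffle_files` are made up for illustration, the `shuffle_` filename prefix is taken from the path in the stack trace, and `os.statvfs` is only available on Unix-like systems.

```python
import os
import time


def inode_usage(path):
    """Return (used, total) inode counts for the filesystem holding `path`."""
    st = os.statvfs(path)  # Unix-only
    return st.f_files - st.f_ffree, st.f_files


def stale_shuffle_files(local_dir, min_age_seconds, now=None):
    """List files named shuffle_* under local_dir at least min_age_seconds old.

    Assumption: shuffle files are recognizable by a "shuffle_" name prefix,
    as in the path shown in the stack trace.
    """
    now = time.time() if now is None else now
    stale = []
    for root, _dirs, files in os.walk(local_dir):
        for name in files:
            if not name.startswith("shuffle_"):
                continue
            full = os.path.join(root, name)
            try:
                age = now - os.stat(full).st_mtime
            except FileNotFoundError:
                continue  # file was removed while we were scanning
            if age >= min_age_seconds:
                stale.append(full)
    return sorted(stale)
```

Pointing `stale_shuffle_files` at a node's spark-local directory with `min_age_seconds=12 * 60` would list files matching the "still around after 12 minutes" description, and `inode_usage` on the same mount shows how close that filesystem is to its inode limit. This only reports what is on disk; it says nothing about when Spark itself is supposed to remove those files.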