Thanks Saisai, I think I will just try lowering my spark.cleaner.ttl value
- I've set it to an hour.
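
In case it's useful, roughly what that change looks like on my end (a minimal
sketch, assuming a SparkConf-based setup):

    import org.apache.spark.SparkConf

    val conf = new SparkConf().set("spark.cleaner.ttl", "3600")  // one hour, in seconds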


On Thu, Jun 12, 2014 at 7:32 PM, Shao, Saisai <saisai.s...@intel.com> wrote:

>  Hi Michael,
>
>
>
> I think you can set spark.cleaner.ttl=xxx to enable the time-based metadata
> cleaner, which will clean up old, unused shuffle data once it has timed out.
>
>
>
> For Spark 1.0, another way is to clean shuffle data using weak references
> (reference-tracking based; the configuration is
> spark.cleaner.referenceTracking), and it is enabled by default.
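>
> For example, roughly how both could be set through SparkConf (a sketch only,
> with illustrative values; the reference-tracking cleaner already defaults to
> true, so setting it explicitly is just for clarity):
>
>     import org.apache.spark.{SparkConf, SparkContext}
>
>     val conf = new SparkConf()
>       .setAppName("shuffle-cleanup-example")           // hypothetical app name
>       .set("spark.cleaner.ttl", "1800")                // time-based cleaner, value in seconds
>       .set("spark.cleaner.referenceTracking", "true")  // reference-tracking cleaner (default: true)
>     val sc = new SparkContext(conf)                    // master assumed to come from the launch environment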
>
>
>
> Thanks
>
> Saisai
>
>
>
> From: Michael Chang [mailto:m...@tellapart.com]
> Sent: Friday, June 13, 2014 10:15 AM
> To: user@spark.apache.org
> Subject: Re: Spilled shuffle files not being cleared
>
>
>
> Bump
>
>
>
> On Mon, Jun 9, 2014 at 3:22 PM, Michael Chang <m...@tellapart.com> wrote:
>
> Hi all,
>
>
>
> I'm seeing exceptions like the one below in Spark 0.9.1.  It looks
> like I'm running out of inodes on my machines (I have around 300k each in a
> 12-machine cluster).  I took a quick look and I'm seeing some shuffle spill
> files that are still around even 12 minutes after they are created.  Can
> someone help me understand when these shuffle spill files should be cleaned
> up? (Is it as soon as they are used?)
>
>
>
> Thanks,
>
> Michael
>
>
>
>
>
> java.io.FileNotFoundException: /mnt/var/hadoop/1/yarn/local/usercache/ubuntu/appcache/application_1399886706975_13107/spark-local-20140609210947-19e1/1c/shuffle_41637_3_0 (No space left on device)
>         at java.io.FileOutputStream.open(Native Method)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
>         at org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:118)
>         at org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:179)
>         at org.apache.spark.scheduler.ShuffleMapTask$$anonfun$runTask$1.apply(ShuffleMapTask.scala:164)
>         at org.apache.spark.scheduler.ShuffleMapTask$$anonfun$runTask$1.apply(ShuffleMapTask.scala:161)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
>         at org.apache.spark.scheduler.Task.run(Task.scala:53)
>         at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:211)
>         at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:42)
>         at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:41)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>         at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:41)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
>
> 14/06/09 22:07:36 WARN TaskSetManager: Lost TID 667432 (task 86909.0:7)
> 14/06/09 22:07:36 WARN TaskSetManager: Loss was due to java.io.FileNotFoundException
>
>
