I have a dataset of about 10 GB. I am using persist(DISK_ONLY) to avoid
out-of-memory issues when running my job.
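
For context, the persist call in my job looks roughly like the sketch below
(the input path, the parsing, and the downstream reduceByKey are placeholders,
not my actual code):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._
import org.apache.spark.storage.StorageLevel

// Rough sketch only -- app name, input path, and the pair transformation
// are illustrative placeholders.
val conf = new SparkConf().setAppName("disk-only-persist")
val sc = new SparkContext(conf)

val records = sc.textFile("hdfs:///path/to/10gb-input")
  .map(line => (line.split("\t")(0), 1L))

// DISK_ONLY keeps the cached partitions out of executor memory; the blocks
// are written under spark.local.dir (here, /tmp/spark-local-*).
records.persist(StorageLevel.DISK_ONLY)

records.reduceByKey(_ + _).count()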

When I run with a smaller dataset of about 1 GB, the job completes successfully.

But when I run with the larger 10 GB dataset, I get the error/stack trace
below, which seems to happen when the RDD is being written out to disk.

Does anyone have any ideas as to what is going on, or whether there is a
setting I can tune?


14/06/09 21:33:55 ERROR executor.Executor: Exception in task ID 560
java.io.FileNotFoundException: /tmp/spark-local-20140609210741-0bb8/14/rdd_331_175 (No such file or directory)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:209)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:160)
        at org.apache.spark.storage.DiskStore.putValues(DiskStore.scala:79)
        at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:698)
        at org.apache.spark.storage.BlockManager.put(BlockManager.scala:546)
        at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:95)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
        at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
        at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:77)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:158)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
        at org.apache.spark.scheduler.Task.run(Task.scala:51)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:679)

-- 

SUREN HIRAMAN, VP TECHNOLOGY
Velos
Accelerating Machine Learning

440 NINTH AVENUE, 11TH FLOOR
NEW YORK, NY 10001
O: (917) 525-2466 ext. 105
F: 646.349.4063
E: suren.hiraman@velos.io
W: www.velos.io
