I'm running PageRank on datasets with different sizes (from 1GB to 100GB).
Sometime my job is aborted showing this error:
Job aborted due to stage failure: Task 0 in stage 4.1 failed 4 times,
most recent failure: Lost task 0.3 in stage 4.1 (TID 2051,
9.12.247.250): java.io.FileNotFoundException:
/tmp/spark-23fcf792-e281-4d0d-a55e-4db2c31e7f10/executor-d4a67fea-3d6b-4e93-ad92-53221fa92f2b/blockmgr-96e4bd04-00ba-4893-a65a-cec09ec2dc52/17/rdd_3_0
(No such file or directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
at org.apache.spark.storage.DiskStore.getBytes(DiskStore.scala:110)
at org.apache.spark.storage.DiskStore.getBytes(DiskStore.scala:134)
at
org.apache.spark.storage.BlockManager.doGetLocal(BlockManager.scala:510)
at
org.apache.spark.storage.BlockManager.getLocal(BlockManager.scala:427)
at org.apache.spark.storage.BlockManager.get(BlockManager.scala:616)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:44)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:242)
at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
I can not find the cause of this, can somone help?
I'm running Spark 1.4.0-rc3.