Spark version 1.4.0 in the Standalone mode

2015-07-09 20:12:02 INFO  (sparkDriver-akka.actor.default-dispatcher-3) BlockManagerInfo:59 - Added rdd_0_0 on disk on localhost:51132 (size: 29.8 GB)
2015-07-09 20:12:02 ERROR (Executor task launch worker-0) Executor:96 - Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:836)
        at org.apache.spark.storage.DiskStore$$anonfun$getBytes$2.apply(DiskStore.scala:125)
        at org.apache.spark.storage.DiskStore$$anonfun$getBytes$2.apply(DiskStore.scala:113)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1285)
        at org.apache.spark.storage.DiskStore.getBytes(DiskStore.scala:127)
        at org.apache.spark.storage.DiskStore.getBytes(DiskStore.scala:134)
        at org.apache.spark.storage.BlockManager.doGetLocal(BlockManager.scala:509)
        at org.apache.spark.storage.BlockManager.getLocal(BlockManager.scala:427)
        at org.apache.spark.storage.BlockManager.get(BlockManager.scala:615)
        at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:154)
        at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:242)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:242)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:242)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:70)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
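For context on the trace above: the exception originates in FileChannelImpl.map(), which cannot memory-map a region larger than Integer.MAX_VALUE (about 2 GB), while the log shows a single cached partition of 29.8 GB (rdd_0_0). A common workaround (my sketch, not confirmed by this thread) is to repartition the RDD before persisting so that no single block exceeds the 2 GB limit. The helper below only does the sizing arithmetic; PartitionSizing and minPartitions are hypothetical names, and the rdd.repartition(...) call in the comment assumes a standard Spark RDD:

```java
// Sketch: compute how many partitions keep every cached block under the
// ~2 GB mmap limit that triggers "Size exceeds Integer.MAX_VALUE".
// In Spark you would then call e.g. rdd.repartition(n) before persisting
// (assuming data distributes roughly evenly across partitions).
public class PartitionSizing {
    // FileChannelImpl.map() rejects regions larger than Integer.MAX_VALUE.
    static final long MAX_BLOCK_BYTES = Integer.MAX_VALUE;

    // Minimum partition count so each block is at most MAX_BLOCK_BYTES.
    static int minPartitions(long totalBytes) {
        return (int) ((totalBytes + MAX_BLOCK_BYTES - 1) / MAX_BLOCK_BYTES);
    }

    public static void main(String[] args) {
        // The 29.8 GB block reported in the log above.
        long blockBytes = (long) (29.8 * 1024L * 1024L * 1024L);
        System.out.println(minPartitions(blockBytes)); // prints 15
    }
}
```

In practice you would likely use more partitions than this minimum, since skewed data can still produce individual blocks over 2 GB.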
On 9 July 2015 at 18:11, Ted Yu <yuzhih...@gmail.com> wrote:
> Which release of Spark are you using ?
>
> Can you show the complete stack trace ?
>
> getBytes() could be called from:
>     getBytes(file, 0, file.length)
> or:
>     getBytes(segment.file, segment.offset, segment.length)
>
> Cheers
>
> On Thu, Jul 9, 2015 at 2:50 PM, Michal Čizmazia <mici...@gmail.com> wrote:
>
>> Please could anyone give me pointers for an appropriate SparkConf to work
>> around "Size exceeds Integer.MAX_VALUE"?
>>
>> Stacktrace:
>>
>> 2015-07-09 20:12:02 INFO (sparkDriver-akka.actor.default-dispatcher-3)
>> BlockManagerInfo:59 - Added rdd_0_0 on disk on localhost:51132 (size: 29.8 GB)
>> 2015-07-09 20:12:02 ERROR (Executor task launch worker-0) Executor:96 -
>> Exception in task 0.0 in stage 0.0 (TID 0)
>> java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE
>> at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:836)
>> at org.apache.spark.storage.DiskStore$$anonfun$getBytes$2.apply(DiskStore.scala:125)
>> ...