I have a Spark standalone cluster with 512 MB for the master and 12 GB for each worker. I start spark-shell with 10 GB and spark.task.maxFailures=99999.
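For context, the same settings expressed through SparkConf in a standalone app would look roughly like this (a sketch only; the master URL and app name are placeholders, and in my case the values are passed when launching spark-shell):

    import org.apache.spark.{SparkConf, SparkContext}

    // Rough equivalent of the shell configuration described above.
    val conf = new SparkConf()
      .setMaster("spark://<master-host>:7077")   // standalone cluster master (hostname assumed)
      .setAppName("repro")
      .set("spark.executor.memory", "10g")       // memory requested per executor
      .set("spark.task.maxFailures", "99999")    // retry failed tasks effectively without limit

    val sc = new SparkContext(conf)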
I run a job that plans 3105 tasks. 3104 of the 3105 tasks run OK, then tasks start failing, and after about 36K failed tasks I get the following. Is the master running out of memory?

Exception in thread "DAGScheduler" java.lang.OutOfMemoryError: GC overhead limit exceeded
Jan 23, 2014 6:33:36 PM org.jboss.netty.channel.socket.nio.AbstractNioWorker
WARNING: Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: GC overhead limit exceeded
Uncaught error from thread [spark-11] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[spark]
java.lang.OutOfMemoryError: GC overhead limit exceeded
Jan 23, 2014 6:36:43 PM org.jboss.netty.channel.socket.nio.AbstractNioWorker
WARNING: Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: GC overhead limit exceeded
Jan 23, 2014 6:36:36 PM org.jboss.netty.channel.socket.nio.AbstractNioWorker
WARNING: Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: GC overhead limit exceeded
