Hi, I am consistently observing a driver OutOfMemoryError (Java heap space) during the shuffle phase, as indicated by the following log:
…………
16/05/14 21:57:03 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 2 is 36060250 bytes   <-- shuffle metadata size is big, and the full metadata will be sent to all workers?
16/05/14 21:57:06 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to <host1>:45757
16/05/14 21:57:06 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to <host2>:20300
16/05/14 21:57:06 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to <host3>:12389
16/05/14 21:57:06 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to <host4>:32197
…………
Exception in thread "dispatcher-event-loop-17"
Exception in thread "dispatcher-event-loop-3"
Exception in thread "dispatcher-event-loop-6"
16/05/14 21:59:04 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to <host5>:19639
Exception in thread "dispatcher-event-loop-21"
16/05/14 21:59:08 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to <host6>:58461
Exception in thread "dispatcher-event-loop-20"
Exception in thread "dispatcher-event-loop-13"
Exception in thread "dispatcher-event-loop-9"
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2271)
    at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
    at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:103)   <-- shuffle metadata duplicated (?) when sending to each executor?
    at org.apache.spark.rpc.netty.NettyRpcEnv.serialize(NettyRpcEnv.scala:252)
    at org.apache.spark.rpc.netty.RemoteNettyRpcCallContext.send(NettyRpcCallContext.scala:64)
    at org.apache.spark.rpc.netty.NettyRpcCallContext.reply(NettyRpcCallContext.scala:32)
    at org.apache.spark.MapOutputTrackerMasterEndpoint$$anonfun$receiveAndReply$1.applyOrElse(MapOutputTracker.scala:62)
    at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:104)
    at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
    at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
    at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

I enabled heap dumps on OOM and used jhat to analyze one. In the heap histogram, I found 146 byte-array objects with the exact same size of 36,060,293 bytes. I wonder whether these 146 large objects are in fact duplicates of the same shuffle metadata, *can experts please help me confirm whether that is true?*

(8G of driver memory was specified for the above run, which should be sufficient for the ~36M shuffle metadata, but probably not for 146 duplicates of it.)

thanks,
Renyi.
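P.S. A rough back-of-the-envelope check of my claim above (the constants are the values from the jhat heap histogram; the class name is just for this sketch):

```java
public class ShuffleMetadataEstimate {
    public static void main(String[] args) {
        // Values taken from the jhat heap histogram of the driver dump
        long statusBytes = 36_060_293L; // size of one serialized map-output status byte array
        long copies = 146L;             // identical byte arrays found in the heap

        long totalBytes = statusBytes * copies;
        double totalGiB = totalBytes / (1024.0 * 1024.0 * 1024.0);

        System.out.printf("%d copies x %d bytes = %d bytes (%.2f GiB)%n",
                copies, statusBytes, totalBytes, totalGiB);
        // -> 146 copies x 36060293 bytes = 5264802778 bytes (4.90 GiB)
        // Add JVM object overhead plus the transient ByteArrayOutputStream
        // buffers (Arrays.copyOf in the stack trace doubles the buffer while
        // serializing), and an 8 GB driver heap could plausibly be exhausted.
    }
}
```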