I'm running on YARN with relatively small instances (4 GB of memory each). I'm
not caching any data, but when the map stage ends and shuffling begins, all of
the executors request the map output locations at the same time, which seems to
kill the driver once the number of executors is turned up.
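
In case it helps, this is roughly how the job is configured (the app name is a
placeholder and the executor memory value is approximate, since the instances
only have 4 GB total):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("shuffle-heavy-job")           // placeholder name
      .set("spark.executor.instances", "500")    // turning this up triggers the OOM
      .set("spark.executor.memory", "3g")        // leave headroom for YARN overhead
    val sc = new SparkContext(conf)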

For example, the "size of output statuses" reported is about 10 MB, and with
500 executors the driver appears to make 500 copies of that data (~5 GB in
total) to send out, and runs out of memory. When I turn the number of executors
down, everything runs fine.
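
To put numbers on it, here's the back-of-envelope estimate I'm working from
(the one-serialized-copy-per-executor-request part is my assumption, based on
where the allocation fails):

    // Assumption: the driver serializes the map output statuses once per
    // requesting executor instead of reusing a single buffer.
    val statusSizeBytes = 10L * 1024 * 1024   // ~10 MB "size of output statuses"
    val numExecutors    = 500
    val totalBytes      = statusSizeBytes * numExecutors
    println(f"~${totalBytes / math.pow(1024, 3)}%.1f GB held on the driver")
    // prints ~4.9 GB, which lines up with the heap-space OOM I'm seeing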

Has anyone else run into this? Maybe I'm misunderstanding the underlying
cause. I don't have a copy of the stack trace handy, but I can recreate it if
necessary; the failure was somewhere in the <init> of HeapByteBuffer. Any
advice would be helpful.


