[ https://issues.apache.org/jira/browse/SPARK-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16607255#comment-16607255 ]
harel gliksman commented on SPARK-1239:
---------------------------------------

I am hitting a similar error on Spark 2.3.1, running on an EMR cluster of 500 r3.2xlarge instances to process ~40 TB. There are ~200,000 map tasks, which succeed. We use aggregateByKey with numPartitions=300,000, but when the reduce-side tasks start, the driver (which has 50GB memory) fails with:

18/09/07 15:10:49 WARN Utils: Suppressing exception in finally: null
java.lang.OutOfMemoryError
    at java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
    at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117)
    at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
    at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
    at java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:253)
    at java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:211)
    at java.util.zip.GZIPOutputStream.write(GZIPOutputStream.java:145)
    at java.io.ObjectOutputStream$BlockDataOutputStream.writeBlockHeader(ObjectOutputStream.java:1894)
    at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1875)
    at java.io.ObjectOutputStream$BlockDataOutputStream.flush(ObjectOutputStream.java:1822)
    at java.io.ObjectOutputStream.flush(ObjectOutputStream.java:719)
    at java.io.ObjectOutputStream.close(ObjectOutputStream.java:740)
    at org.apache.spark.MapOutputTracker$$anonfun$serializeMapStatuses$2.apply$mcV$sp(MapOutputTracker.scala:790)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1389)
    at org.apache.spark.MapOutputTracker$.serializeMapStatuses(MapOutputTracker.scala:789)
    at org.apache.spark.ShuffleStatus.serializedMapStatus(MapOutputTracker.scala:174)
    at org.apache.spark.MapOutputTrackerMaster$MessageLoop.run(MapOutputTracker.scala:397)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Exception in thread "map-output-dispatcher-0" java.lang.OutOfMemoryError
    at java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
    at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117)
    at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
    at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
    at java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:253)
    at java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:211)
    at java.util.zip.GZIPOutputStream.write(GZIPOutputStream.java:145)
    at java.io.ObjectOutputStream$BlockDataOutputStream.writeBlockHeader(ObjectOutputStream.java:1894)
    at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1875)
    at java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode(ObjectOutputStream.java:1786)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1189)
    at java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1378)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
    at org.apache.spark.MapOutputTracker$$anonfun$serializeMapStatuses$1.apply$mcV$sp(MapOutputTracker.scala:787)
    at org.apache.spark.MapOutputTracker$$anonfun$serializeMapStatuses$1.apply(MapOutputTracker.scala:786)
    at org.apache.spark.MapOutputTracker$$anonfun$serializeMapStatuses$1.apply(MapOutputTracker.scala:786)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1380)
    at org.apache.spark.MapOutputTracker$.serializeMapStatuses(MapOutputTracker.scala:789)
    at org.apache.spark.ShuffleStatus.serializedMapStatus(MapOutputTracker.scala:174)
    at org.apache.spark.MapOutputTrackerMaster$MessageLoop.run(MapOutputTracker.scala:397)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    Suppressed: java.lang.OutOfMemoryError
        at java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
        at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117)
        at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
        at java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:253)
        at java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:211)
        at java.util.zip.GZIPOutputStream.write(GZIPOutputStream.java:145)
        at java.io.ObjectOutputStream$BlockDataOutputStream.writeBlockHeader(ObjectOutputStream.java:1894)
        at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1875)
        at java.io.ObjectOutputStream$BlockDataOutputStream.flush(ObjectOutputStream.java:1822)
        at java.io.ObjectOutputStream.flush(ObjectOutputStream.java:719)
        at java.io.ObjectOutputStream.close(ObjectOutputStream.java:740)
        at org.apache.spark.MapOutputTracker$$anonfun$serializeMapStatuses$2.apply$mcV$sp(MapOutputTracker.scala:790)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1389)
        ... 6 more

> Improve fetching of map output statuses
> ---------------------------------------
>
>                 Key: SPARK-1239
>                 URL: https://issues.apache.org/jira/browse/SPARK-1239
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle, Spark Core
>    Affects Versions: 1.0.2, 1.1.0
>            Reporter: Patrick Wendell
>            Assignee: Thomas Graves
>            Priority: Major
>             Fix For: 2.0.0
>
>
> Instead we should modify the way we fetch map output statuses to take both a
> mapper and a reducer - or we should just piggyback the statuses on each task.
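
A note on reading the trace in the comment above: the error is raised in ByteArrayOutputStream.hugeCapacity, underneath GZIPOutputStream, while MapOutputTracker.serializeMapStatuses is running. A Java byte array cannot hold more than roughly 2 GiB, so this looks like the compressed, serialized map statuses outgrowing a single buffer rather than the 50GB driver heap filling up. The sketch below only does arithmetic on the figures quoted in the comment. The class and method names are hypothetical, the cap of Integer.MAX_VALUE - 8 bytes is an assumption based on the JDK 8 sources, and none of this is Spark's own code or encoding.

```java
// Back-of-the-envelope check using only the figures quoted in the comment.
// All names are hypothetical; this does not model Spark's MapStatus encoding.
final class MapStatusBudgetSketch {
    // Assumption: JDK 8 stops growing a ByteArrayOutputStream near this size.
    static final long MAX_BUFFER_BYTES = Integer.MAX_VALUE - 8L;

    // Average bytes each map task's status may occupy if all statuses
    // must fit into one buffer.
    static long bytesPerMapStatus(long mapTasks) {
        return MAX_BUFFER_BYTES / mapTasks;
    }

    // Size of a plain bitmap holding one bit per reduce partition.
    static long bitmapBytes(long reducePartitions) {
        return (reducePartitions + 7) / 8;
    }

    // True if mapTasks statuses of bytesPerStatus each fit into one buffer.
    static boolean fitsInOneBuffer(long mapTasks, long bytesPerStatus) {
        return mapTasks * bytesPerStatus <= MAX_BUFFER_BYTES;
    }

    public static void main(String[] args) {
        long mapTasks = 200_000L;          // "~200,000 map tasks"
        long reducePartitions = 300_000L;  // aggregateByKey partition count

        long budget = bytesPerMapStatus(mapTasks);
        long bitmap = bitmapBytes(reducePartitions);

        System.out.println("budget per map status (bytes): " + budget);
        System.out.println("one bit per reducer (bytes):   " + bitmap);
        System.out.println("bitmaps fit in one buffer:     "
                + fitsInOneBuffer(mapTasks, bitmap));
    }
}
```

Under these assumptions the budget is about 10.7 KB per map status, while even a single uncompressed bit per reduce partition needs 37,500 bytes per map task. The buffer in the trace holds gzip output, so the real question is how well the statuses compress, which this sketch does not try to estimate.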