[ https://issues.apache.org/jira/browse/SPARK-24143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16460548#comment-16460548 ]
Apache Spark commented on SPARK-24143:
--------------------------------------

User 'jinxing64' has created a pull request for this issue:
https://github.com/apache/spark/pull/21212

> filter empty blocks when convert mapstatus to (blockId, size) pair
> ------------------------------------------------------------------
>
>                 Key: SPARK-24143
>                 URL: https://issues.apache.org/jira/browse/SPARK-24143
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.3.0
>            Reporter: jin xing
>            Priority: Major
>
> In the current code (MapOutputTracker.convertMapStatuses), each MapStatus is
> converted to (blockId, size) pairs for all blocks, whether or not a block is
> empty. This can result in OOM when there are many consecutive empty blocks,
> especially when adaptive execution is enabled.
> The (blockId, size) pairs are only used in ShuffleBlockFetcherIterator to
> control shuffle reads, and requests are sent only for non-empty blocks. Can we
> filter out the empty blocks in MapOutputTracker.convertMapStatuses to save
> memory?

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
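
The filtering proposed in the issue can be sketched roughly as follows. This is a minimal Python illustration of the idea, not the Scala code in MapOutputTracker and not the change made in the pull request. The function name `convert_map_statuses`, the `skip_empty` flag, the `(location, sizes)` tuple used in place of a real MapStatus, and the executor names are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a MapStatus: the location holding one map task's
# output, plus the size in bytes of the block for each reduce partition
# (0 means the block is empty).
Status = Tuple[str, List[int]]


def convert_map_statuses(
    shuffle_id: int,
    start_partition: int,
    end_partition: int,
    statuses: List[Status],
    skip_empty: bool = True,
) -> Dict[str, List[Tuple[str, int]]]:
    """Group (block_id, size) pairs by the location that holds the blocks.

    With skip_empty=True, zero-sized blocks are dropped before a pair is
    ever allocated, which is the memory saving the issue asks about.
    """
    by_location: Dict[str, List[Tuple[str, int]]] = {}
    for map_id, (location, sizes) in enumerate(statuses):
        for reduce_id in range(start_partition, end_partition):
            size = sizes[reduce_id]
            if skip_empty and size == 0:
                continue  # nothing would be fetched for this block anyway
            # Illustrative block naming for this sketch.
            block_id = f"shuffle_{shuffle_id}_{map_id}_{reduce_id}"
            by_location.setdefault(location, []).append((block_id, size))
    return by_location


# Two map outputs, four reduce partitions, mostly empty blocks.
statuses = [
    ("exec-1", [0, 0, 120, 0]),
    ("exec-2", [0, 35, 0, 0]),
]
filtered = convert_map_statuses(0, 0, 4, statuses)
unfiltered = convert_map_statuses(0, 0, 4, statuses, skip_empty=False)

print(sum(len(pairs) for pairs in filtered.values()), "pairs with filtering")
print(sum(len(pairs) for pairs in unfiltered.values()), "pairs without filtering")
```

In this toy input, filtering keeps 2 of the 8 possible pairs. With many map outputs and many mostly-empty reduce partitions, as the issue describes for adaptive execution, the unfiltered list grows with the product of the two counts, while the filtered list grows only with the number of non-empty blocks.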