GitHub user caneGuy opened a pull request: https://github.com/apache/spark/pull/20292

    [SPARK-23129][CORE] Make deserializeStream of DiskMapIterator init lazily

    ## What changes were proposed in this pull request?

    Currently, the deserializeStream in ExternalAppendOnlyMap#DiskMapIterator is initialized when the DiskMapIterator instance is created. This causes memory overhead when ExternalAppendOnlyMap spills many times.
    We can avoid this by initializing deserializeStream the first time it is used.
    This patch makes deserializeStream initialize lazily.

    ## How was this patch tested?

    Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/caneGuy/spark zhoukang/lay-diskmapiterator

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20292.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20292

----
commit d2bbbe1677202ae73046f12573d96a07e3deeb31
Author: zhoukang <zhoukang199191@...>
Date:   2018-01-17T09:33:07Z

    [SPARK][CORE] Make deserializeStream of DiskMapIterator init lazily

----
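
For readers unfamiliar with the pattern, the lazy initialization proposed in this pull request can be illustrated with a small, language-neutral sketch. This is not Spark code and not the patch itself: `LazySpillReader`, `make_opener`, and the in-memory "streams" are hypothetical names invented for illustration. The sketch only shows that deferring stream creation from the constructor to the first read means no stream is held for iterators that have not been read yet.

```python
class LazySpillReader:
    """Hypothetical stand-in for an iterator over one spill file."""

    def __init__(self, open_stream):
        # Keep only the factory; no stream (and no read buffer) exists yet.
        self._open_stream = open_stream
        self._stream = None

    @property
    def is_open(self):
        return self._stream is not None

    def __iter__(self):
        return self

    def __next__(self):
        # Open the stream on the first read instead of in the constructor.
        if self._stream is None:
            self._stream = self._open_stream()
        return next(self._stream)


opened = []  # records which hypothetical spill files have been opened


def make_opener(spill_id):
    """Return a factory that 'opens' a fake spill file with three records."""
    def open_stream():
        opened.append(spill_id)
        return iter([(spill_id, n) for n in range(3)])
    return open_stream


# Simulate many spills: one reader per spill, created up front.
readers = [LazySpillReader(make_opener(i)) for i in range(1000)]
streams_open_after_construction = len(opened)

# Only the reader that is actually consumed opens its stream.
first_item = next(readers[0])
streams_open_after_first_read = len(opened)
```

With eager initialization, all 1000 hypothetical streams would be opened while the readers are constructed; here each stream is created only when its reader is first consumed.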