[GitHub] spark pull request #19316: [SPARK-22097][CORE]Call serializationStream.close...
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19316#discussion_r140408246

--- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -387,11 +387,18 @@ private[spark] class MemoryStore(
       // the block's actual memory usage has exceeded the unroll memory by a small amount, so we
       // perform one final call to attempt to allocate additional memory if necessary.
       if (keepUnrolling) {
-        serializationStream.close()
-        reserveAdditionalMemoryIfNecessary()
+        serializationStream.flush()
+        if (bbos.size > unrollMemoryUsedByThisBlock) {
+          val amountToRequest = bbos.size - unrollMemoryUsedByThisBlock
+          keepUnrolling = reserveUnrollMemoryForThisTask(blockId, amountToRequest, memoryMode)
+          if (keepUnrolling) {
+            unrollMemoryUsedByThisBlock += amountToRequest
+          }
+        }
       }

       if (keepUnrolling) {
+        serializationStream.close()
--- End diff --

Here, we should close the `serializationStream` only after we have checked the memory again. The previous code closed it first and then requested the extra memory, so there is a potential problem: we may fail to request enough memory while the `serializationStream` is already closed.

---

- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
  For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19316: [SPARK-22097][CORE]Call serializationStream.close...
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19316#discussion_r140408116

--- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -387,11 +387,18 @@ private[spark] class MemoryStore(
       // the block's actual memory usage has exceeded the unroll memory by a small amount, so we
       // perform one final call to attempt to allocate additional memory if necessary.
       if (keepUnrolling) {
-        serializationStream.close()
-        reserveAdditionalMemoryIfNecessary()
+        serializationStream.flush()
+        if (bbos.size > unrollMemoryUsedByThisBlock) {
+          val amountToRequest = bbos.size - unrollMemoryUsedByThisBlock
--- End diff --

Here, we only need to request `bbos.size - unrollMemoryUsedByThisBlock`. I'm sorry, this mistake may have been introduced by my previous patch [SPARK-21923](https://github.com/apache/spark/pull/19135).
[GitHub] spark pull request #19316: [SPARK-22097][CORE]Call serializationStream.close...
GitHub user ConeyLiu opened a pull request: https://github.com/apache/spark/pull/19316

[SPARK-22097][CORE] Call serializationStream.close after we requested enough memory

## What changes were proposed in this pull request?

In the current code, we close the `serializationStream` after we have unrolled the block. However, there is a potential problem: the size of the underlying vector or stream may be larger than the memory we requested. So here, we need to check it again carefully before closing.

## How was this patch tested?

Existing UT.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ConeyLiu/spark putIteratorAsBytes

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19316.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #19316

commit bfe162e3aad300414dcc3fe25a3d70025e1795dd
Author: Xianyang Liu
Date: 2017-09-22T03:29:39Z

    close the serializationStream after check the memory request
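The ordering being fixed in this PR (flush, measure the actual bytes written, reserve any shortfall, and only then close the stream) can be sketched in isolation. This is a minimal model, not the actual Spark `MemoryStore` code: `finishUnroll`, `bytesWritten`, and `tryReserve` are simplified, hypothetical stand-ins for `bbos.size`, `unrollMemoryUsedByThisBlock`, and `reserveUnrollMemoryForThisTask`.

```scala
// Minimal sketch of the corrected ordering discussed in this PR.
// All names here are simplified stand-ins, not Spark's real API.
object UnrollSketch {
  // Returns (keepUnrolling, totalMemoryReserved) after the final size check.
  def finishUnroll(
      bytesWritten: Long,          // actual size after flushing the stream
      memoryReserved: Long,        // unroll memory reserved so far
      tryReserve: Long => Boolean  // stand-in for requesting more unroll memory
  ): (Boolean, Long) = {
    var keepUnrolling = true
    var reserved = memoryReserved
    // Request any shortfall BEFORE closing the serialization stream,
    // so a failed reservation does not leave us with a closed stream.
    if (bytesWritten > reserved) {
      val amountToRequest = bytesWritten - reserved
      keepUnrolling = tryReserve(amountToRequest)
      if (keepUnrolling) reserved += amountToRequest
    }
    // Only once keepUnrolling is still true would the caller close the stream.
    (keepUnrolling, reserved)
  }

  def main(args: Array[String]): Unit = {
    // 24-byte shortfall, reservation succeeds: safe to close the stream.
    println(finishUnroll(1024L, 1000L, _ => true))   // (true,1024)
    // Reservation fails: the caller must not close/commit the stream.
    println(finishUnroll(1024L, 1000L, _ => false))  // (false,1000)
  }
}
```

The key point the review comments make is captured by the failure branch: with the old code the stream was already closed when the extra reservation failed, whereas here the close is deferred until the reservation has succeeded.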