[GitHub] spark pull request #19316: [SPARK-22097][CORE]Call serializationStream.close...

2017-09-21 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request:

https://github.com/apache/spark/pull/19316#discussion_r140408246
  
--- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -387,11 +387,18 @@ private[spark] class MemoryStore(
       // the block's actual memory usage has exceeded the unroll memory by a small amount, so we
       // perform one final call to attempt to allocate additional memory if necessary.
       if (keepUnrolling) {
-        serializationStream.close()
-        reserveAdditionalMemoryIfNecessary()
+        serializationStream.flush()
+        if (bbos.size > unrollMemoryUsedByThisBlock) {
+          val amountToRequest = bbos.size - unrollMemoryUsedByThisBlock
+          keepUnrolling = reserveUnrollMemoryForThisTask(blockId, amountToRequest, memoryMode)
+          if (keepUnrolling) {
+            unrollMemoryUsedByThisBlock += amountToRequest
+          }
+        }
       }
 
       if (keepUnrolling) {
+        serializationStream.close()
--- End diff --

Here, we should close the `serializationStream` only after we have checked the memory again. In the previous code we closed it first and then requested the extra memory, so there was a potential problem: we might fail to request enough memory while the `serializationStream` was already closed.
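
To make the ordering concrete, here is a small, self-contained model of the fixed flow. It is only a sketch: plain JDK streams and a toy `reserveUnrollMemory` stand in for `bbos`, `serializationStream`, and `reserveUnrollMemoryForThisTask` from the diff above, and all sizes are made up.

```scala
import java.io.{BufferedOutputStream, ByteArrayOutputStream, OutputStream}

object CloseAfterReserveSketch {
  // Toy stand-in for MemoryStore.reserveUnrollMemoryForThisTask: pretend the
  // memory manager can grant at most 1 KiB more.
  def reserveUnrollMemory(amount: Long): Boolean = amount <= 1024L

  def main(args: Array[String]): Unit = {
    val bbos = new ByteArrayOutputStream()                      // stands in for the chunked output stream
    val serializationStream: OutputStream = new BufferedOutputStream(bbos)
    var unrollMemoryUsedByThisBlock = 16L
    var keepUnrolling = true

    serializationStream.write(Array.fill[Byte](32)(0))
    serializationStream.flush()                                 // flush, not close: bbos.size is now accurate

    if (bbos.size > unrollMemoryUsedByThisBlock) {
      val amountToRequest = bbos.size - unrollMemoryUsedByThisBlock
      keepUnrolling = reserveUnrollMemory(amountToRequest)      // if this fails, the stream is still usable
      if (keepUnrolling) unrollMemoryUsedByThisBlock += amountToRequest
    }
    if (keepUnrolling) {
      serializationStream.close()                               // close only once the reservation succeeded
    }
    println(s"keepUnrolling=$keepUnrolling, used=$unrollMemoryUsedByThisBlock")
  }
}
```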


---




[GitHub] spark pull request #19316: [SPARK-22097][CORE]Call serializationStream.close...

2017-09-21 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request:

https://github.com/apache/spark/pull/19316#discussion_r140408116
  
--- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -387,11 +387,18 @@ private[spark] class MemoryStore(
       // the block's actual memory usage has exceeded the unroll memory by a small amount, so we
       // perform one final call to attempt to allocate additional memory if necessary.
       if (keepUnrolling) {
-        serializationStream.close()
-        reserveAdditionalMemoryIfNecessary()
+        serializationStream.flush()
+        if (bbos.size > unrollMemoryUsedByThisBlock) {
+          val amountToRequest = bbos.size - unrollMemoryUsedByThisBlock
--- End diff --

Here, we only need to request `bbos.size - unrollMemoryUsedByThisBlock`. I'm sorry, this mistake may have been introduced by my previous patch [SPARK-21923](https://github.com/apache/spark/pull/19135).
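
A quick worked example with made-up numbers shows why the delta matters:

```scala
// Hypothetical sizes: the flushed stream holds 1200 bytes, but only 1024
// bytes of unroll memory are currently reserved for this block.
val bbosSize = 1200L
val unrollMemoryUsedByThisBlock = 1024L

// Request only the 176-byte shortfall; requesting bbos.size again would
// double-count the memory that is already reserved.
val amountToRequest = bbosSize - unrollMemoryUsedByThisBlock
assert(amountToRequest == 176L)
```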


---




[GitHub] spark pull request #19316: [SPARK-22097][CORE]Call serializationStream.close...

2017-09-21 Thread ConeyLiu
GitHub user ConeyLiu opened a pull request:

https://github.com/apache/spark/pull/19316

[SPARK-22097][CORE]Call serializationStream.close after we requested enough memory

## What changes were proposed in this pull request?


In the current code, we close the `serializationStream` after the block has been unrolled. However, there is a potential problem: the size of the underlying vector or stream may be larger than the memory we requested. So we need to check it again carefully before closing, as the sketch below illustrates.
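
A minimal sketch of the issue, using plain JDK streams as hypothetical stand-ins for the serializer and its underlying buffer: bytes still sitting in the serializer's buffer are invisible to the byte count until the stream is flushed, so the true footprint can exceed what was reserved against an earlier, smaller measurement.

```scala
import java.io.{BufferedOutputStream, ByteArrayOutputStream}

object FlushBeforeMeasuring {
  def main(args: Array[String]): Unit = {
    val underlying = new ByteArrayOutputStream()            // stands in for the unroll buffer (bbos)
    val stream = new BufferedOutputStream(underlying, 64)   // stands in for serializationStream

    stream.write(Array.fill[Byte](40)(0))
    println(s"visible before flush: ${underlying.size}")    // prints 0: bytes still sit in the buffer

    stream.flush()
    println(s"visible after flush:  ${underlying.size}")    // prints 40: the real footprint appears

    stream.close()
  }
}
```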

## How was this patch tested?

Existing UT.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ConeyLiu/spark putIteratorAsBytes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19316.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19316


commit bfe162e3aad300414dcc3fe25a3d70025e1795dd
Author: Xianyang Liu 
Date:   2017-09-22T03:29:39Z

close the serializationStream after check the memory request




---
