Github user ConeyLiu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19285#discussion_r163462053
--- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -233,17 +235,13 @@ private[spark] class MemoryStore(
     }
     if (keepUnrolling) {
-      // We successfully unrolled the entirety of this block
-      val arrayValues = vector.toArray
-      vector = null
-      val entry =
-        new DeserializedMemoryEntry[T](arrayValues, SizeEstimator.estimate(arrayValues), classTag)
-      val size = entry.size
+      // We need a more precise value
+      val size = valuesHolder.estimatedSize(false)
--- End diff --
I changed the code back to the original. For `DeserializedValuesHolder`, we
could call `buildEntry` and read the `size` from the resulting `MemoryEntry`.
But for `SerializedValuesHolder` that approach does not work correctly,
because getting the `MemoryEntry` object requires calling
`bbos.toChunkedByteBuffer`. If the reserved memory is not enough to transfer
the unroll memory to storage memory, the unroll fails and we need to call
`bbos.toChunkedByteBuffer` again
([L802](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala#L802);
this appears intentional and is related to #15043). So the problem is that we
would call `bbos.toChunkedByteBuffer` twice, which is not allowed.
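
To make the constraint concrete, here is a minimal sketch of the one-shot
behaviour of `ChunkedByteBufferOutputStream.toChunkedByteBuffer`. The object
name and package placement are illustrative only; the file has to live under
the `org.apache.spark` package because the class is `private[spark]`:

```scala
package org.apache.spark.demo // hypothetical subpackage, needed for private[spark] access

import java.nio.ByteBuffer

import org.apache.spark.util.io.ChunkedByteBufferOutputStream

object ToChunkedByteBufferOnce {
  def main(args: Array[String]): Unit = {
    // Small 4 KB chunks so the 10 KB payload spans several chunks.
    val bbos = new ChunkedByteBufferOutputStream(4 * 1024, ByteBuffer.allocate)
    bbos.write(Array.fill[Byte](10 * 1024)(1))
    bbos.close()

    // The first conversion succeeds and takes ownership of the chunks.
    val buffer = bbos.toChunkedByteBuffer
    println(s"buffer size = ${buffer.size}")

    // A second conversion is rejected by a require(...) guard inside the
    // stream, which is why the unroll path may only convert once.
    try {
      bbos.toChunkedByteBuffer
    } catch {
      case e: IllegalArgumentException =>
        println(s"second call rejected: ${e.getMessage}")
    }
  }
}
```

That is why the entry has to be built exactly once, after we know whether the
unroll succeeded, rather than once for the size estimate and again at L802.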
---