Github user ConeyLiu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19285#discussion_r163462053
--- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -233,17 +235,13 @@ private[spark] class MemoryStore(
     }
     if (keepUnrolling) {
-      // We successfully unrolled the entirety of this block
-      val arrayValues = vector.toArray
-      vector = null
-      val entry =
-        new DeserializedMemoryEntry[T](arrayValues, SizeEstimator.estimate(arrayValues), classTag)
-      val size = entry.size
+      // We need a more precise value
+      val size = valuesHolder.estimatedSize(false)
--- End diff --
I changed the code back to the original. For `DeserializedValuesHolder`, we
could call `buildEntry` and read the `size` from the resulting `MemoryEntry`.
But for `SerializedValuesHolder` that approach does not work correctly,
because getting the `MemoryEntry` object requires calling
`bbos.toChunkedByteBuffer`. If the reserved memory is not enough to transfer
the unroll memory to storage memory, the unroll fails and we need to call
`bbos.toChunkedByteBuffer` again
([L802](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala#L802);
this appears intentional and is related to #15043). So the problem is that we
would call `bbos.toChunkedByteBuffer` twice, which is not allowed.
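
To make the constraint concrete, here is a minimal sketch of the one-shot
behaviour of `ChunkedByteBufferOutputStream.toChunkedByteBuffer`. The object
name and package placement are illustrative only; the file has to live under
the `org.apache.spark` package because the class is `private[spark]`:

```scala
package org.apache.spark.demo // hypothetical subpackage, needed for private[spark] access

import java.nio.ByteBuffer

import org.apache.spark.util.io.ChunkedByteBufferOutputStream

object ToChunkedByteBufferOnce {
  def main(args: Array[String]): Unit = {
    // Small 4 KB chunks so the 10 KB payload spans several chunks.
    val bbos = new ChunkedByteBufferOutputStream(4 * 1024, ByteBuffer.allocate)
    bbos.write(Array.fill[Byte](10 * 1024)(1))
    bbos.close()

    // The first conversion succeeds and takes ownership of the chunks.
    val buffer = bbos.toChunkedByteBuffer
    println(s"buffer size = ${buffer.size}")

    // A second conversion is rejected by a require(...) guard inside the
    // stream, which is why the unroll path may only convert once.
    try {
      bbos.toChunkedByteBuffer
    } catch {
      case e: IllegalArgumentException =>
        println(s"second call rejected: ${e.getMessage}")
    }
  }
}
```

That is why the entry has to be built exactly once, after we know whether the
unroll succeeded, rather than once for the size estimate and again at L802.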
---