Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/23058#discussion_r234326643
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -693,9 +693,9 @@ private[spark] class BlockManager(
    */
   private def getRemoteValues[T: ClassTag](blockId: BlockId): Option[BlockResult] = {
     val ct = implicitly[ClassTag[T]]
-    getRemoteBytes(blockId).map { data =>
+    getRemoteManagedBuffer(blockId).map { data =>
       val values =
-        serializerManager.dataDeserializeStream(blockId, data.toInputStream(dispose = true))(ct)
+        serializerManager.dataDeserializeStream(blockId, data.createInputStream())(ct)
--- End diff ---
I noticed one detail here that we're going to have to think very carefully
about: we're losing `dispose = true`, which previously ensured the
`ChunkedByteBuffer`'s backing memory was freed once the stream had been fully
read. One way we could still end up reading from a `ChunkedByteBuffer` is if
`data` is a `BlockManagerManagedBuffer` whose `data: BlockData` is a
`ByteBufferBlockData`, which explicitly uses `dispose = false`:
https://github.com/apache/spark/blob/2aef79a65a145b76a88f1d4d9367091fd238b949/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L90-L94
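For reference, the linked lines read roughly as follows (excerpted from
`BlockManager.scala` at that commit, trimmed to the relevant part):

```scala
private[spark] class ByteBufferBlockData(
    val buffer: ChunkedByteBuffer,
    val shouldDispose: Boolean) extends BlockData {

  // The stream never disposes the buffer here, regardless of shouldDispose.
  override def toInputStream(): InputStream = buffer.toInputStream(dispose = false)

  // ... rest of the class elided
```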
I need to look through the usage a bit more, but I think we might be fine,
since if you're getting a remote block it can't be a `BlockManagerManagedBuffer`
anyway.
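If it does turn out we need the old behavior, one option would be to wrap the
buffer's stream so the buffer is released on `close()`. A minimal sketch,
assuming the consumer reliably closes the stream (the `DisposingInputStream`
name and this approach are hypothetical, not part of this PR):

```scala
import java.io.FilterInputStream

import org.apache.spark.network.buffer.ManagedBuffer

// Hypothetical sketch: approximate the old `toInputStream(dispose = true)`
// semantics by releasing the ManagedBuffer once the stream is closed.
class DisposingInputStream(buf: ManagedBuffer)
  extends FilterInputStream(buf.createInputStream()) {

  override def close(): Unit = {
    try {
      super.close()
    } finally {
      buf.release() // drop our reference so the backing memory can be freed
    }
  }
}
```

`getRemoteValues` could then pass `new DisposingInputStream(data)` to
`dataDeserializeStream`, though that only helps if the deserialization path
actually closes the stream when it's done.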
---