Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/23058#discussion_r234779890
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -693,9 +693,9 @@ private[spark] class BlockManager(
    */
   private def getRemoteValues[T: ClassTag](blockId: BlockId): Option[BlockResult] = {
     val ct = implicitly[ClassTag[T]]
-    getRemoteBytes(blockId).map { data =>
+    getRemoteManagedBuffer(blockId).map { data =>
       val values =
-        serializerManager.dataDeserializeStream(blockId, data.toInputStream(dispose = true))(ct)
+        serializerManager.dataDeserializeStream(blockId, data.createInputStream())(ct)
--- End diff ---
I'm pretty sure this is OK. If you're getting remote bytes, you're never
going to get a `BlockManagerManagedBuffer`. We should probably add that as an
assert, though, to make sure.
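Something like the following, perhaps (just a rough sketch of the suggested assert, not code from this PR; the helper name `assertNotLocal` is made up, and it assumes the check lives in `org.apache.spark.storage` since `BlockManagerManagedBuffer` is package-private there):

```scala
package org.apache.spark.storage

import org.apache.spark.network.buffer.ManagedBuffer

// Hypothetical helper: fail fast if a remote fetch ever hands back a
// locally backed BlockManagerManagedBuffer.
private[storage] object RemoteBufferChecks {
  def assertNotLocal(blockId: BlockId, data: ManagedBuffer): ManagedBuffer = {
    assert(!data.isInstanceOf[BlockManagerManagedBuffer],
      s"unexpected BlockManagerManagedBuffer for remote fetch of $blockId")
    data
  }
}
```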
What still puzzles me, though, is why there was ever a `dispose=true` here. That is something I'd still like to look into. I have a hard time believing it could do anything useful, as it's not guaranteed to get called in a `finally` or a taskCompletionListener, etc.
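To spell out what I mean by deterministic cleanup, something along these lines (a rough sketch only; `readWithDeterministicCleanup` and its signature are made up for illustration, not what this PR does): register the release up front on the task, instead of hoping the stream gets fully read.

```scala
import java.io.InputStream

import org.apache.spark.TaskContext
import org.apache.spark.network.buffer.ManagedBuffer

// Hypothetical sketch: tie the buffer's lifetime to the task instead of relying
// on the input stream being fully consumed. The release is registered before we
// start reading, so it runs even if deserialization stops early or the task fails.
def readWithDeterministicCleanup[T](
    buf: ManagedBuffer)(deserialize: InputStream => Iterator[T]): Iterator[T] = {
  val tc = TaskContext.get()
  if (tc != null) {
    buf.retain()  // hold a reference for the duration of the task
    tc.addTaskCompletionListener[Unit](_ => buf.release())  // guaranteed cleanup
  }
  val in = buf.createInputStream()
  try {
    deserialize(in)
  } catch {
    case t: Throwable =>
      in.close()  // close eagerly if we fail before handing out the iterator
      throw t
  }
}
```

With `dispose = true` on a `ChunkedByteBuffer` input stream, by contrast, the dispose only happens once the stream is fully drained or explicitly closed, and nothing on this path guarantees either.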
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]