Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/17295#discussion_r108046997
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -56,6 +57,49 @@ private[spark] class BlockResult(
val bytes: Long)
/**
+ * Abstracts away how blocks are stored and provides different ways to read the underlying block
+ * data. Callers should call [[dispose()]] when they're done with the block.
+ */
+private[spark] trait BlockData {
+
+ def toInputStream(): InputStream
+
+ /**
+ * Returns a Netty-friendly wrapper for the block's data.
+ *
+ * @see [[ManagedBuffer#convertToNetty()]]
+ */
+ def toNetty(): Object
+
+ def toChunkedByteBuffer(allocator: Int => ByteBuffer): ChunkedByteBuffer
+
+ def toByteBuffer(): ByteBuffer
+
+ def size: Long
+
+ def dispose(): Unit
+
+}
+
+private[spark] class ByteBufferBlockData(val buffer: ChunkedByteBuffer) extends BlockData {
+
+ override def toInputStream(): InputStream = buffer.toInputStream(dispose = false)
+
+ override def toNetty(): Object = buffer.toNetty
+
+ override def toChunkedByteBuffer(allocator: Int => ByteBuffer): ChunkedByteBuffer = {
+ buffer.copy(allocator)
+ }
+
+ override def toByteBuffer(): ByteBuffer = buffer.toByteBuffer
+
+ override def size: Long = buffer.size
+
+ override def dispose(): Unit = buffer.unmap()
--- End diff --
BTW I'm really starting to think the fix in #16499, while technically
correct, is more confusing than it should be. The problem is not that the code
was disposing of off-heap buffers; the problem is that buffers read from the
memory store should not be disposed of, while buffers read from the disk store
should.
So it's not really a matter of dispose vs. unmap, but a matter of where the
buffer comes from. (Which is kinda what I had in this patch with the
`autoDispose` parameter to `ByteBufferBlockData`. Perhaps I should revive that
and get rid of `StorageUtils.unmap`, which is just confusing.)
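To make the "where the buffer comes from" idea concrete, here is a rough sketch of how an `autoDispose` flag might look. Only the names `BlockData`, `ByteBufferBlockData`, and `autoDispose` come from the discussion above; `TrackedBuffer`, `BlockDataSketch`, `fromMemoryStore`, and `fromDiskStore` are hypothetical stand-ins made up for illustration, and none of this is the actual Spark code.

```scala
import java.nio.ByteBuffer

// Hypothetical stand-in for a buffer that may need explicit cleanup
// (in Spark this role would be played by ChunkedByteBuffer).
final class TrackedBuffer(val bytes: ByteBuffer) {
  private var released = false
  def release(): Unit = { released = true }
  def isReleased: Boolean = released
}

// Trimmed-down version of the trait in the diff; only what the sketch needs.
trait BlockData {
  def size: Long
  def dispose(): Unit
}

// Assumption: autoDispose is chosen by whoever produced the buffer, so
// callers can always call dispose() without knowing the buffer's origin.
final class ByteBufferBlockData(val buffer: TrackedBuffer, autoDispose: Boolean)
    extends BlockData {
  override def size: Long = buffer.bytes.remaining().toLong

  // Releases the buffer only when the producing store asked for it.
  override def dispose(): Unit = if (autoDispose) buffer.release()
}

object BlockDataSketch {
  // Buffers read from the memory store are still owned by that store: never release them.
  def fromMemoryStore(buf: TrackedBuffer): BlockData =
    new ByteBufferBlockData(buf, autoDispose = false)

  // Buffers read from the disk store exist only for this read: release on dispose().
  def fromDiskStore(buf: TrackedBuffer): BlockData =
    new ByteBufferBlockData(buf, autoDispose = true)
}
```

With something along these lines, the decision lives at the place that knows the buffer's origin, and the caller-side contract stays "always call `dispose()`", which is what would make a separate `StorageUtils.unmap` unnecessary.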