Github user mallman commented on a diff in the pull request:
https://github.com/apache/spark/pull/16499#discussion_r107028767
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -1048,7 +1065,7 @@ private[spark] class BlockManager(
try {
replicate(blockId, bytesToReplicate, level, remoteClassTag)
} finally {
- bytesToReplicate.dispose()
+ bytesToReplicate.unmap()
--- End diff --
The best we can expect from such a flag, I think, is a hint. The constructor
of a `ChunkedByteBuffer` will not always know whether the underlying byte
buffers are memory-mapped. For example, see
https://github.com/apache/spark/blob/bec6b16c1900fe93def89cc5eb51cbef498196cb/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L326.
At that call site, `data.nioByteBuffer()` might or might not be memory-mapped.
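To illustrate why I'd rather detect mapped-ness at release time than thread a
flag through every constructor, this is roughly the kind of check I have in
mind (a sketch only; the object and method names are illustrative and the
actual patch may differ in details):

```scala
import java.nio.{ByteBuffer, MappedByteBuffer}

// Sketch only: decide at release time whether a buffer is backed by a
// memory-mapped file, instead of relying on a constructor-time hint.
object MappedBufferCheck {
  def isMemoryMappedFile(buffer: ByteBuffer): Boolean = buffer match {
    case mapped: MappedByteBuffer =>
      // On JDK 8, DirectByteBuffer also extends MappedByteBuffer, so check the
      // (package-private) `fd` field reflectively: it is non-null only when
      // the buffer was actually mapped from a file.
      val fdField = classOf[MappedByteBuffer].getDeclaredField("fd")
      fdField.setAccessible(true)
      fdField.get(mapped) != null
    case _ =>
      false
  }
}
```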
I still think the current patch is the best of the options we've considered.
I can add a unit test for `StorageUtils.unmap` to ensure it works as expected,
i.e. that it only disposes memory-mapped buffers. I can also wrap the call to
`bytesToReplicate.unmap()` in an `if` clause so that it only runs when the
replication storage level is off-heap; that way the reflective call on the `fd`
field only occurs for off-heap replication (both sketched below). Given that
off-heap replication is currently broken, I doubt anyone will notice a
performance regression... Besides that, I suspect that network and disk IO
will dominate the cost of the reflective call.
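For the guard, I mean something like this (a sketch only; the exact condition
and the else branch are illustrative and may not be what ends up in the patch):

```scala
try {
  replicate(blockId, bytesToReplicate, level, remoteClassTag)
} finally {
  if (level.useOffHeap) {
    // Only take the unmap path (and its reflective fd check) for off-heap
    // replication; otherwise keep the existing dispose behavior.
    bytesToReplicate.unmap()
  } else {
    bytesToReplicate.dispose()
  }
}
```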
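And for the test, roughly this kind of thing (assuming `StorageUtils.unmap`
takes a `ByteBuffer` like `StorageUtils.dispose` does; the suite name and
assertions are illustrative):

```scala
package org.apache.spark.storage

import java.io.{File, RandomAccessFile}
import java.nio.ByteBuffer
import java.nio.channels.FileChannel.MapMode

import org.apache.spark.SparkFunSuite

class StorageUtilsSuite extends SparkFunSuite {

  test("unmap leaves a plain direct buffer untouched") {
    val direct = ByteBuffer.allocateDirect(16)
    StorageUtils.unmap(direct)
    // A direct buffer that is not file-mapped must remain usable after the
    // call; if unmap had freed it, the access below could crash the JVM.
    direct.put(0, 1.toByte)
    assert(direct.get(0) === 1.toByte)
  }

  test("unmap accepts a file-backed MappedByteBuffer") {
    val file = File.createTempFile("storage-utils-unmap", ".bin")
    val raf = new RandomAccessFile(file, "rw")
    try {
      raf.setLength(16)
      val mapped = raf.getChannel.map(MapMode.READ_WRITE, 0, 16)
      // Touching the buffer after unmapping is undefined behavior, so the
      // most this test asserts is that the call completes without throwing.
      StorageUtils.unmap(mapped)
    } finally {
      raf.close()
      file.delete()
    }
  }
}
```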