Github user squito commented on the pull request:
https://github.com/apache/spark/pull/5400#issuecomment-106919325
Thanks for the renewed interest. I'd also like reviewers to consider
one aspect of the design that I went back and forth on a lot. Currently,
`LargeByteBuffer` has an `asByteBuffer` method -- that's so that if some parts
of the code don't support large blocks (e.g., replication & shuffles), we can
still fall back on the existing code that uses ByteBuffers when possible.
However, that leads to some inefficiencies -- e.g., `WrappedLargeByteBuffer`
is forced to always create chunks as large as possible. That forces allocating
big byte arrays, even though the implementation would otherwise be happy with
multiple modestly sized ones. Similarly, `LargeByteBufferOutputStream` is
forced to do extra copying to go from its smaller buffers to one large one. I
don't see any good alternatives, but I'd like other opinions.
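To make the trade-off concrete, here's a rough sketch (simplified, not the
actual code in this PR -- the class shape and method bodies here are just
illustrative) of why supporting `asByteBuffer` pushes toward maximal chunks
or an extra copy:

```java
import java.nio.ByteBuffer;

// Illustrative only: a multi-chunk buffer that must sometimes collapse
// down to a single ByteBuffer for legacy code paths.
public class WrappedLargeByteBufferSketch {
    private final ByteBuffer[] chunks;

    public WrappedLargeByteBufferSketch(ByteBuffer[] chunks) {
        this.chunks = chunks;
    }

    /**
     * Fallback for code paths that only understand ByteBuffer (e.g.,
     * replication and shuffles). Cheap only when there is exactly one
     * chunk; otherwise we must copy everything into one big array --
     * which is why the real implementation is pushed toward creating
     * chunks as large as possible in the first place.
     */
    public ByteBuffer asByteBuffer() {
        if (chunks.length == 1) {
            return chunks[0].duplicate();
        }
        long total = 0;
        for (ByteBuffer c : chunks) {
            total += c.remaining();
        }
        if (total > Integer.MAX_VALUE) {
            throw new UnsupportedOperationException(
                "over 2GB of data cannot be exposed as a single ByteBuffer");
        }
        // The extra copy described above: merge the modest chunks into
        // one large buffer purely to satisfy ByteBuffer-based callers.
        ByteBuffer merged = ByteBuffer.allocate((int) total);
        for (ByteBuffer c : chunks) {
            merged.put(c.duplicate());
        }
        merged.flip();
        return merged;
    }
}
```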
I need to refresh myself a little on how this fits into the other changes I
have for using this to cache large blocks, but I will work on making all the
updates suggested here.