GitHub user srowen commented on the pull request:
https://github.com/apache/spark/pull/397#issuecomment-40279192
You could deprecate and override `toByteArray` to throw an exception, etc.,
to be extra-safe. They "work", the result just may not have much meaning
independently. Your class still has methods like `close()` either way. Dunno,
still seems simpler than the duplication.
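To make the "deprecate and override" idea concrete, here is a minimal sketch. It assumes the class extends `java.io.ByteArrayOutputStream`; the class name `LargeByteArrayOutputStream` is hypothetical and not taken from the PR.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical subclass; the name is illustrative, not from the PR.
class LargeByteArrayOutputStream extends ByteArrayOutputStream {

    /**
     * @deprecated the contents may not fit in, or be meaningful as,
     *             a single byte[]; read the data through another accessor.
     */
    @Deprecated
    @Override
    public synchronized byte[] toByteArray() {
        // Fail loudly instead of returning a result with unclear meaning.
        throw new UnsupportedOperationException(
            "toByteArray() is not supported on this stream");
    }
}
```

Inherited methods such as `write()` and `close()` keep working; only the accessor whose result would be misleading is blocked.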
What's the compaction for? If you've got a series of ~2GB containers, I'd
assume you'd fill them each pretty completely and transparently split a big
write across the existing and next buffer. It saves a huge allocation, which
could fail.
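A rough sketch of what splitting a write across the existing and next buffer might look like. All names here (`ChunkedByteSink`, `chunkSize`, and so on) are hypothetical, and a real version would use chunks close to 2GB rather than the tiny size you would pick for a test.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a sink backed by fixed-size chunks, filled completely
// before the next chunk is allocated. No compaction or large copy is needed.
class ChunkedByteSink {
    private final int chunkSize;
    private final List<byte[]> chunks = new ArrayList<>();
    private int posInLast; // bytes used in the last chunk

    ChunkedByteSink(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
        this.posInLast = chunkSize; // forces allocation on the first write
    }

    // Copies len bytes from src, spilling into new chunks as needed.
    void write(byte[] src, int off, int len) {
        while (len > 0) {
            if (posInLast == chunkSize) {
                chunks.add(new byte[chunkSize]);
                posInLast = 0;
            }
            int n = Math.min(len, chunkSize - posInLast);
            System.arraycopy(src, off, chunks.get(chunks.size() - 1), posInLast, n);
            posInLast += n;
            off += n;
            len -= n;
        }
    }

    // Total bytes written; a long, since it can exceed Integer.MAX_VALUE.
    long size() {
        return chunks.isEmpty() ? 0L : (long) (chunks.size() - 1) * chunkSize + posInLast;
    }

    int chunkCount() {
        return chunks.size();
    }

    byte get(long index) {
        if (index < 0 || index >= size()) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return chunks.get((int) (index / chunkSize))[(int) (index % chunkSize)];
    }
}
```

Each write allocates at most one chunk at a time, so no single allocation ever grows past `chunkSize`.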
(In the grow() method, you would have to check that the new doubled size
hasn't overflowed!)
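For illustration, one way to guard the doubling step is to do the arithmetic in `long` and clamp the result. This is only a sketch: `BufferGrowth`, `grownCapacity`, and the `MAX_ARRAY_SIZE` cap are assumptions, not the PR's code. The cap of `Integer.MAX_VALUE - 8` is a conservative convention, since some JVMs cannot allocate arrays of exactly `Integer.MAX_VALUE` elements.

```java
// Hypothetical helper; names and the cap are assumptions for illustration.
class BufferGrowth {
    // Conservative upper bound on array length (assumption, see lead-in).
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Returns a capacity >= minCapacity, doubling oldCapacity when possible.
    static int grownCapacity(int oldCapacity, int minCapacity) {
        if (oldCapacity < 0) {
            throw new IllegalArgumentException("negative oldCapacity");
        }
        // A negative minCapacity means the caller's own int arithmetic overflowed.
        if (minCapacity < 0 || minCapacity > MAX_ARRAY_SIZE) {
            throw new OutOfMemoryError("required capacity exceeds array limit");
        }
        // Doubling in long cannot wrap around to a negative int.
        long doubled = 2L * oldCapacity;
        long target = Math.max(doubled, (long) minCapacity);
        return (int) Math.min(target, (long) MAX_ARRAY_SIZE);
    }
}
```

With plain `int` arithmetic, doubling a capacity of `1 << 30` wraps to a negative number; here it is clamped to the cap instead.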
I agree with the use of `ByteBuffer`, but I suppose I'm pointing out that it has
to be used in several other places in the code that use `byte[]` right now in
order to get the benefit. I understand that wasn't the direct purpose of the
code you're working on, but it is the purpose of this PR, I think. In which case,
it's perhaps better to leverage your direction.
A simpler step in your direction could be the basis for the change that
this PR is trying for. That's why I wonder if this piece could have a simpler,
stand-alone purpose.