Hello Team,

While arrow-format/FlatBuffers supports zero copy, the current arrow-vector Java implementation does not. I tried to hack in zero copy for read-only scenarios, but ran into two main blockers:
1. ArrowBuf is the only buffer implementation, and it is used throughout ArrowReader, ArrowRecordBatch, and the vectors. It is final, so there is no way for me to override its logic in order to wrap an existing buffer. ArrowBuf is clearly necessary for write scenarios, but for reads, if I already have the data in a ByteBuffer or a Netty ByteBuf, I would like to avoid the copy that creates an ArrowBuf from it (I am mostly referring to the deserialization logic inside MessageSerializer).

2. As a result of item 1, the only layer that seems reusable is arrow-format. I would then have to implement what is effectively a read-only copy of arrow-vector that references the existing buffer. Setting aside the effort involved, that makes it hard to keep up with future changes and fixes made to arrow-vector.

I am wondering whether you have given any thought to such read-only scenarios. Any suggestions on how I could approach this myself?

Thanks.

[ Full content available at: https://github.com/apache/arrow/issues/2516 ]
This message was relayed via gitbox.apache.org
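
To illustrate the kind of read-only wrapping described in item 1, here is a minimal sketch that uses only java.nio. ReadOnlyIntView and ZeroCopyDemo are hypothetical names, not part of arrow-vector, and the sketch assumes the buffer holds nothing but little-endian 32-bit values with no validity bitmap. It shows only that a view can share memory with an existing ByteBuffer instead of copying it; it is not a proposal for how ArrowBuf itself should change.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical read-only view over 32-bit ints already present in a buffer. */
final class ReadOnlyIntView {
    private final ByteBuffer data;

    ReadOnlyIntView(ByteBuffer source) {
        // slice() shares the backing memory with source, so no bytes are copied.
        // asReadOnlyBuffer() makes any write through this view throw
        // ReadOnlyBufferException. Byte order is not inherited by the new
        // buffers, so it is set explicitly (assumption: little-endian data).
        this.data = source.slice().asReadOnlyBuffer().order(ByteOrder.LITTLE_ENDIAN);
    }

    int valueCount() {
        return data.remaining() / Integer.BYTES;
    }

    int get(int index) {
        // Absolute read: does not move the buffer position.
        return data.getInt(index * Integer.BYTES);
    }
}

final class ZeroCopyDemo {
    public static void main(String[] args) {
        // Stand-in for bytes that arrived from the network or a mapped file.
        ByteBuffer wire = ByteBuffer.allocateDirect(3 * Integer.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        wire.putInt(10).putInt(20).putInt(30);
        wire.flip();

        ReadOnlyIntView view = new ReadOnlyIntView(wire);

        // Writing through the original buffer is visible through the view,
        // which demonstrates that the memory is shared rather than copied.
        wire.putInt(0, 99);
        System.out.println(view.get(0) + " of " + view.valueCount() + " values");
    }
}
```

A real read path would also have to handle validity and offset buffers as well as buffer lifetime, which is where ArrowBuf's reference counting comes in, so this covers only the simplest fixed-width case.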
