Hello Team,

I'm working on using Arrow as an intermediate format for transferring columnar data from server to client. The client only needs to read the data, so I would like to avoid any unnecessary copies. Looking into Arrow, the arrow-format/FlatBuffers layer does support zero copy, but the current arrow-vector Java implementation does not. I was trying to hack together zero copy for read-only scenarios, but ran into two main blockers:
1. ArrowBuf is the only buffer implementation, and it is used exclusively across ArrowReader/ArrowRecordBatch/Vectors. It's final, so there is no way for me to override its logic in order to wrap an existing buffer. Using ArrowBuf is absolutely necessary for write scenarios because of buffer allocation, but for reads I was hoping a vector could simply serve as a view on top of an existing memory buffer (like a Java ByteBuffer or a Netty ByteBuf). That seems safe for the read-only case.
2. As a result of #1 above, the only layer that seems reusable is arrow-format. I would then have to implement what is effectively a read-only copy of arrow-vector that references the existing buffer. Setting aside the effort of doing that, it would leave a big gap to keep up with future changes and fixes made to arrow-vector.

Have you put any thought into such read-only scenarios? Any suggestions on how I could approach this myself?

Thanks
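
P.S. To make the read-only idea in blocker 1 concrete, here is a rough sketch of the kind of view I have in mind, written against plain java.nio only. ReadOnlyIntView is a hypothetical name and is not part of arrow-vector. The sketch assumes a single Int32 column whose validity bitmap and data buffer have already been located in the received message, laid out as the Arrow columnar format describes (little-endian values, validity bits numbered from the least significant bit, 1 meaning valid).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical read-only view over an Int32 column; NOT an arrow-vector class. */
final class ReadOnlyIntView {
    private final ByteBuffer validity; // 1 bit per value, least significant bit first
    private final ByteBuffer data;     // 4 bytes per value, little-endian
    private final int valueCount;

    ReadOnlyIntView(ByteBuffer validity, ByteBuffer data, int valueCount) {
        // slice() shares the underlying memory and asReadOnlyBuffer() rejects writes,
        // so no bytes are copied. The byte order is set last because derived
        // buffers start out big-endian.
        this.validity = validity.slice().asReadOnlyBuffer();
        this.data = data.slice().asReadOnlyBuffer().order(ByteOrder.LITTLE_ENDIAN);
        this.valueCount = valueCount;
    }

    int getValueCount() {
        return valueCount;
    }

    boolean isNull(int index) {
        checkIndex(index);
        // A cleared validity bit marks the slot as null.
        return (validity.get(index >> 3) & (1 << (index & 7))) == 0;
    }

    int get(int index) {
        checkIndex(index);
        return data.getInt(index * Integer.BYTES);
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= valueCount) {
            throw new IndexOutOfBoundsException("index " + index + ", count " + valueCount);
        }
    }

    public static void main(String[] args) {
        // Stand-in for bytes received from the server: values [10, null, 30].
        ByteBuffer validity = ByteBuffer.allocate(1);
        validity.put(0, (byte) 0b0000_0101); // slots 0 and 2 are valid
        ByteBuffer data = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN);
        data.putInt(0, 10).putInt(4, 0).putInt(8, 30);

        ReadOnlyIntView view = new ReadOnlyIntView(validity, data, 3);
        for (int i = 0; i < view.getValueCount(); i++) {
            System.out.println(view.isNull(i) ? "null" : Integer.toString(view.get(i)));
        }
    }
}
```

Because the view only slices the incoming buffers, a later change to the source bytes is visible through it, which is the zero-copy behaviour I am after. A real version would need one such view per vector type, which is exactly the duplication described in blocker 2.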
