Interesting, so basically I can still use the public constructor

public ArrowBuf(AtomicInteger refCnt, BufferLedger ledger, UnsafeDirectLittleEndian byteBuf, BufferManager manager, ArrowByteBufAllocator alloc, int offset, int length, boolean isEmpty)

Instead of overriding ArrowBuf itself, I would override BufferLedger/UnsafeDirectLittleEndian/BufferManager to make it reference an existing buffer. That is a much more plausible option, as it will reuse the Vectors. All I need is to implement my own deserializer. Did I get you right?

Thanks

On Fri, Sep 7, 2018 at 7:09 AM Jacques Nadeau <[email protected]> wrote:

> It is on purpose that the ArrowBuf is final. It is done to ensure a single
> impl and for performance reasons. ArrowBuf is primarily a memory address and
> a length and wants zero indirection to the reading/writing of that.
>
> It does, however, wrap several types of substructures as long as they have
> that property. For example, an ArrowBuf almost always currently wraps a
> Netty UnsafeDirectLittleEndian object. At that level you could propose a
> way to wrap more types of memory addresses+lengths.
>
> On Thu, Sep 6, 2018, 10:26 PM Zhenyuan Zhao <[email protected]> wrote:
>
> > Hello Team,
> >
> > I'm working on using arrow as intermediate format for transferring
> > columnar data from server to client. In this case, the client will only
> > need to read from the format so I would like to avoid any unnecessary
> > copy of the data. Looking into arrow, while arrow-format/flatbuffers does
> > support zero copy, current arrow-vector java implementation is not. I was
> > trying to hack zero copy for readonly scenarios, but saw two main
> > blockers:
> >
> > 1. ArrowBuf is the only buffer implementation used exclusively across
> >    ArrowReader/ArrowRecordBatch/Vectors. It's final, which means there
> >    isn't a way for me to override its logic in order to wrap some
> >    existing buffer. It's absolutely necessary to use ArrowBuf for write
> >    scenarios due to buffer allocation, but for read, I was hoping vector
> >    can just serve as view on top of existing memory buffer (like java
> >    ByteBuffer or netty ByteBuf). Seems safe for read only case.
> >
> > 2. As a result of #1 described above, the only layer which seems
> >    reusable is the arrow-format. Then I have to implement effectively a
> >    readonly copy of arrow-vector that references existing buffer. Put
> >    aside the effort doing that, it introduces a big gap to keep up with
> >    future changes/fixes made to arrow-vector.
> >
> > Wondering if you guys have put any thoughts into such readonly scenarios.
> > Any suggestion how I can approach this myself?
> >
> > Thanks
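
For illustration only, the "vector as a read-only view on top of an existing buffer" idea from the original message can be sketched with plain java.nio, without touching Arrow at all. The class name ReadOnlyIntView is hypothetical and is not part of arrow-vector. The sketch assumes the buffer holds fixed-width little-endian 32-bit integers and ignores validity bitmaps and reference counting.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical read-only view over int32 values that live in an existing buffer. */
final class ReadOnlyIntView {
    // Shares memory with the source buffer; no bytes are copied.
    private final ByteBuffer data;
    private final int valueCount;

    ReadOnlyIntView(ByteBuffer source, int byteOffset, int valueCount) {
        // asReadOnlyBuffer() creates an independent position/limit over the same memory.
        ByteBuffer window = source.asReadOnlyBuffer();
        window.position(byteOffset);
        window.limit(byteOffset + valueCount * Integer.BYTES);
        // slice() resets the byte order to big-endian, so set it explicitly afterwards.
        this.data = window.slice().order(ByteOrder.LITTLE_ENDIAN);
        this.valueCount = valueCount;
    }

    int get(int index) {
        if (index < 0 || index >= valueCount) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + valueCount);
        }
        return data.getInt(index * Integer.BYTES);
    }

    int size() {
        return valueCount;
    }
}

class ZeroCopyViewDemo {
    public static void main(String[] args) {
        // Stand-in for bytes received from the server.
        ByteBuffer wire = ByteBuffer.allocateDirect(16).order(ByteOrder.LITTLE_ENDIAN);
        wire.putInt(0, 7).putInt(4, 11).putInt(8, 13).putInt(12, 17);

        // View the two values that start at byte offset 4.
        ReadOnlyIntView view = new ReadOnlyIntView(wire, 4, 2);
        System.out.println(view.get(0) + ", " + view.get(1));

        // A later write to the source is visible through the view: the memory is shared.
        wire.putInt(4, 99);
        System.out.println(view.get(0));
    }
}
```

The view exposes no setters and is backed by a read-only duplicate, which is the property that makes sharing the memory reasonable for the read-only case. Whether something equivalent can be wired through the ArrowBuf constructor above is exactly the open question in this thread.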
