It seems like you should be able to construct an UnpooledUnsafeDirectByteBuf (linked below) from a MappedByteBuffer, and then wrap that with UnsafeDirectLittleEndian to get zero-copy access to a memory map. Does that sound right?
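For what it's worth, here is a minimal sketch of just the memory-map half of that idea, using only the JDK. It maps a file read-only and reads little-endian ints straight from the mapping, with no intermediate copy. The class and helper names (MappedLittleEndianSketch, writeInts, mapReadOnly, intAt) are made up for illustration, and the step of handing the buffer to Netty or ArrowBuf is deliberately left out because I have not verified which constructors are accessible there.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: JDK-only, no Netty or Arrow classes involved.
final class MappedLittleEndianSketch {

    /** Writes the given ints to a temp file in little-endian byte order. */
    static Path writeInts(int... values) throws IOException {
        Path file = Files.createTempFile("mmap-sketch", ".bin");
        file.toFile().deleteOnExit();
        ByteBuffer out = ByteBuffer.allocate(values.length * Integer.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (int v : values) {
            out.putInt(v);
        }
        Files.write(file, out.array());
        return file;
    }

    /** Maps the whole file read-only; the result is a view of the mapping, not a copy. */
    static ByteBuffer mapReadOnly(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            // The mapping remains valid after the channel is closed.
            return mapped.order(ByteOrder.LITTLE_ENDIAN);
        }
    }

    /** Reads the index-th 4-byte int using an absolute get (buffer position is untouched). */
    static int intAt(ByteBuffer buf, int index) {
        return buf.getInt(index * Integer.BYTES);
    }

    public static void main(String[] args) throws IOException {
        Path file = writeInts(1, 2, 3);
        ByteBuffer buf = mapReadOnly(file);
        System.out.println("direct=" + buf.isDirect() + " readOnly=" + buf.isReadOnly());
        for (int i = 0; i < buf.capacity() / Integer.BYTES; i++) {
            System.out.println(intAt(buf, i));
        }
    }
}
```

The mapped buffer is direct, so it has a stable native address and length, which is the property Jacques describes ArrowBuf as needing.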
https://github.com/netty/netty/blob/4.1/buffer/src/main/java/io/netty/buffer/UnpooledUnsafeDirectByteBuf.java

On Fri, Sep 7, 2018 at 12:46 PM Zhenyuan Zhao <[email protected]> wrote:

> Interesting, so basically I can still use the public constructor
>
> public ArrowBuf(AtomicInteger refCnt, BufferLedger ledger,
>     UnsafeDirectLittleEndian byteBuf, BufferManager manager,
>     ArrowByteBufAllocator alloc, int offset, int length, boolean isEmpty)
>
> Instead, override BufferLedger/UnsafeDirectLittleEndian/BufferManager to
> make it reference existing buffer. That is a much more plausible option as
> it will reuse the Vectors. All I need is to implement my own deserializer.
> Did I get you right?
>
> Thanks
>
> On Fri, Sep 7, 2018 at 7:09 AM Jacques Nadeau <[email protected]> wrote:
>
> > It is on purpose that the ArrowBuf is final. It is done to ensure a single
> > impl and performance reasons. ArrowBuf is primarily a memory address and a
> > length and wants zero indirection to the reading/writing of that.
> >
> > It does, however, wrap several types of substructures as long as they have
> > that property. For example, an ArrowBuf almost always currently wraps a
> > Netty UnsafeDirectLittleEndian object. At that level you could propose a
> > way to wrap more types of memory addresses+lengths.
> >
> > On Thu, Sep 6, 2018, 10:26 PM Zhenyuan Zhao <[email protected]> wrote:
> >
> > > Hello Team,
> > >
> > > I'm working on using arrow as intermediate format for transferring
> > > columnar data from server to client. In this case, the client will only
> > > need to read from the format so I would like to avoid any unnecessary
> > > copy of the data. Looking into arrow, while arrow-format/flatbuffers
> > > does support zero copy, current arrow-vector java implementation is
> > > not. I was trying to hack zero copy for readonly scenarios, but saw two
> > > main blockers:
> > >
> > > 1. ArrowBuf is the only buffer implementation used exclusively across
> > >    ArrowReader/ArrowRecordBatch/Vectors. It's final, which means there
> > >    isn't a way for me to override its logic in order to wrap some
> > >    existing buffer. It's absolutely necessary to use ArrowBuf for write
> > >    scenarios due to buffer allocation, but for read, I was hoping vector
> > >    can just serve as view on top of existing memory buffer (like java
> > >    ByteBuffer or netty ByteBuf). Seems safe for read only case.
> > > 2. As a result of #1 described above, the only layer which seems
> > >    reusable is the arrow-format. Then I have to implement effectively a
> > >    readonly copy of arrow-vector that references existing buffer. Put
> > >    aside the effort doing that, it introduces a big gap to keep up with
> > >    future changes/fixes made to arrow-vector.
> > >
> > > Wondering if you guys have put any thoughts into such readonly scenarios.
> > > Any suggestion how I can approach this myself?
> > >
> > > Thanks
