I ran into this issue as well. See https://issues.apache.org/jira/projects/ARROW/issues/ARROW-11696. As I understand it, https://github.com/jorgecarleitao/arrow2 will make reading more efficient.
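
To make the distinction in the quoted message concrete, here is a minimal sketch using only the Rust standard library. It does not use the arrow crate: `CopiedBuffer` and `SharedBuffer` are hypothetical names of mine, not types from arrow or arrow2. The first mimics the behaviour described for `Buffer::from(&[u8])` (allocate, then copy), and the second shows one way a buffer could keep pointing into the IPC message by sharing ownership of it.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical stand-in for a buffer that owns a private copy of its bytes,
/// as the quoted message describes for `Buffer::from(&[u8])`.
struct CopiedBuffer {
    data: Vec<u8>,
}

impl CopiedBuffer {
    fn from_slice(bytes: &[u8]) -> Self {
        // Allocates new memory and copies every byte of the slice into it.
        CopiedBuffer { data: bytes.to_vec() }
    }

    fn as_slice(&self) -> &[u8] {
        &self.data
    }
}

/// Hypothetical stand-in for a zero-copy buffer: it shares ownership of the
/// whole message and remembers which byte range it covers.
struct SharedBuffer {
    message: Arc<[u8]>,
    range: Range<usize>,
}

impl SharedBuffer {
    fn new(message: Arc<[u8]>, range: Range<usize>) -> Self {
        assert!(range.start <= range.end && range.end <= message.len());
        // Only the reference count changes; no bytes are copied.
        SharedBuffer { message, range }
    }

    fn as_slice(&self) -> &[u8] {
        &self.message[self.range.clone()]
    }
}

fn main() {
    // Pretend these bytes are the body of an IPC message.
    let message: Arc<[u8]> = Arc::from(vec![0u8, 1, 2, 3, 4, 5, 6, 7]);

    let copied = CopiedBuffer::from_slice(&message[2..6]);
    let shared = SharedBuffer::new(Arc::clone(&message), 2..6);

    // Both expose the same bytes...
    assert_eq!(copied.as_slice(), shared.as_slice());
    // ...but only the shared view still points into the message's memory.
    assert_ne!(copied.as_slice().as_ptr(), message[2..].as_ptr());
    assert_eq!(shared.as_slice().as_ptr(), message[2..].as_ptr());
}
```

The trade-off in the shared variant is that the whole message stays alive for as long as any buffer refers to it. A real implementation would also have to deal with alignment, which this sketch ignores.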

On Apr 6, 2021 at 07:30:54, Al Taylor <[email protected]> wrote:

> Hi,
>
> I was reading around the rust-arrow codebase, evaluating it for potential
> future use. I'm particularly interested in zero-copy processing.
>
> I could very well be wrong here, as I don't have a lot of rust experience,
> but it looks like the code for reading buffers out of IPC messages is
> copying the contents of the data.
>
> A buffer is constructed via Buffer::from(&[u8]) according to an
> ipc::Buffer object here:
>
> https://github.com/apache/arrow/blob/8e43f23dcc6a9e630516228f110c48b64d13cec6/rust/arrow/src/ipc/reader.rs#L39
>
> The Buffer::from method is defined here:
> https://github.com/apache/arrow/blob/8e43f23dcc6a9e630516228f110c48b64d13cec6/rust/arrow/src/buffer/immutable.rs#L222
> and appears to create an empty buffer, then copy data from the slice into
> it.
>
> As far as I'm aware, the C++ implementation of arrow does not copy buffer
> data out of IPC messages in this way.
>
> Is my understanding correct? Are there additional considerations in rust
> which means it's not possible for objects accessed by application code
> (e.g. record-batches, arrays) to keep their buffers in IPC messages?
>
> Thanks,
>
> Al
