paleolimbot opened a new issue, #91: URL: https://github.com/apache/arrow-nanoarrow/issues/91
The initial proof-of-concept IPC reader (#61) only decodes the Schema message. Support also needs to be added for the RecordBatch message type defined at https://github.com/apache/arrow/blob/master/format/Message.fbs#L81-L102 .

This should be a more or less mechanical (i.e., type-agnostic) process: nanoarrow already has logic to report the number of buffers for each type and to check the lengths of buffers. Some new checks might have to be added to nanoarrow (e.g., checking all offset values instead of just the last offset value) to ensure that downstream readers won't access memory they aren't supposed to.

The easiest way to implement this involves a copy operation for each buffer. I will have a closer look at the C++ implementation, but I am guessing that there is an option (or maybe it's the default) to make this a zero-copy operation (e.g., when the input buffer is memory-mapped). In that case the caller would have to provide a deallocator for the input buffer, and when constructing the buffers we'd have to use a shared pointer or something similar to make sure the input stays alive for the duration of the longest-lived buffer. Enabling zero-copy is probably best suited for a follow-up PR since it may involve at least one non-trivial build system update (using C++).
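To illustrate the kind of extra check mentioned above, here is a rough sketch of validating every value in a 32-bit offsets buffer rather than only the last one. The name `check_offsets32` and its signature are hypothetical and are not part of the nanoarrow API; this only shows the idea.

```c
#include <stdint.h>

// Hypothetical helper (not a nanoarrow function): returns 0 if all
// n_offsets values are non-negative, non-decreasing, and the last one
// does not exceed data_size_bytes; returns -1 otherwise.
static int check_offsets32(const int32_t* offsets, int64_t n_offsets,
                           int64_t data_size_bytes) {
  if (n_offsets <= 0) {
    return 0;  // nothing to validate
  }

  if (offsets[0] < 0) {
    return -1;
  }

  // Every offset must be >= the previous one. Combined with the two
  // boundary checks, this keeps every element inside the data buffer.
  for (int64_t i = 1; i < n_offsets; i++) {
    if (offsets[i] < offsets[i - 1]) {
      return -1;
    }
  }

  if (offsets[n_offsets - 1] > data_size_bytes) {
    return -1;
  }

  return 0;
}
```

Offsets such as `{0, 4, 1, 5}` with a 5-byte data buffer would pass a last-offset-only check but fail here, because the third value goes backwards.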

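For the zero-copy case, one possible shape for the "shared pointer or something" is a small reference-counted owner in plain C, which might avoid needing C++ at all. Everything below (`SharedInput`, `BufferView`, and the function names) is hypothetical and not taken from nanoarrow or the Arrow C++ implementation; the reference count is also not thread safe in this sketch.

```c
#include <stdint.h>
#include <stdlib.h>

// Hypothetical owner of the input bytes (e.g., a memory-mapped region).
struct SharedInput {
  const uint8_t* data;
  int64_t size_bytes;
  int64_t ref_count;                    // not thread safe in this sketch
  void (*release)(void* private_data);  // caller-provided deallocator
  void* private_data;
};

// Hypothetical zero-copy buffer: a slice of the input plus a reference to it.
struct BufferView {
  const uint8_t* data;
  int64_t size_bytes;
  struct SharedInput* owner;
};

static struct SharedInput* shared_input_create(const uint8_t* data,
                                               int64_t size_bytes,
                                               void (*release)(void*),
                                               void* private_data) {
  struct SharedInput* input = malloc(sizeof(struct SharedInput));
  if (input == NULL) return NULL;
  input->data = data;
  input->size_bytes = size_bytes;
  input->ref_count = 1;  // the creator holds the first reference
  input->release = release;
  input->private_data = private_data;
  return input;
}

// Drops one reference; the caller's deallocator runs when the last one goes.
static void shared_input_unref(struct SharedInput* input) {
  if (--input->ref_count == 0) {
    if (input->release != NULL) input->release(input->private_data);
    free(input);
  }
}

// Points view at input[offset, offset + length) without copying.
static int buffer_view_init(struct BufferView* view, struct SharedInput* input,
                            int64_t offset, int64_t length) {
  if (offset < 0 || length < 0 || offset > input->size_bytes ||
      length > input->size_bytes - offset) {
    return -1;  // slice would fall outside the input
  }
  view->data = input->data + offset;
  view->size_bytes = length;
  view->owner = input;
  input->ref_count++;
  return 0;
}

static void buffer_view_release(struct BufferView* view) {
  shared_input_unref(view->owner);
  view->data = NULL;
  view->size_bytes = 0;
  view->owner = NULL;
}

// Example deallocator: only records that it was called.
static void mark_released(void* private_data) { (*(int*)private_data)++; }
```

With this layout the reader can drop its own reference as soon as decoding finishes, and the input is only deallocated once the longest-lived buffer has been released.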