[GitHub] [arrow-nanoarrow] paleolimbot opened a new issue, #91: [Extension-IPC] Implement RecordBatch message decoding

GitBox Mon, 16 Jan 2023 10:52:02 -0800


paleolimbot opened a new issue, #91:
URL: https://github.com/apache/arrow-nanoarrow/issues/91


   The initial proof-of-concept IPC reader (#61) only decodes the Schema 
message. Support also needs to be added for the RecordBatch message type 
defined at 
https://github.com/apache/arrow/blob/master/format/Message.fbs#L81-L102 . This 
should be a more or less mechanical (i.e., type-agnostic) process: nanoarrow 
already provides logic to report the number of buffers for each type and to 
check the lengths of buffers. Some new checks might have to be added to 
nanoarrow (e.g., checking all offset values instead of just the last offset 
value) to ensure that downstream readers won't access memory they aren't 
supposed to.
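
   As a rough illustration of the "check all offset values" idea, here is a 
minimal sketch in plain C for a 32-bit offsets buffer. The helper name 
check_offsets32 and its parameters are hypothetical and are not part of 
nanoarrow; the sketch assumes the offsets buffer holds length + 1 values and 
that the size of the data buffer in bytes is already known.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not a nanoarrow function): validates every value in a
// 32-bit offsets buffer that describes `length` elements, i.e. that holds
// length + 1 offsets. Returns 0 if all offsets are safe to use, -1 otherwise.
static int check_offsets32(const int32_t* offsets, int64_t length,
                           int64_t data_size_bytes) {
  if (offsets == NULL || length < 0 || data_size_bytes < 0) {
    return -1;
  }

  // The first offset must not point before the start of the data buffer.
  if (offsets[0] < 0) {
    return -1;
  }

  // Every offset must be >= the previous one. Checking only the last offset
  // would miss a value in the middle that jumps past the end of the data.
  for (int64_t i = 0; i < length; i++) {
    if (offsets[i + 1] < offsets[i]) {
      return -1;
    }
  }

  // With a non-negative first offset and non-decreasing values, bounding the
  // last offset bounds all of them.
  if ((int64_t)offsets[length] > data_size_bytes) {
    return -1;
  }

  return 0;
}
```

   An offsets buffer such as {0, 9, 2, 5} with a 5-byte data buffer passes a 
last-offset-only check but is rejected here, because the second element would 
claim bytes 0 through 9.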
   
   The easiest way to implement this involves a copy operation for each buffer. 
I will have a closer look at the C++ implementation but I am guessing that 
there is an option - or maybe it's the default - to make this a zero-copy 
operation (e.g., in case the input buffer is memory-mapped). In this case the 
caller would have to provide a buffer deallocator for the input and when 
constructing the buffers we'd have to use a shared pointer or something to make 
sure it stays alive for the duration of the longest-lived buffer. Enabling the 
zero-copy bit is probably best suited for a follow-up PR since it may involve 
at least one non-trivial build system update (using C++).
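
   To make the lifetime requirement concrete, here is a minimal sketch of the 
shared-ownership idea using a manual reference count in plain C, as one 
possible alternative to a C++ shared pointer. All names here (shared_input, 
buffer_view, example_deallocator, and so on) are hypothetical and are not 
nanoarrow or Arrow C++ APIs. The count is not atomic, so the sketch assumes 
single-threaded use.

```c
#include <stdint.h>
#include <stdlib.h>

// Caller-provided callback that frees (or unmaps) the input region.
typedef void (*input_deallocator_t)(uint8_t* data, int64_t size_bytes,
                                    void* private_data);

// One input region shared by every buffer that points into it.
struct shared_input {
  uint8_t* data;
  int64_t size_bytes;
  int64_t ref_count;  // not atomic: single-threaded sketch only
  input_deallocator_t deallocator;
  void* private_data;
};

// A zero-copy buffer: a slice of the input plus a reference to its owner.
struct buffer_view {
  const uint8_t* data;
  int64_t size_bytes;
  struct shared_input* owner;
};

// Wraps the input; the returned object holds one reference for the caller.
static struct shared_input* shared_input_create(uint8_t* data,
                                                int64_t size_bytes,
                                                input_deallocator_t deallocator,
                                                void* private_data) {
  struct shared_input* input = (struct shared_input*)malloc(sizeof(*input));
  if (input == NULL) return NULL;
  input->data = data;
  input->size_bytes = size_bytes;
  input->ref_count = 1;
  input->deallocator = deallocator;
  input->private_data = private_data;
  return input;
}

// Drops one reference; the last release runs the caller's deallocator.
static void shared_input_release(struct shared_input* input) {
  if (--input->ref_count == 0) {
    if (input->deallocator != NULL) {
      input->deallocator(input->data, input->size_bytes, input->private_data);
    }
    free(input);
  }
}

// Points `view` at input[offset, offset + size_bytes) without copying.
// Returns -1 (and takes no reference) if the slice is out of bounds.
static int buffer_view_init(struct buffer_view* view,
                            struct shared_input* input, int64_t offset,
                            int64_t size_bytes) {
  if (offset < 0 || size_bytes < 0 || offset > input->size_bytes ||
      size_bytes > input->size_bytes - offset) {
    return -1;
  }
  view->data = input->data + offset;
  view->size_bytes = size_bytes;
  view->owner = input;
  input->ref_count++;
  return 0;
}

static void buffer_view_release(struct buffer_view* view) {
  if (view->owner != NULL) {
    shared_input_release(view->owner);
    view->owner = NULL;
    view->data = NULL;
  }
}

// Example deallocator for malloc()ed input; counts calls via private_data.
static void example_deallocator(uint8_t* data, int64_t size_bytes,
                                void* private_data) {
  (void)size_bytes;
  free(data);
  *(int*)private_data += 1;
}
```

   In this sketch the reader can drop its own reference as soon as decoding 
finishes; the input is only handed to the deallocator once the last buffer 
view has been released.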
   

