Wes McKinney created ARROW-5377:
-----------------------------------
Summary: [C++] Develop interface for writing a RecordBatch IPC
stream into pre-allocated space (e.g. memory map) that avoids unnecessary
serialization
Key: ARROW-5377
URL: https://issues.apache.org/jira/browse/ARROW-5377
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Wes McKinney
As discussed in a recent mailing list thread:
https://lists.apache.org/thread.html/b756209052fecb8c28a5eb37db7aecb82a5f5351fa79a9d86f0dba3e@%3Cuser.arrow.apache.org%3E
The only viable process at the moment for getting an accurate report of stream
size is to write a simulated stream using {{MockOutputStream}}. This is
suboptimal for a couple of reasons:
* Flatbuffers metadata must be created twice
* Record batch disassembly into IpcPayload must be performed twice
It seems like an interface with a very constrained public API could be provided
to deconstruct a sequence of RecordBatches and report the size of the resulting
IPC stream (based on metadata sizes and padding). The deconstructed set of IPC
payloads could then be written out to a stream (e.g. using
{{FixedSizeBufferWriter}}).
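To make the idea concrete, here is a minimal, self-contained sketch of the "deconstruct once, size, then write" flow. It does not use Arrow: {{Payload}}, {{PreparedStream}} and {{PadTo8}} are hypothetical names invented for illustration, and the framing (a 4-byte length prefix plus metadata, with everything padded to 8 bytes) is a simplifying assumption rather than the exact IPC format.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical stand-in for one deconstructed record batch: already-serialized
// metadata plus raw body buffers. This is NOT Arrow's actual IpcPayload.
struct Payload {
  std::vector<uint8_t> metadata;
  std::vector<std::vector<uint8_t>> body;
};

constexpr int64_t kAlignment = 8;  // assumed padding boundary

inline int64_t PadTo8(int64_t n) {
  return (n + kAlignment - 1) / kAlignment * kAlignment;
}

// Hypothetical two-phase writer: payloads are built once, the total stream
// size is tracked as they are added, and the bytes are copied out afterwards.
class PreparedStream {
 public:
  void Add(Payload p) {
    // Simplified framing: 4-byte length prefix + metadata, padded together.
    size_ += PadTo8(4 + static_cast<int64_t>(p.metadata.size()));
    for (const auto& buf : p.body) {
      size_ += PadTo8(static_cast<int64_t>(buf.size()));
    }
    payloads_.push_back(std::move(p));
  }

  // Exact number of bytes WriteTo() will produce.
  int64_t size() const { return size_; }

  // Copies the stream into pre-allocated space (e.g. a memory map).
  // Returns the bytes written, or -1 if the destination is too small.
  int64_t WriteTo(uint8_t* dest, int64_t capacity) const {
    if (capacity < size_) return -1;
    std::memset(dest, 0, static_cast<size_t>(size_));  // zero the padding bytes
    int64_t pos = 0;
    for (const auto& p : payloads_) {
      const int64_t framed = PadTo8(4 + static_cast<int64_t>(p.metadata.size()));
      const int32_t prefix = static_cast<int32_t>(framed - 4);
      std::memcpy(dest + pos, &prefix, 4);
      if (!p.metadata.empty()) {
        std::memcpy(dest + pos + 4, p.metadata.data(), p.metadata.size());
      }
      pos += framed;
      for (const auto& buf : p.body) {
        if (!buf.empty()) std::memcpy(dest + pos, buf.data(), buf.size());
        pos += PadTo8(static_cast<int64_t>(buf.size()));
      }
    }
    assert(pos == size_);  // the reported size must match what was written
    return pos;
  }

 private:
  std::vector<Payload> payloads_;
  int64_t size_ = 0;
};
```

A caller would add each payload, call {{size()}} to allocate or map the destination, and then call {{WriteTo()}} once. Metadata is built a single time, which is the property the proposed interface is after.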