hi Miki,

That link didn't work for me. Would it not be better to do this work in Apache Arrow rather than an external project? I would guess the community would be interested in this.
- Wes

On Mon, May 20, 2019 at 9:48 AM Miki Tebeka <[email protected]> wrote:
>
> OK, almost working. I get "Write out of bounds" when running the code at
> https://github.com/353solutions/carrow/blob/plasma/plasma.cc
>
> Any ideas?
>
> Full output:
> batch size = 224
> buf size = 224
> error: write: Write out of bounds
>
> On Mon, May 20, 2019 at 5:21 PM Miki Tebeka <[email protected]> wrote:
>>
>> Thanks Wes
>>
>> On Mon, May 20, 2019 at 4:24 PM Wes McKinney <[email protected]> wrote:
>>>
>>> See https://issues.apache.org/jira/browse/ARROW-5377
>>>
>>> On Mon, May 20, 2019 at 8:15 AM Wes McKinney <[email protected]> wrote:
>>> >
>>> > hi Miki,
>>> >
>>> > Steps
>>> >
>>> > * Convert the Table to a sequence of RecordBatch objects. You can use
>>> > arrow::TableBatchReader to do this [1]
>>> > * Write a stream using MockOutputStream [2]
>>> > * Use the reported size of the total stream to allocate memory in Plasma
>>> > * Write a real stream using arrow::io::FixedSizeBufferWriter
>>> >
>>> > At some point I'm interested in reducing the amount of boilerplate
>>> > associated with this process, and also in avoiding multiple metadata
>>> > serialization and record batch disassembly steps. I'll open a JIRA
>>> > issue.
>>> >
>>> > We'd be delighted if you would contribute to the C++ documentation at
>>> > https://github.com/apache/arrow/tree/master/docs/source/cpp
>>> >
>>> > - Wes
>>> >
>>> > [1]:
>>> > https://github.com/apache/arrow/blob/master/cpp/src/arrow/table.h#L340
>>> > [2]:
>>> > https://github.com/apache/arrow/blob/7a5562174cffb21b16f990f64d114c1a94a30556/cpp/src/arrow/io/memory.h#L89
>>> >
>>> > On Mon, May 20, 2019 at 7:24 AM Miki Tebeka <[email protected]> wrote:
>>> > >
>>> > > Hi,
>>> > >
>>> > > I'm looking for an example of how to store/retrieve an arrow::Table
>>> > > in plasma. The examples I see on the documentation site are for basic
>>> > > types.
>>> > >
>>> > > My end goal is to create data (Table) in C++, store it in plasma and
>>> > > read it from Python.
>>> > >
>>> > > From reading around, I need to allocate a buffer in plasma, but how can I
>>> > > find the size of the Table so I can allocate the buffer? And how can I
>>> > > serialize it into the created Buffer?
>>> > >
>>> > > Thanks,
>>> > > Miki
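
For readers following the thread, the two-pass pattern in Wes's steps (measure the whole stream with a counting sink, allocate exactly that much, then write for real into the fixed-size buffer) can be sketched in plain C++ without Arrow. This is only an illustration under stated assumptions: CountingSink, FixedBufferSink, WriteStream, MeasureStream and SerializeExact are hypothetical names, not Arrow API, and the toy framing (a 4-byte header, a 4-byte length prefix per batch, a 4-byte end marker) is invented here and is not the Arrow IPC format. CountingSink plays the role the thread gives to MockOutputStream, and FixedBufferSink the role of arrow::io::FixedSizeBufferWriter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a sink that only counts bytes and stores nothing.
struct CountingSink {
  std::size_t position = 0;
  void Write(const void* /*src*/, std::size_t nbytes) { position += nbytes; }
};

// Hypothetical stand-in for a writer over preallocated, fixed-size memory.
struct FixedBufferSink {
  std::uint8_t* data;
  std::size_t capacity;
  std::size_t position = 0;

  FixedBufferSink(std::uint8_t* d, std::size_t c) : data(d), capacity(c) {}

  void Write(const void* src, std::size_t nbytes) {
    // Refuse any write that would run past the end of the buffer.
    if (nbytes > capacity - position) {
      throw std::out_of_range("Write out of bounds");
    }
    std::memcpy(data + position, src, nbytes);
    position += nbytes;
  }
};

// Toy stream writer (invented framing, NOT Arrow IPC): header, then each
// batch as length prefix + payload, then an end-of-stream marker.
template <typename Sink>
void WriteStream(const std::vector<std::string>& batches, Sink& sink) {
  const std::uint32_t header = 0xA770;  // placeholder for schema metadata
  sink.Write(&header, sizeof(header));
  for (const std::string& batch : batches) {
    const std::uint32_t len = static_cast<std::uint32_t>(batch.size());
    sink.Write(&len, sizeof(len));
    sink.Write(batch.data(), batch.size());
  }
  const std::uint32_t end_marker = 0;
  sink.Write(&end_marker, sizeof(end_marker));
}

// Pass 1: run the same writer against the counting sink to get the total size.
std::size_t MeasureStream(const std::vector<std::string>& batches) {
  CountingSink sink;
  WriteStream(batches, sink);
  return sink.position;
}

// Pass 2: allocate exactly the measured size and write the real stream.
// In the flow described in the thread, this allocation would be the buffer
// obtained from Plasma rather than a std::vector.
std::vector<std::uint8_t> SerializeExact(const std::vector<std::string>& batches) {
  const std::size_t total = MeasureStream(batches);
  std::vector<std::uint8_t> buffer(total);
  FixedBufferSink sink(buffer.data(), buffer.size());
  WriteStream(batches, sink);
  assert(sink.position == total);  // both passes must agree byte for byte
  return buffer;
}
```

The point of the sketch is that the measured size covers everything the writer emits, framing included, not only the batch payloads. A buffer sized from the payload alone makes FixedBufferSink throw on the first write past the end. Whether that is what produced the "Write out of bounds" error in the linked code is not established by the thread.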
