Thanks Wes!

On Mon, May 20, 2019 at 9:46 PM Wes McKinney <[email protected]> wrote:

> hi Miki,
>
> In
>
> https://github.com/353solutions/carrow/blob/plasma/_misc/plasma.cc#L47
>
> GetRecordBatchSize does not represent the entire size of the stream
> including schema. If you are serializing Schema separate from
> RecordBatch then you need to use the lower level
> arrow::ipc::ReadRecordBatch/WriteRecordBatch functions. Have a look at
> the unit tests.
>
> If you are going to use RecordBatchStreamWriter then you need to
> compute the size using MockOutputStream per my original e-mail.
>
> - Wes
>
> On Mon, May 20, 2019 at 12:50 PM Miki Tebeka <[email protected]>
> wrote:
> >>
> >> That link didn't work for me.
> >
> > Doh! I moved it to
> > https://github.com/353solutions/carrow/blob/plasma/_misc/plasma.cc
> >
> >>
> >> Would it not be better to do this work in Apache Arrow rather than an
> >> external project? I would guess the community would be interested in
> >> this.
> >
> > I do plan to suggest this as a patch to arrow once the code is usable;
> > currently it's just noise.
> >
> > The idea behind carrow is to use the underlying C++ both in Python & Go
> > so that in the same process we can simply share pointers (and maybe
> > later use a shared memory allocator to do it between processes). I don't
> > see a clear path to do it with the current Go implementation since it
> > uses the Go runtime to allocate memory, and carrow has a complicated
> > build process that currently won't work with a simple "go get".
> >
> > To get an initial usable Go<->Python IPC quickly, I'm trying to utilize
> > plasma for now. However, in the long run I'd like to just share pointers
> > with no serialization at all.
> >
> > I'd love to discuss how we can make this project usable and get the
> > community's help in solving some "ease of build" issues later on. Would
> > love to have it in the main arrow eventually.
>
