Having not yet looked at the amount of work implementing plasma in Go is, you may just ignore me :) but I think implementing a shared memory Go allocator to be easier (as in less human hours to implement).
Another option could be to have a CGo package exposing a set of functions (compiled as a C shlib) that call into the Go based arrow package to do what you need. -s sent from my droid On Mon, Jul 8, 2019, 10:30 Clive Cox <[email protected]> wrote: > > Thanks for all the informative replies. > > In our case the Python and Go would be in separate processes. So for that > as I understand the conversation so far the options are: > > - Use of Plasma. This requires pending updates for the current Go > implementation? (happy to help here) > - IPC - but this will require sending the data over the wire? > > Thanks, > > Clive > > > > > > On Mon, 8 Jul 2019 at 09:05, Uwe L. Korn <[email protected]> wrote: > >> Hello all, >> >> I've been using the in-process sharing method for quite some time for the >> Python<->Java interaction and I really like the ease of doing it all in the >> same process. Especially as this avoids any memory-copy or shared memory >> handling. This is really useful for the case where you only want to call a >> single routine in another language. >> >> Thus I would really like to see this also implemented for Go (and Rust) >> so that one can build custom UDFs in it and use them from Python code. The >> pre-conditions for this are that we have IPC tests that verify that both >> libraries use the exact same memory layout and that we can pull out the >> memory pointer from the Go Arrow structures into the C++ memory structures >> and also keep a reference between both so that memory tracking doesn't >> deallocate the underlying memory. For that we have in Python the >> pyarrow.foreign_buffer >> https://github.com/apache/arrow/blob/1b798a317df719d32312ca2c3253a2e399e949b8/python/pyarrow/io.pxi#L1276-L1292 >> function. >> >> For the Go<->Python case, I would though recommend to solve this as a >> Go<->C++ interface as this would make interaction for all the libraries >> based on the C++ one (like R, Ruby, ..) possible. >> >> Uwe >> >> On Mon, Jul 8, 2019, at 9:57 AM, Miki Tebeka wrote: >> >> My bad, IPC in Go seems to be implemented - >> https://issues.apache.org/jira/browse/ARROW-3679 >> >> On Mon, Jul 8, 2019 at 10:18 AM Sebastien Binet <[email protected]> >> wrote: >> >> As far as i know, Go does support IPC (as in the arrow IPC format) >> >> Another option which has been discussed at some point was to have a >> shared memory allocator so the arrow arrays could be shared between >> processes. >> >> I haven't looked in details what implementing plasma support for Go would >> need on the Go side... >> >> -s >> >> >> sent from my droid >> >> On Mon, Jul 8, 2019, 08:29 Miki Tebeka <[email protected]> wrote: >> >> Hi Clive, >> >> I'd like to understand the high level design for a system where a Go >> process can communicate an Arrow data structure to a python process on the >> same CPU >> >> I see two options >> - Different processes with hared memory, probably using plasma >> - Same process. The either Go uses Python shared library or Python using >> Go compiled to shared library (-build-mode=c-shared) >> >> >> - and for the python process to zero-copy gain access to that data, >> change it and inform the Go process. This is low latency so I don't want >> to save to file. >> >> IIRC arrow is not built for mutation. You build an Array/Table once and >> then use it. >> >> Would this need the use of Plasma as a zero-copy store for the data >> between the two processes or do I need to use IPC? But with IPC you are >> transferring the data which is not needed in this case as I understand it. >> Any pointers to examples would be appreciated. >> >> See above about options. Note that currently the Go arrow implementation >> doesn't support IPC or plasma (though it's in the works). >> >> Yoni & I are working on another option which is using the C++ arrow >> library from Go. It does support plasma and since it uses the same >> underlying C++ library that Python does you'll be able to pass a pointer >> around without copying data. It's at very alpha-ish state but you're more >> than welcomed to give it a try - https://github.com/353solutions/carrow >> >> Happy hacking, >> Miki >> >> >> > > -- > > > <https://www.seldon.io> > Seldon Technologies Ltd, Rise London, 41 Luke Street, Shoreditch, EC2A 4DP > (map <https://goo.gl/maps/BbJgCdNso5Q2>). Registered in England & Wales, > No. 9188032. VAT GB 258424587. Privacy Policy > <https://www.seldon.io/privacy/>. >
