0.5 GB/second for local Flight transfer seems unexpectedly slow (one could expect 10x more), but perhaps the tuning of default parameters needs to be improved. David Li can probably elaborate on that.
I'll add that Unix sockets might not be the fastest anymore these days. It may be worth testing on TCP.
Regards

Antoine.

On 15/03/2023 at 22:23, Will Jones wrote:
Hello all,

First, a reminder that Plasma has been deprecated and will be removed in the 12.0.0 release of the C++, Python, and Java Arrow libraries. [1]

I know some used Plasma as a convenient way to share Arrow data between Python processes, so I pulled together a quick performance comparison against two supported alternatives: Flight over a Unix domain socket and the Python shared_memory module. [2]

The shared memory example performs comparably to Plasma, but I don't think it is accessible from other languages. The Flight test is slower than shared memory, but still fairly fast, and of course works across languages. I wrote a little more about the shared memory case in a Stack Overflow answer [3].

If you have migrated off of Plasma and want to share with other users what you moved to, please do so in this thread.

Best,

Will Jones

[1] https://github.com/apache/arrow/issues/33243
[2] https://github.com/wjones127/arrow-ipc-bench
[3] https://stackoverflow.com/a/75402621/2048858
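
For readers unfamiliar with the shared memory route mentioned above, here is a minimal sketch of the handoff using only the Python standard library. It is not the code from the benchmark or the Stack Overflow answer. The helper names `publish` and `attach` and the 8-byte length prefix are assumptions made for illustration. With Arrow, the payload would presumably be the bytes of an IPC stream, which is not shown here.

```python
from multiprocessing import shared_memory

HEADER = 8  # bytes reserved for the payload length (illustrative layout, not an Arrow convention)


def publish(payload: bytes) -> shared_memory.SharedMemory:
    """Copy payload into a new shared memory block and return its handle."""
    # Store the length first: the OS may round the block size up,
    # so the reader cannot rely on the block size alone.
    shm = shared_memory.SharedMemory(create=True, size=HEADER + len(payload))
    shm.buf[:HEADER] = len(payload).to_bytes(HEADER, "little")
    shm.buf[HEADER:HEADER + len(payload)] = payload
    return shm


def attach(name: str):
    """Attach to an existing block by name and return (handle, view of the payload)."""
    shm = shared_memory.SharedMemory(name=name)
    length = int.from_bytes(bytes(shm.buf[:HEADER]), "little")
    # Slicing a memoryview does not copy the underlying bytes.
    view = shm.buf[HEADER:HEADER + length]
    return shm, view
```

The producer passes `shm.name` to the consumer process by any convenient channel. The consumer must release its view before calling `close()`, and the producer calls `unlink()` once the block is no longer needed.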