I like this new direction, and I think it will actually be viable, unlike the Flight-UCX work that was attempted a couple of years ago. I think the hardcoded endpoints in Flight RPC are difficult for other projects (including Flight SQL!) to build on top of, and we would serve users better by

- standardizing the encoding of Arrow data across different transports like UCX (as with Ian's HTTP proposal),
- focusing on the part of Flight that can unify those transports as described here (rather than the whole "action"/"command" structure imposed on Flight users).

On Fri, Feb 2, 2024, at 18:22, Matt Topol wrote:
> Hey all,
>
> In my current work I've been experimenting and playing around with
> utilizing Arrow and non-cpu memory data. While the creation of the
> ArrowDeviceArray struct and the enhancements to the Arrow library Device
> abstractions were necessary, there is also a need to extend the
> communications specs we utilize, i.e. Flight.
>
> Currently there is no real way to utilize Arrow Flight with shared memory
> or with non-CPU memory (without an expensive Device -> Host copy first). To
> this end I've done a bunch of research and toying around and came up with a
> protocol to propose and a reference implementation using UCX[1]. Attached
> to the proposal is also a couple extensions for Flight itself to make it
> easier for users to still use Flight for metadata / dataset information and
> then point consumers elsewhere to actually retrieve the data. The idea here
> is that this would be a new specification for how to transport Arrow data
> across these high-performance transports such as UCX / libfabric / shared
> memory / etc. We wouldn't necessarily expose / directly add implementations
> of the spec to the Arrow libraries, just provide reference/example
> implementations.
>
> I've written the proposal up on a google doc[2] that everyone should be
> able to comment on. Once we get some community discussion on there, if
> everyone is okay with it I'd like eventually do a vote on adopting this
> spec and if we do, I'll then make a PR to start adding it to the Arrow
> documentation, etc.
>
> Anyways, thank you everyone in advance for your feedback and comments!
>
> --Matt
>
> [1]: https://github.com/openucx/ucx/
> [2]:
> https://docs.google.com/document/d/1zHbnyK1r6KHpMOtEdIg1EZKNzHx-MVgUMOzB87GuXyk/edit?usp=sharing