I like this new direction, and I think it'll be actually viable unlike the 
Flight-UCX work that was attempted a couple years ago. I think the hardcoded 
endpoints in Flight RPC are difficult for other projects (including Flight 
SQL!) to build on top of, and we would serve users better by 
- standardizing the encoding of Arrow data across different transports like UCX 
(and as with Ian's HTTP proposal),
- focusing on the part of Flight that can unify those transports as described 
here (rather than the whole "action"/"command" structure imposed on Flight 
users).

On Fri, Feb 2, 2024, at 18:22, Matt Topol wrote:
> Hey all,
>
> In my current work I've been experimenting and playing around with
> utilizing Arrow and non-cpu memory data. While the creation of the
> ArrowDeviceArray struct and the enhancements to the Arrow library Device
> abstractions were necessary, there is also a need to extend the
> communications specs we utilize, i.e. Flight.
>
> Currently there is no real way to utilize Arrow Flight with shared memory
> or with non-CPU memory (without an expensive Device -> Host copy first). To
> this end I've done a bunch of research and toying around and came up with a
> protocol to propose and a reference implementation using UCX[1]. Attached
> to the proposal is also a couple extensions for Flight itself to make it
> easier for users to still use Flight for metadata / dataset information and
> then point consumers elsewhere to actually retrieve the data. The idea here
> is that this would be a new specification for how to transport Arrow data
> across these high-performance transports such as UCX / libfabric / shared
> memory / etc. We wouldn't necessarily expose / directly add implementations
> of the spec to the Arrow libraries, just provide reference/example
> implementations.
>
> I've written the proposal up on a google doc[2] that everyone should be
> able to comment on. Once we get some community discussion on there, if
> everyone is okay with it I'd like eventually do a vote on adopting this
> spec and if we do, I'll then make a PR to start adding it to the Arrow
> documentation, etc.
>
> Anyways, thank you everyone in advance for your feedback and comments!
>
> --Matt
>
> [1]: https://github.com/openucx/ucx/
> [2]:
> https://docs.google.com/document/d/1zHbnyK1r6KHpMOtEdIg1EZKNzHx-MVgUMOzB87GuXyk/edit?usp=sharing

Reply via email to