Several recent discussions have highlighted the lack of an established specification / protocol for sending Arrow-formatted data through REST APIs. I would like to start a discussion here to gauge interest and gather ideas about this.

For background: Flight RPC provides a framework for building RPC APIs that exchange Arrow-formatted data. The two salient facts about Flight RPC are:
(1) It uses the Arrow format as its data serialization format.
(2) It is an RPC framework, built on gRPC, with HTTP/2 as the transfer protocol.

Both of these design choices were made to optimize performance. But over time, we've seen that much of the performance benefit can be achieved with (1) alone. We've seen examples of REST services that exchange Arrow-formatted data with HTTP/1.1 as the transfer protocol and manage to achieve very good performance. Often this is the most viable approach, particularly in the case where there's a requirement to build on top of an existing REST API instead of building a new RPC API.

But since there is no standard protocol for implementing exchange of Arrow-formatted data in REST services, we see different REST APIs implementing this in different ways. The implementations are bespoke and incompatible, they might be designed sub-optimally, and developer time is wasted writing custom code in different languages / libraries to exchange Arrow-formatted data across these different REST APIs.

I think it would make sense for the Arrow project to establish a standard protocol for this. I believe this would accelerate adoption of Arrow as a format for exchanging data across REST APIs. It would increase convenience and compatibility and reduce implementation complexity.

Compared to Flight RPC or Flight SQL, this protocol would be much smaller in scope. It could consist only of a specification for how to implement support for exchanging Arrow-formatted data in an existing REST API. Services that implemented this would wrap it in their own REST APIs. This protocol specification would be concerned only with the subset of the API that involved sending/receiving Arrow-formatted data.

Input appreciated from anyone in the community who might be interested in using this or contributing to this.

Ian