Hey Steve -

You could look at Flight SQL, which uses this pattern: 
https://arrow.apache.org/docs/format/FlightSql.html#sequence-diagrams
That way, a system can execute the query on multiple machines and let the 
client fetch data directly from all of those machines, instead of having to 
proxy the data through a central endpoint. Dremio implements (and originally 
contributed) Flight SQL in their product.
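
To make the pattern concrete, here is a minimal pure-Python sketch. It is not 
real Flight code: the Endpoint class, the WORKERS table, and the functions 
get_flight_info, do_get, and fetch_all are all hypothetical stand-ins for the 
corresponding RPCs, and the "cluster" is just an in-memory dict. It only shows 
the shape of the exchange: one planning call returns (ticket, location) pairs, 
and the client then pulls each partition from the worker that holds it.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical stand-in for a Flight endpoint: a ticket plus a location."""
    ticket: bytes   # opaque token, only meaningful to the server side
    location: str   # which worker holds this partition of the result


# Assumed in-memory "cluster": worker location -> {ticket: rows}.
WORKERS: Dict[str, Dict[bytes, List[int]]] = {
    "worker-a": {b"q1-part0": [1, 2, 3]},
    "worker-b": {b"q1-part1": [4, 5]},
}


def get_flight_info(command: str) -> List[Endpoint]:
    """Coordinator step: plan the query and report where the results live.

    No result data flows through this call, only tickets and locations.
    """
    return [
        Endpoint(ticket, location)
        for location, parts in sorted(WORKERS.items())
        for ticket in sorted(parts)
    ]


def do_get(endpoint: Endpoint) -> List[int]:
    """Worker step: redeem a ticket at the worker named by the endpoint."""
    return WORKERS[endpoint.location][endpoint.ticket]


def fetch_all(command: str) -> List[int]:
    """Client: one planning call, then parallel fetches from each worker."""
    endpoints = get_flight_info(command)
    with ThreadPoolExecutor() as pool:
        # pool.map returns results in the same order as the endpoints.
        parts = list(pool.map(do_get, endpoints))
    return [row for part in parts for row in part]


rows = fetch_all("SELECT * FROM t")
```

In this sketch the coordinator never touches the rows themselves, which is the 
point of keeping the planning step separate from the fetch step.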

The BigQuery Storage API, while not Flight based, is also gRPC/Arrow-based and 
does something similar (creating a ReadSession returns multiple ReadStreams, 
which the client then fetches individually): 
https://github.com/googleapis/googleapis/blob/master/google/cloud/bigquery/storage/v1/stream.proto

There is nothing stopping you from sending a ticket directly if that fits the 
needs of your own service, although it wouldn't fit the recommended pattern. 
There is also a proposal to let data be directly inlined into FlightInfo to get 
the 'best of both worlds': https://github.com/apache/arrow/pull/12571

-David

On Sat, Oct 1, 2022, at 18:28, Steve Kim wrote:
> I am exploring how to implement an API with Arrow Flight, and I am having 
> difficulty understanding why GetFlightInfo is a separate step from DoGet. I 
> imagine a Ticket to be a serialized application-specific request that is 
> understood by the server(s), and I don't understand why the client should 
> have to send a pre-flight command in a FlightDescriptor to obtain tickets, 
> instead of directly sending tickets to the server(s). Can someone share a 
> real-world, non-contrived example of a Flight service that relies on this 
> feature of the protocol to achieve performance, scalability, etc. goals?
> 
> Thanks,
> Steve
