Hi, I would like to propose adding support for long-running queries to Apache Arrow Flight. If anyone has comments for this proposal, please share them at here or the issue for this proposal: https://github.com/apache/arrow/issues/36155
Implementation in C++/Go/Java: https://github.com/apache/arrow/pull/36946 Documentations: http://crossbow.voltrondata.com/pr_docs/36946/format/Flight.html#downloading-data-by-running-a-heavy-query This is one of proposals in "[DISCUSS] Flight RPC/Flight SQL/ADBC enhancements": https://lists.apache.org/thread/247z3t06mf132nocngc1jkp3oqglz7jp See also the "Flight RPC: Long-Running Queries" section in the design document for the proposals: https://docs.google.com/document/d/1jhPyPZSOo2iy0LqIJVUs9KWPyFULVFJXTILDfkadx2g/edit# Changes since the original proposal: * Removed RetryInfo.cancel_descriptor because we can use the existing CancelFlightInfo action. * Renamed RetryInfo.retry_descriptor to RetryInfo.flight_descriptor because we have only one descriptor by removing RetryInfo.cancel_descriptor. Background: Queries generally don't complete instantly (as much as we would like them to). So where can we put the 'query evaluation time'? * In GetFlightInfo: block and wait for the query to complete. * Con: this is a long-running blocking call, which may fail or time out. Then when the client retries, the server has to redo all the work. * Con: parts of the result may be ready before others, but the client can't do anything until everything is ready. * In DoGet: return a fixed number of partitions * Con: this makes handling worker failures hard. Systems like Trino support fault-tolerant execution by replacing workers at runtime. But GetFlightInfo has already passed, so we can't notify the client of new workers2. * Con: we have to know or fix the partitioning up front. Neither solution is optimal. Proposal: Add PollFlightInfo as a retryable version of GetFlightInfo. Clients can poll the current query status and start reading the currently available results so far before the query is completed. See documentation for details: http://crossbow.voltrondata.com/pr_docs/36946/format/Flight.html#downloading-data-by-running-a-heavy-query Implementation: https://github.com/apache/arrow/pull/36946 is an implementation of this proposal. The pull request has the followings: 1. Format changes: * format/Flight.proto https://github.com/apache/arrow/pull/36946/files#diff-53b6c132dcc789483c879f667a1c675792b77aae9a056b257d6b20287bb09dba 2. Documentation changes: docs/source/format/Flight.rst https://github.com/apache/arrow/pull/36946/files#diff-839518fb41e923de682e8587f0b6fdb00eb8f3361d360c2f7249284a136a7d89 3. The C++ implementation and an integration test: * cpp/src/arrow/flight/ 4. The Go implementation and an integration test: * go/arrow/flight/server.go * go/arrow/internal/flight_integration/scenario.go 5. The Java implementation and an integration test: * java/flight/flight-core/ * java/flight/flight-integration-tests/ Next: I'll start a vote for this proposal after we reach a consensus on this proposal. It's the standard process for format change. See also: https://arrow.apache.org/docs/dev/format/Changing.html Thanks, -- kou