+1

Thanks Kou!

On Fri, Aug 11, 2023, at 06:03, Andrew Lamb wrote:
> +1
>
> On Thu, Aug 10, 2023 at 10:11 PM Sutou Kouhei <k...@clear-code.com> wrote:
>
>> +1
>>
>> In <20230811.105721.969618469307878987....@clear-code.com>
>>   "[VOTE][Format][Flight] Long-running queries support" on Fri, 11 Aug 2023 10:57:21 +0900 (JST),
>>   Sutou Kouhei <k...@clear-code.com> wrote:
>>
>> > Hi,
>> >
>> > I would like to propose long-running queries support for
>> > Flight RPC.
>> >
>> > See the following pull request and discussion for details:
>> >
>> > * GH-36155: [C++][Go][Java][FlightRPC] Add support for long-running queries
>> >   https://github.com/apache/arrow/pull/36946
>> >
>> > * [DISCUSS][Format][Flight] Long-running queries support
>> >   https://lists.apache.org/thread/qcjpcw6m3p15wqxp6n6rqzlx01v1fl3v
>> >
>> > This is based on one of the following proposals:
>> >
>> >   [DISCUSS] Flight RPC/Flight SQL/ADBC enhancements
>> >   https://lists.apache.org/thread/247z3t06mf132nocngc1jkp3oqglz7jp
>> >
>> >   Google Docs: (Arrow ML) Arrow Flight RPC/Flight SQL Proposals
>> >   https://docs.google.com/document/d/1jhPyPZSOo2iy0LqIJVUs9KWPyFULVFJXTILDfkadx2g/edit#heading=h.anpr1q5slm1v
>> >
>> > Summary:
>> >
>> > * Background: Queries generally don't complete instantly (as
>> >   much as we would like them to). So where can we put the
>> >   'query evaluation time'?
>> >
>> >   * In GetFlightInfo: block and wait for the query to complete.
>> >     * Con: this is a long-running blocking call, which may
>> >       fail or time out. Then when the client retries, the
>> >       server has to redo all the work.
>> >     * Con: parts of the result may be ready before others, but
>> >       the client can't do anything until everything is ready.
>> >
>> >   * In DoGet: return a fixed number of partitions
>> >     * Con: this makes handling worker failures hard. Systems
>> >       like Trino support fault-tolerant execution by replacing
>> >       workers at runtime. But GetFlightInfo has already
>> >       passed, so we can't notify the client of new workers.
>> >     * Con: we have to know or fix the partitioning up front.
>> >
>> >   Neither solution is optimal.
>> >
>> > * Proposal: Add PollFlightInfo as a pollable version of
>> >   GetFlightInfo. Clients can poll the current query status
>> >   and start reading the currently available results so far
>> >   before the query is completed.
>> >
>> > * Changes:
>> >
>> >   * Add PollFlightInfo and PollInfo
>> >
>> >     Flight.proto:
>> >       https://github.com/apache/arrow/pull/36946/files#diff-53b6c132dcc789483c879f667a1c675792b77aae9a056b257d6b20287bb09dba
>> >     Documentation:
>> >       http://crossbow.voltrondata.com/pr_docs/36946/format/Flight.html#downloading-data-by-running-a-heavy-query
>> >
>> > * The pull request includes reference implementations for
>> >   C++, Go and Java.
>> >
>> >
>> > The vote will be open for at least 72 hours.
>> >
>> > [ ] +1 Accept this proposal
>> > [ ] +0
>> > [ ] -1 Do not accept this proposal because...
>> >
>> >
>> > Thanks,
>> > --
>> > kou
>>
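
For readers skimming the thread, the polling flow described in the proposal could look roughly like the sketch below. This is only an illustration in plain Python, not the Arrow API: PollInfo here is a hypothetical stand-in for the proposed message, and FakeServer, poll_flight_info, run_query, and the field names endpoints, next_descriptor and progress are all made up for this example. It also assumes the endpoint list is cumulative between polls and that a missing next descriptor means the query has finished; check the linked Flight.proto for the actual definitions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PollInfo:
    """Hypothetical stand-in for the proposed PollInfo message."""
    endpoints: List[str]            # results that are ready to read so far
    next_descriptor: Optional[str]  # token for the next poll; None once complete
    progress: float                 # fraction of the query finished, 0.0 to 1.0


class FakeServer:
    """Toy in-memory server; a real one would run the query in the background."""

    def __init__(self, partitions):
        self._partitions = list(partitions)
        self._ready = 0

    def poll_flight_info(self, descriptor):
        # Each poll pretends that one more partition has finished.
        self._ready = min(self._ready + 1, len(self._partitions))
        done = self._ready == len(self._partitions)
        return PollInfo(
            endpoints=self._partitions[: self._ready],
            next_descriptor=None if done else descriptor,
            progress=self._ready / len(self._partitions),
        )


def run_query(server, descriptor):
    """Poll until the query completes, reading each endpoint as it appears."""
    consumed = []
    while descriptor is not None:
        info = server.poll_flight_info(descriptor)
        # The endpoint list is assumed cumulative, so skip what was already read.
        for endpoint in info.endpoints[len(consumed):]:
            consumed.append(endpoint)  # a real client would call DoGet here
        descriptor = info.next_descriptor
    return consumed


result = run_query(FakeServer(["part-0", "part-1", "part-2"]), "query-1")
print(result)
```

The point of the loop is that the client never blocks on the whole query: it reads whatever is available after each poll, which is the behaviour the GetFlightInfo-only and DoGet-only options above cannot offer.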