+1

Thanks Kou! 

On Fri, Aug 11, 2023, at 06:03, Andrew Lamb wrote:
> +1
>
> On Thu, Aug 10, 2023 at 10:11 PM Sutou Kouhei <k...@clear-code.com> wrote:
>
>> +1
>>
>> In <20230811.105721.969618469307878987....@clear-code.com>
>>   "[VOTE][Format][Flight] Long-running queries support" on Fri, 11 Aug 2023 10:57:21 +0900 (JST),
>>   Sutou Kouhei <k...@clear-code.com> wrote:
>>
>> > Hi,
>> >
>> > I would like to propose long-running queries support for
>> > Flight RPC.
>> >
>> > See the following pull request and discussion for details:
>> >
>> > * GH-36155: [C++][Go][Java][FlightRPC] Add support for long-running queries
>> >   https://github.com/apache/arrow/pull/36946
>> >
>> > * [DISCUSS][Format][Flight] Long-running queries support
>> >   https://lists.apache.org/thread/qcjpcw6m3p15wqxp6n6rqzlx01v1fl3v
>> >
>> > This is based on one of the following proposals:
>> >
>> >   [DISCUSS] Flight RPC/Flight SQL/ADBC enhancements
>> >   https://lists.apache.org/thread/247z3t06mf132nocngc1jkp3oqglz7jp
>> >
>> >   Google Docs: (Arrow ML) Arrow Flight RPC/Flight SQL Proposals
>> >   https://docs.google.com/document/d/1jhPyPZSOo2iy0LqIJVUs9KWPyFULVFJXTILDfkadx2g/edit#heading=h.anpr1q5slm1v
>> >
>> > Summary:
>> >
>> > * Background: Queries generally don't complete instantly (as
>> >   much as we would like them to). So where can we put the
>> >   'query evaluation time'?
>> >
>> >   * In GetFlightInfo: block and wait for the query to complete.
>> >     * Con: this is a long-running blocking call, which may
>> >       fail or time out. Then when the client retries, the
>> >       server has to redo all the work.
>> >     * Con: parts of the result may be ready before others, but
>> >       the client can't do anything until everything is ready.
>> >
>> >   * In DoGet: return a fixed number of partitions
>> >     * Con: this makes handling worker failures hard. Systems
>> >       like Trino support fault-tolerant execution by replacing
>> >       workers at runtime. But GetFlightInfo has already
>> >       passed, so we can't notify the client of new workers.
>> >     * Con: we have to know or fix the partitioning up front.
>> >
>> >   Neither solution is optimal.
>> >
>> > * Proposal: Add PollFlightInfo as a pollable version of
>> >   GetFlightInfo. Clients can poll the current query status
>> >   and start reading the results available so far before
>> >   the query completes.
>> >
>> > * Changes:
>> >
>> >   * Add PollFlightInfo and PollInfo
>> >
>> >     Flight.proto:
>> >     https://github.com/apache/arrow/pull/36946/files#diff-53b6c132dcc789483c879f667a1c675792b77aae9a056b257d6b20287bb09dba
>> >     Documentation:
>> >     http://crossbow.voltrondata.com/pr_docs/36946/format/Flight.html#downloading-data-by-running-a-heavy-query
>> >
>> >   * The pull request includes reference implementations for
>> >     C++, Go and Java.
>> >
>> >
>> > The vote will be open for at least 72 hours.
>> >
>> > [ ] +1 Accept this proposal
>> > [ ] +0
>> > [ ] -1 Do not accept this proposal because...
>> >
>> >
>> > Thanks,
>> > --
>> > kou
>>
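
For readers skimming the thread, the polling flow described in the quoted
summary can be sketched roughly as below. This is a toy, in-memory
simulation only: ToyServer, PollResult, next_token and run_query are
hypothetical names made up for illustration, and none of this is the real
Flight API or the actual PollInfo message (see Flight.proto in the pull
request for that). The assumption is simply that each poll returns the
endpoints ready so far plus something to poll with next, and that the
"something" is absent once the query has finished.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PollResult:
    """Hypothetical stand-in for a poll response (not the real PollInfo)."""
    endpoints: List[str]       # every endpoint that is ready so far
    next_token: Optional[str]  # None once the query has completed


class ToyServer:
    """In-memory fake: each poll makes one more partition available."""

    def __init__(self, partitions: List[str]) -> None:
        self._partitions = partitions
        self._ready = 0

    def poll(self, token: Optional[str]) -> PollResult:
        # Simulate query progress: one more partition finishes per poll.
        self._ready = min(self._ready + 1, len(self._partitions))
        done = self._ready == len(self._partitions)
        return PollResult(
            endpoints=self._partitions[: self._ready],
            next_token=None if done else f"poll-{self._ready}",
        )


def run_query(server: ToyServer) -> List[str]:
    """Poll until completion, consuming each new endpoint as it appears."""
    read_order: List[str] = []
    seen = 0
    token: Optional[str] = "initial"
    while token is not None:
        result = server.poll(token)
        # Only the endpoints added since the previous poll are new work.
        for endpoint in result.endpoints[seen:]:
            read_order.append(endpoint)  # a real client would fetch data here
        seen = len(result.endpoints)
        token = result.next_token
    return read_order
```

The point of the sketch is that the client begins consuming the first
partition after the first poll instead of blocking until the whole query
is done, which is the difference from a blocking GetFlightInfo call.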
