I think this would be a useful feature and be nice to have in Flight core.
For cases like previewing data, you usually just want to get a small amount
of data quickly. Would it make sense to make this part of DoGet since it
still would be returning a record batch? Perhaps a Ticket could be made to
have an optional FlightDescriptor that would serve as an all-in-one shot?

On Tue, Mar 1, 2022 at 8:44 AM David Li <lidav...@apache.org> wrote:

> I agree with something along Antoine's proposal, though: maybe we should
> be more structured with the flags (akin to what Micah mentioned with the
> Feature enum).
>
> Also, the flag could be embedded into the Flight SQL messages instead. (So
> in effect, Flight would only add the capability to return data with
> FlightInfo, and it's up to applications, like Flight SQL, to decide how
> they want to take advantage of that.)
>
> I think having a completely separate method and return type and having to
> poll for it beforehand somewhat defeats the purpose of having it/would be
> much harder of a transition.
>
> Also: it should be `repeated FlightInfo inline_data` right? In case we
> also need dictionary batches?
>
> On Tue, Mar 1, 2022, at 11:39, Antoine Pitrou wrote:
> > Can we just add the following field to the FlightDescriptor message:
> >
> >   bool accept_inline_data = 4;
> >
> > and this one to the FlightInfo message:
> >
> >   FlightData inline_data = 100;
> >
> > Then new clients can `accept_inline_data` to true (the default being
> > false if omitted) to signal servers that they can put the data if
> > `inline_data` if deemed small enough.
> >
> > (the `accept_inline_data` field could also be used to the Criteria
> > message)
> >
> >
> > Alternatively, if the FlightDescriptor expansion looks a bit dirty
> > (FlightDescriptor being used in other contexts where
> > `accept_inline_data` makes no sense), we can instead define a new
> > method:
> >
> >   rpc GetFlightInfoEx(GetFlightInfoRequest) returns (FlightInfo) {}
> >
> > with:
> >
> > message GetFlightInfoRequest {
> >   FlightDescriptor flight_descriptor = 1;
> >   bool accept_inline_data = 2;
> > }
> >
> > Regards
> >
> > Antoine.
> >
> >
> > On Mon, 28 Feb 2022 11:29:12 -0800
> > James Duong <jam...@bitquilltech.com.INVALID> wrote:
> >> This seems reasonable, however we need to account for existing Flight
> >> clients that were written before this.
> >>
> >> It seems like the server will need to still handle the ticket returned
> for
> >> getStream() for clients that are unaware of the small result
> optimization.
> >>
> >> On Mon, Feb 28, 2022 at 11:26 AM David Li <lidav...@apache.org> wrote:
> >>
> >> > Ah, that makes more sense, that would be a reasonable extension to
> Flight
> >> > overall. (While we're at it, I think it would help to have an
> app_metadata
> >> > field in FlightInfo as well.)
> >> >
> >> > On Mon, Feb 28, 2022, at 14:24, Micah Kornfield wrote:
> >> > >>
> >> > >> But it seems reasonable to add a one-shot query path using DoGet.
> >> > >
> >> > >
> >> > > I was thinking more of adding a bytes field to FlightInfo that
> could
> >> > store
> >> > > arrow data.  That way GetFlightInfo would be the only RPC necessary
> for
> >> > > small results when executing a CMD.  The client doesn't necessarily
> know
> >> > > whether a query will return large or small results.
> >> > >
> >> > > On Mon, Feb 28, 2022 at 11:04 AM David Li <lidav...@apache.org>
> wrote:
> >> > >
> >> > >> I think the focus was on large result sets (though I don't recall
> this
> >> > >> being discussed before) and supporting multi-node setups (hence
> >> > >> GetFlightInfo/DoGet are separated). But it seems reasonable to add
> a
> >> > >> one-shot query path using DoGet.
> >> > >>
> >> > >> On Mon, Feb 28, 2022, at 13:32, Adam Lippai wrote:
> >> > >> > I saw the same. A small, stateless query ability would be nice
> >> > >> (connection
> >> > >> > open, initialization, query in one message, the resultset in
> the
> >> > response
> >> > >> > in one message)
> >> > >> >
> >> > >> > On Mon, Feb 28, 2022, 13:12 Micah Kornfield <
> emkornfi...@gmail.com>
> >> > >> wrote:
> >> > >> >
> >> > >> >> I'm rereviewing the Flight SQL interfaces, and I'm not sure if
> I'm
> >> > >> missing
> >> > >> >> it but is there any optimization for small results?  My concern
> is
> >> > that
> >> > >> the
> >> > >> >> overhead of the RPCs for the DoGet after executing the query
> could
> >> > add
> >> > >> >> non-trivial latency for smaller results.
> >> > >> >>
> >> > >> >> Has anybody else thought about this/investigated it?  Am I
> >> > understanding
> >> > >> >> this correctly?
> >> > >> >>
> >> > >> >> Thanks,
> >> > >> >> Micah
> >> > >> >>
> >> > >>
> >> >
> >>
> >>
>

Reply via email to