To touch on the question about supported features -- is it possible to
advertise arbitrary/custom "capabilities" in GetSqlInfo?
Say you want to represent some set of behaviors that a FlightSQL
service can support.

Things like "Supports grouping by multiple distinct aggregates", "Supports
self-joins on aliased tables", etc.
These will be unique to each implementation, but I couldn't determine
whether there is a way to express arbitrary capabilities.
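
As a rough illustration of the idea (a sketch only, not the Flight SQL API): suppose a server could return extra info codes alongside the standard GetSqlInfo ones, each carrying a boolean. The enum, the numeric codes, and the helper names below are all hypothetical and exist only to make the question concrete.

```python
from enum import IntEnum


class CustomCapability(IntEnum):
    """Hypothetical capability codes; the values are made up for illustration."""
    GROUP_BY_MULTIPLE_DISTINCT_AGGREGATES = 10_000
    SELF_JOIN_ON_ALIASED_TABLES = 10_001


def advertise(supported):
    """Server side: build an {info code: bool} mapping covering every custom code."""
    return {int(cap): cap in supported for cap in CustomCapability}


def supports(info, cap):
    """Client side: treat a code the server did not report as unsupported."""
    return bool(info.get(int(cap), False))


# A server that only supports self-joins on aliased tables.
info = advertise({CustomCapability.SELF_JOIN_ON_ALIASED_TABLES})
print(supports(info, CustomCapability.SELF_JOIN_ON_ALIASED_TABLES))
print(supports(info, CustomCapability.GROUP_BY_MULTIPLE_DISTINCT_AGGREGATES))
```

Treating a missing code as "unsupported" on the client side means older servers that know nothing about these codes would still be handled safely.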

Also, in case it's helpful, I put together an ASCII diagram of what I'm
trying to do with FlightSQL.
If anyone has a moment, I would appreciate input on whether it's feasible
and a good idea.

https://pastebin.com/raw/VF2r0F3f

Thank you =)


On Fri, Mar 4, 2022 at 2:37 PM David Li <lidav...@apache.org> wrote:

> We could also add, say, CommandSubstraitQuery as a distinct message, and
> older servers would just reject it as an unknown request type.
>
> -David
>
> On Fri, Mar 4, 2022, at 17:01, Micah Kornfield wrote:
> >>
> >> 1. How does a server report that it supports each command type? Initial
> >> thought is a property in GetSqlInfo.
> >
> >
> > This sounds reasonable.
> >
> >
> >> What happens to client code written prior to changing the command type
> >> to be a oneOf field? Same for servers.
> >
> >
> > It is transparent for older clients (I'm 99% sure the wire protocol
> > doesn't change).  Servers are a little harder.  The one saving grace is
> > that I don't think an empty/not-present SQL string is something most
> > servers could handle, so they would probably return an error that,
> > while not obvious, would give the client a clue (hopefully this would
> > be a non-issue, because clients wishing to use this feature would
> > check the capabilities first).
> >
> > -Micah
> >
> > On Fri, Mar 4, 2022 at 1:50 PM James Duong <jam...@bitquilltech.com.invalid>
> > wrote:
> >
> >> It sounds like an interesting and useful project to use Substrait as an
> >> alternative to SQL strings.
> >>
> >> Important aspects to spec out are:
> >> 1. How does a server report that it supports each command type? Initial
> >> thought is a property in GetSqlInfo.
> >> 2. What happens to client code written prior to changing the command
> >> type to be a oneOf field? Same for servers.
> >> More generally, how should backward compatibility work, and what should
> >> happen if a client sends an unsupported command type to a server?
> >> 3. Should inputs to catalog RPC calls also accept Substrait structures?
> >>
> >> On Thu, Mar 3, 2022 at 11:00 PM Gavin Ray <ray.gavi...@gmail.com> wrote:
> >>
> >> > @James Duong <jam...@bitquilltech.com>
> >> >
> >> > You are absolutely right, I realized this and confirmed whether this
> >> > would be possible with Jacques to double-check.
> >> > It would amount to what I might call "dollar-store Substrait." It's
> >> > not elegant or a good solution, but definitely presents a good
> >> > duct-tape hack and is a crafty idea.
> >> >
> >> > I agree with Jacques -- when you think about FlightSQL, what you are
> >> > attempting with a query isn't necessarily SQL, but a general
> >> > data-compute operation.
> >> > SQL just so happens to be a fairly universal way to express them,
> >> > with an ANSI standard, but FlightSQL doesn't recognize any particular
> >> > subset of it and for all intents and purposes it doesn't matter what
> >> > the operation string contains.
> >> >
> >> > Substrait would make a fantastic logical next-feature because it's
> >> > targeted as a specification for expressing relational algebra and
> >> > data-compute operations.
> >> > This more-or-less equates to SQL strings (in my mind at least) with a
> >> > much better toolkit and Dev UX. If there is anything I can do to help
> >> > move this forward, please let me know because I am extremely
> >> > motivated to do so.
> >> >
> >> > @David Li <git...@lidavidm.me>
> >> >
> >> > Also agreed. Substrait is put together by folks much smarter than
> >> > myself, and if I had to hedge my bets, I'd put money on it being the
> >> > future of data-compute interop.
> >> > I would love nothing more than to adopt this technology and push it
> >> > along.
> >> >
> >> >> Your project does sound interesting - basically, it sounds like a
> >> >> tabular data storage service with query pushdown?
> >> >>
> >> >
> >> > Yeah this is more or less the details of it (my personal email, with
> >> > discretion assumed, is always open)
> >> >
> >> > Imagine an environment where a backend wants to advertise some kind of
> >> > schema/data catalog
> >> >
> >> > And then a central service introspects these backends, and dynamically
> >> > generates an API from the data catalogues/schemas, where requests get
> >> > proxied to the underlying backend service for each schema to actually
> >> > be executed
> >> >
> >> > In text, the flow would look something like:
> >> >
> >> >
> >> >                                                    <----> Data Provider Backend 0
> >> > Client <-----> Central Service <---> Generated API <----> Data-Provider Backend 1
> >> >                                                    <----> Data Provider Backend 2
> >> >
> >> >
> >> >
> >> > On Thu, Mar 3, 2022 at 5:52 PM David Li <lidav...@apache.org> wrote:
> >> >
> >> >> Gavin, thanks for sharing. I'm not so sure you'll find an alternative
> >> >> to Substrait, at least one that isn't even more nascent or one that's
> >> >> very tied to a particular language, so perhaps it might be better to
> >> >> get involved in Substrait and see if it suits your needs? Convincing a
> >> >> team to try something new can be hard, though, and it is somewhat of a
> >> >> moving target - but Flight SQL is in a similar spot, I think, as it's
> >> >> still getting enhancements.
> >> >>
> >> >> Your project does sound interesting - basically, it sounds like a
> >> >> tabular data storage service with query pushdown?
> >> >>
> >> >> On Thu, Mar 3, 2022, at 19:58, Jacques Nadeau wrote:
> >> >> > James, I agree that you could use JSON but that feels a bit hacky
> >> >> > (mis-use of the paradigm). Instead, I'd really like to do something
> >> >> > like David is suggesting: support Substrait as an alternative to a
> >> >> > SQL string. Something like this:
> >> >> >
> >> >> > https://github.com/jacques-n/arrow/commit/e22674fa882e77c2889cf95f69f6e3701db362bc
> >> >> >
> >> >> > It would be great if someone wanted to pick this up. It would be a
> >> >> > nice enhancement to FlightSQL (and provide a structured way to
> >> >> > express operations).
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Thu, Mar 3, 2022 at 4:56 PM James Duong <jam...@bitquilltech.com.invalid>
> >> >> > wrote:
> >> >> >
> >> >> >> In the same way that you could write an ODBC driver that takes in
> >> >> >> text that's not SQL, you could write a Flight SQL server that takes
> >> >> >> in text that's JSON.
> >> >> >> Flight SQL doesn't parse the query, so you could create commands
> >> >> >> that are just JSON text.
> >> >> >>
> >> >> >> Is that the only bit you need, Gavin?
> >> >> >>
> >> >> >> On Thu, Mar 3, 2022 at 4:26 PM Gavin Ray <ray.gavi...@gmail.com> wrote:
> >> >> >>
> >> >> >> > I am enthusiastic about Substrait and have followed its progress
> >> >> >> > eagerly =D
> >> >> >> >
> >> >> >> > When I presented it as a tentative option, there were
> >> >> >> > reservations because of the project/spec being young and the
> >> >> >> > functionality still being fleshed out.
> >> >> >> > I think if I were having this conversation in, say, 8-16 months,
> >> >> >> > it would have been an easy choice, no doubt.
> >> >> >> >
> >> >> >> > On a public mailing list (and I can share more details in
> >> >> >> > private if you're curious), the gist of it is this:
> >> >> >> >
> >> >> >> > Some well-defined/backed-by-mature tech solution for expressing
> >> >> >> > data compute operations between services would be a useful thing
> >> >> >> > to have (especially if it's language-agnostic)
> >> >> >> >
> >> >> >> > The goal is for an "implementing service" to have:
> >> >> >> > - An introspectable schema (IE, "describe yourself to me")
> >> >> >> > - A query/operation execution endpoint (IE: "perform this
> >> >> >> >   operation on your data")
> >> >> >> >
> >> >> >> > With FlightSQL this is possible I believe, but it requires the
> >> >> >> > operation to be expressed as a SQL string which isn't ideal.
> >> >> >> >
> >> >> >> > Working with some programmatic, structured object that has the
> >> >> >> > same semantics ("Logical Plan", or whatnot) as a SQL query would
> >> >> >> > have, would be a better experience
> >> >> >> > (Jacques is on to something here!)
> >> >> >> >
> >> >> >> > This interface between services would be somewhat the
> >> >> >> > equivalent of an "SDK", so it would be nice to have a
> >> >> >> > strongly-typed library for expressing and building-up
> >> >> >> > query/data-compute ops.
> >> >> >> >
> >> >> >> >
> >> >> >> > On Thu, Mar 3, 2022 at 3:17 PM David Li <lidav...@apache.org> wrote:
> >> >> >> >
> >> >> >> > > You probably want Substrait: https://substrait.io/
> >> >> >> > >
> >> >> >> > > Which is being worked on by several people, including Arrow
> >> >> >> > > community members.
> >> >> >> > >
> >> >> >> > > It might be interesting to generalize Flight SQL to include
> >> >> >> > > support for Substrait. I'm curious what your application is, if
> >> >> >> > > you're able to share more.
> >> >> >> > >
> >> >> >> > > -David
> >> >> >> > >
> >> >> >> > > On Thu, Mar 3, 2022, at 18:05, Gavin Ray wrote:
> >> >> >> > > > Hiya,
> >> >> >> > > >
> >> >> >> > > > I am drafting a proposal for a way to enable services to
> >> >> >> > > > express data compute operations to each other.
> >> >> >> > > >
> >> >> >> > > > However I think it'll be difficult to get buy-in if the only
> >> >> >> > > > representation for queries is as SQL strings.
> >> >> >> > > >
> >> >> >> > > > Is there any kind of lower-level API that can be used to
> >> >> >> > > > express operations?
> >> >> >> > > >
> >> >> >> > > > IE instead of "SELECT name FROM user"
> >> >> >> > > >
> >> >> >> > > > A structured representation like:
> >> >> >> > > > {
> >> >> >> > > >   "op": "query",
> >> >> >> > > >   "schema": "user",
> >> >> >> > > >   "project": ["name"]
> >> >> >> > > > }
> >> >> >> > > >
> >> >> >> > > > Or maybe this is a bad idea/doesn't make sense?
> >> >> >> > > >
> >> >> >> > > > Thank you =)
> >> >> >> > >
> >> >> >> >
> >> >> >>
> >> >> >>
> >> >> >> --
> >> >> >>
> >> >> >> *James Duong*
> >> >> >> Lead Software Developer
> >> >> >> Bit Quill Technologies Inc.
> >> >> >> Direct: +1.604.562.6082 | jam...@bitquilltech.com
> >> >> >> https://www.bitquilltech.com
> >> >> >>
> >> >> >> This email message is for the sole use of the intended
> >> >> >> recipient(s) and may contain confidential and privileged
> >> >> >> information.  Any unauthorized review, use, disclosure, or
> >> >> >> distribution is prohibited.  If you are not the intended recipient,
> >> >> >> please contact the sender by reply email and destroy all copies of
> >> >> >> the original message.  Thank you.
> >> >> >>
> >> >>
> >> >
> >>
> >>
>
