Got it, thank you David!
I started prototyping the implementation last night. Hopefully I will make
some good progress and have something basic working soon.

RE: The metadata thing -- I think both Calcite and Teiid have solid
interfaces for defining what capabilities a datasource has.
https://github.com/teiid/teiid/blob/8e9057a46be009d68b2d67701781f1f8c175baa7/api/src/main/java/org/teiid/translator/ExecutionFactory.java#L349-L1528

It's probably not possible to make something universal, but it seems like
you could get pretty close by covering the most common functionality/capabilities.
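
To make that a bit more concrete, here is a rough sketch of how custom
capabilities could sit alongside the standard GetSqlInfo entries. Everything
in it is hypothetical: the capability names, the ID values, and the helper
function are made up for illustration and are not part of Flight SQL,
Calcite, or Teiid. It also assumes custom info IDs start at 10,000, which
should be checked against the Flight SQL spec.

```python
from enum import IntEnum

# Assumption: IDs below this value are reserved for Flight SQL's own
# SqlInfo entries, so application-defined IDs start here.
CUSTOM_INFO_START = 10_000


class CustomCapability(IntEnum):
    """Hypothetical, application-defined capability IDs (not part of Flight SQL)."""

    GROUP_BY_MULTIPLE_DISTINCT_AGGREGATES = CUSTOM_INFO_START + 1
    SELF_JOIN_ON_ALIASED_TABLES = CUSTOM_INFO_START + 2


def missing_capabilities(advertised, required):
    """Return the required capabilities that a backend did not advertise as True."""
    return sorted(cap for cap in required if not advertised.get(cap, False))


# What one backend might report, keyed by info ID (values are made up).
backend_info = {
    CustomCapability.GROUP_BY_MULTIPLE_DISTINCT_AGGREGATES: True,
    CustomCapability.SELF_JOIN_ON_ALIASED_TABLES: False,
}

# What a client or the central service needs before pushing a query down.
needed = {
    CustomCapability.GROUP_BY_MULTIPLE_DISTINCT_AGGREGATES,
    CustomCapability.SELF_JOIN_ON_ALIASED_TABLES,
}

print(missing_capabilities(backend_info, needed))
```

The idea is just that a client checks the advertised set up front and
adapts (or rejects the request) instead of finding out at execution time.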


On Sat, Mar 5, 2022 at 11:48 PM Kyle Porter <ky...@bitquilltech.com.invalid>
wrote:

> Yes, we should, where possible, avoid any one-off metadata. This is where
> other standards fail in that applications must be custom built for each
> data source, if we standardize the metadata then applications can at least
> be built to adapt.
>
> On Sat., Mar. 5, 2022, 6:54 p.m. David Li, <lidav...@apache.org> wrote:
>
> > Yes, GetSqlInfo reserves a range of metadata IDs for Flight SQL's use, so
> > the application can use others for its own purposes. That said if they seem
> > commonly applicable maybe we should try to standardize them.
> >
> > I think what you are doing should be reasonable. You may not need _all_ of
> > the capabilities in Flight SQL for this (e.g. all the various metadata
> > calls, or prepared statements, perhaps) but I don't see why it wouldn't
> > work for you.
> >
> > On Fri, Mar 4, 2022, at 19:03, Gavin Ray wrote:
> > > To touch on the question about supported features -- is it possible to
> > > advertise arbitrary/custom "capabilities" in GetSqlInfo?
> > > Say that you want to represent some set of behaviors that FlightSQL
> > > services can support.
> > >
> > > Stuff like "Supports grouping by multiple distinct aggregates", "Supports
> > > self-joins on aliased tables" etc
> > > This is going to be unique to each implementation, but I couldn't determine
> > > whether there was a way to express arbitrary capabilities
> > >
> > > Also, in case it's helpful I put together an ASCII diagram of what I'm
> > > trying to do with FlightSQL
> > > If anyone has a moment, would appreciate input on whether it's feasible/a
> > > good idea
> > >
> > > https://pastebin.com/raw/VF2r0F3f
> > >
> > > Thank you =)
> > >
> > >
> > > On Fri, Mar 4, 2022 at 2:37 PM David Li <lidav...@apache.org> wrote:
> > >
> > >> We could also add say CommandSubstraitQuery as a distinct message, and
> > >> older servers would just reject it as an unknown request type.
> > >>
> > >> -David
> > >>
> > >> On Fri, Mar 4, 2022, at 17:01, Micah Kornfield wrote:
> > >> >>
> > >> >> 1. How does a server report that it supports each command type?
> > Initial
> > >> >> thought is a property in GetSqlInfo.
> > >> >
> > >> >
> > >> > This sounds reasonable.
> > >> >
> > >> >
> > >> >> What happens to client code written prior to changing the command
> > type
> > >> >> to be a oneOf field? Same for servers.
> > >> >
> > >> >
> > >> > It is transparent to older clients (I'm 99% sure the wire protocol
> > >> > doesn't change).  Servers are a little harder.  The one saving grace is I
> > >> > don't think an empty/not-present SQL string would be something most servers
> > >> > could handle, so they would probably error with something that while
> > >> > not-obvious would give a clue to the clients (but hopefully this would be a
> > >> > non-issue because the capabilities would be checked for clients wishing to
> > >> > use this feature first).
> > >> >
> > >> > -Micah
> > >> >
> > >> > On Fri, Mar 4, 2022 at 1:50 PM James Duong <jam...@bitquilltech.com
> > >> .invalid>
> > >> > wrote:
> > >> >
> > >> >> It sounds like an interesting and useful project to use Substrait as an
> > >> >> alternative to SQL strings.
> > >> >>
> > >> >> Important aspects to spec out are:
> > >> >> 1. How does a server report that it supports each command type? Initial
> > >> >> thought is a property in GetSqlInfo.
> > >> >> 2. What happens to client code written prior to changing the command type
> > >> >> to be a oneOf field? Same for servers.
> > >> >> More generally, how should backward compatibility work, and what should
> > >> >> happen if a client sends an unsupported command type to a server?
> > >> >> 3. Should inputs to catalog RPC calls also accept Substrait structures?
> > >> >>
> > >> >> On Thu, Mar 3, 2022 at 11:00 PM Gavin Ray <ray.gavi...@gmail.com>
> > >> wrote:
> > >> >>
> > >> >> > @James Duong <jam...@bitquilltech.com>
> > >> >> >
> > >> >> > You are absolutely right, I realized this and confirmed whether
> > this
> > >> >> > would be possible with Jacques to double-check.
> > >> >> > It would amount to what I might call "dollar-store Substrait."
> It's
> > >> not
> > >> >> > elegant or a good solution, but definitely presents a good
> > duct-tape
> > >> hack
> > >> >> > and is a crafty idea.
> > >> >> >
> > >> >> > I agree with Jacques -- when you think about FlightSQL, what you
> > are
> > >> >> > attempting with a query isn't necessarily SQL, but a general
> > >> data-compute
> > >> >> > operation.
> > >> >> > SQL just so happens to be a fairly universal way to express them,
> > >> with an
> > >> >> > ANSI standard, but FlightSQL doesn't recognize any particular
> > subset
> > >> of
> > >> >> it
> > >> >> > and for all intents and purposes it doesn't matter what the
> > operation
> > >> >> > string contains.
> > >> >> >
> > >> >> > Substrait would make a fantastic logical next-feature because
> it's
> > >> >> > targeted as a specification for expressing relational algebra and
> > >> >> > data-compute operations
> > >> >> > This more-or-less equates to SQL strings (in my mind at least)
> > with a
> > >> >> much
> > >> >> > better toolkit and Dev UX. If there is anything I can do to help
> > move
> > >> >> this
> > >> >> > forward, please let me know because I am extremely motivated to
> do
> > so.
> > >> >> >
> > >> >> > @David Li <git...@lidavidm.me>
> > >> >> >
> > >> >> > Also agreed. Substrait is put together by folks much smarter than
> > >> myself,
> > >> >> > and if I had to hedge my bets, I'd put money on it being the
> > future of
> > >> >> > data-compute interop.
> > >> >> > I would love nothing more than to adopt this technology and push
> it
> > >> >> along.
> > >> >> >
> > >> >> > Your project does sound interesting - basically, it sounds like a
> > >> tabular
> > >> >> >> data storage service with query pushdown?
> > >> >> >>
> > >> >> >
> > >> >> > Yeah this is more or less the details of it (my personal email,
> > with
> > >> >> > discretion assumed, is always open)
> > >> >> >
> > >> >> > Imagine an environment where a backend wants to advertise some
> > kind of
> > >> >> > schema/data catalog
> > >> >> >
> > >> >> > And then a central service introspects these backends, and
> > dynamically
> > >> >> > generates an API from the data catalogues/schemas, where requests
> > get
> > >> >> > proxied to the underlying backend service for each schema to
> > actually
> > >> be
> > >> >> > executed
> > >> >> >
> > >> >> > In text, the flow would look something like:
> > >> >> >
> > >> >> >
> > >> >> >                                                    <----> Data Provider Backend 0
> > >> >> > Client <-----> Central Service <---> Generated API <----> Data-Provider Backend 1
> > >> >> >                                                    <----> Data Provider Backend 2
> > >> >> >
> > >> >> >
> > >> >> >
> > >> >> > On Thu, Mar 3, 2022 at 5:52 PM David Li <lidav...@apache.org>
> > wrote:
> > >> >> >
> > >> >> >> Gavin, thanks for sharing. I'm not so sure you'll find an
> > >> alternative to
> > >> >> >> Substrait, at least one that isn't even more nascent or one
> that's
> > >> very
> > >> >> >> tied to a particular language, so perhaps it might be better to
> > get
> > >> >> >> involved in Substrait and see if it suits your needs?
> Convincing a
> > >> team
> > >> >> to
> > >> >> >> try something new can be hard, though, and it is somewhat of a
> > moving
> > >> >> >> target - but Flight SQL is in a similar spot, I think, as it's
> > still
> > >> >> >> getting enhancements.
> > >> >> >>
> > >> >> >> Your project does sound interesting - basically, it sounds like
> a
> > >> >> tabular
> > >> >> >> data storage service with query pushdown?
> > >> >> >>
> > >> >> >> On Thu, Mar 3, 2022, at 19:58, Jacques Nadeau wrote:
> > >> >> >> > James, I agree that you could use JSON but that feels a bit
> > hacky
> > >> >> >> > (mis-use
> > >> >> >> > of the paradigm). Instead, I'd really like to do something
> like
> > >> David
> > >> >> is
> > >> >> >> > suggesting: support Substrait as an alternative to a SQL
> string.
> > >> >> >> > Something like this:
> > >> >> >> >
> > >> >> >> > https://github.com/jacques-n/arrow/commit/e22674fa882e77c2889cf95f69f6e3701db362bc
> > >> >> >> >
> > >> >> >> > It would be great if someone wanted to pick this up. It would
> > be a
> > >> >> nice
> > >> >> >> > enhancement to FlightSQL (and provide a structured way to
> > express
> > >> >> >> > operations).
> > >> >> >> >
> > >> >> >> >
> > >> >> >> >
> > >> >> >> > On Thu, Mar 3, 2022 at 4:56 PM James Duong <
> > >> jam...@bitquilltech.com
> > >> >> >> .invalid>
> > >> >> >> > wrote:
> > >> >> >> >
> > >> >> >> >> In the same way that you could write an ODBC driver that
> takes
> > in
> > >> >> text
> > >> >> >> >> that's not SQL, you could write a Flight SQL server that
> takes
> > in
> > >> >> text
> > >> >> >> >> that's JSON.
> > >> >> >> >> Flight SQL doesn't parse the query, so you could create
> > commands
> > >> that
> > >> >> >> are
> > >> >> >> >> just JSON text.
> > >> >> >> >>
> > >> >> >> >> Is that the only bit you need, Gavin?
> > >> >> >> >>
> > >> >> >> >> On Thu, Mar 3, 2022 at 4:26 PM Gavin Ray <
> > ray.gavi...@gmail.com>
> > >> >> >> wrote:
> > >> >> >> >>
> > >> >> >> >> > I am enthusiastic about Substrait and have followed its progress eagerly
> > >> >> >> >> > =D
> > >> >> >> >> >
> > >> >> >> >> > When I presented it as a tentative option, there were
> > >> reservations
> > >> >> >> >> because
> > >> >> >> >> > of the project/spec being young and the functionality still
> > >> being
> > >> >> >> >> > fleshed out.
> > >> >> >> >> > I think if I were having this conversation in say, 8-16
> > months,
> > >> it
> > >> >> >> would
> > >> >> >> >> > have been an easy choice, no doubt.
> > >> >> >> >> >
> > >> >> >> >> > On a public mailing list (and I can share more details in
> > >> private
> > >> >> if
> > >> >> >> >> you're
> > >> >> >> >> > curious), the gist of it is this:
> > >> >> >> >> >
> > >> >> >> >> > Some well-defined/backed-by-mature tech solution for
> > expressing
> > >> >> data
> > >> >> >> >> > compute operations between services would be a useful thing
> > to
> > >> have
> > >> >> >> >> > (Especially if it's language-agnostic)
> > >> >> >> >> >
> > >> >> >> >> > The goal is for an "implementing service" to have:
> > >> >> >> >> > - An introspectable schema (IE, "describe yourself to me")
> > >> >> >> >> > - A query/operation execution endpoint (IE: "perform this
> > >> operation
> > >> >> >> on
> > >> >> >> >> your
> > >> >> >> >> > data")
> > >> >> >> >> >
> > >> >> >> >> > With FlightSQL this is possible I believe, but it requires
> > the
> > >> >> >> operation
> > >> >> >> >> to
> > >> >> >> >> > be expressed as a SQL string which isn't ideal.
> > >> >> >> >> >
> > >> >> >> >> > Working with some programmatic, structured object that has
> > the
> > >> same
> > >> >> >> >> > semantics ("Logical Plan", or whatnot) as a SQL query would
> > >> have,
> > >> >> >> would
> > >> >> >> >> be
> > >> >> >> >> > a better experience
> > >> >> >> >> > (Jacques is on to something here!)
> > >> >> >> >> >
> > >> >> >> >> > This interface between services would be somewhat the
> > >> equivalent of
> > >> >> >> an
> > >> >> >> >> > "SDK", so it would be nice to have a strongly-typed library
> > for
> > >> >> >> >> expressing
> > >> >> >> >> > and building-up query/data-compute ops.
> > >> >> >> >> >
> > >> >> >> >> >
> > >> >> >> >> > On Thu, Mar 3, 2022 at 3:17 PM David Li <
> lidav...@apache.org
> > >
> > >> >> wrote:
> > >> >> >> >> >
> > >> >> >> >> > > You probably want Substrait: https://substrait.io/
> > >> >> >> >> > >
> > >> >> >> >> > > Which is being worked on by several people, including
> Arrow
> > >> >> >> community
> > >> >> >> >> > > members.
> > >> >> >> >> > >
> > >> >> >> >> > > It might be interesting to generalize Flight SQL to include support for
> > >> >> >> >> > > Substrait. I'm curious what your application is, if you're able to share
> > >> >> >> >> > > more.
> > >> >> >> >> > >
> > >> >> >> >> > > -David
> > >> >> >> >> > >
> > >> >> >> >> > > On Thu, Mar 3, 2022, at 18:05, Gavin Ray wrote:
> > >> >> >> >> > > > Hiya,
> > >> >> >> >> > > >
> > >> >> >> >> > > > I am drafting a proposal for a way to enable services
> to
> > >> >> express
> > >> >> >> data
> > >> >> >> >> > > > compute operations to each other.
> > >> >> >> >> > > >
> > >> >> >> >> > > > However I think it'll be difficult to get buy-in if the
> > only
> > >> >> >> >> > > representation
> > >> >> >> >> > > > for queries is as SQL strings.
> > >> >> >> >> > > >
> > >> >> >> >> > > > Is there any kind of lower-level API that can be used
> to
> > >> >> express
> > >> >> >> >> > > operations?
> > >> >> >> >> > > >
> > >> >> >> >> > > > IE instead of "SELECT name FROM user"
> > >> >> >> >> > > >
> > >> >> >> >> > > > A structured representation like:
> > >> >> >> >> > > > {
> > >> >> >> >> > > >   "op": "query",
> > >> >> >> >> > > >   "schema": "user",
> > >> >> >> >> > > >   "project": ["name"]
> > >> >> >> >> > > > }
> > >> >> >> >> > > >
> > >> >> >> >> > > > Or maybe this is a bad idea/doesn't make sense?
> > >> >> >> >> > > >
> > >> >> >> >> > > > Thank you =)
> > >> >> >> >> > >
> > >> >> >> >> >
> > >> >> >> >>
> > >> >> >> >>
> > >> >> >> >> --
> > >> >> >> >>
> > >> >> >> >> *James Duong*
> > >> >> >> >> Lead Software Developer
> > >> >> >> >> Bit Quill Technologies Inc.
> > >> >> >> >> Direct: +1.604.562.6082 | jam...@bitquilltech.com
> > >> >> >> >> https://www.bitquilltech.com
> > >> >> >> >>
> > >> >> >> >> This email message is for the sole use of the intended
> > >> recipient(s)
> > >> >> >> and may
> > >> >> >> >> contain confidential and privileged information.  Any
> > unauthorized
> > >> >> >> review,
> > >> >> >> >> use, disclosure, or distribution is prohibited.  If you are
> not
> > >> the
> > >> >> >> >> intended recipient, please contact the sender by reply email
> > and
> > >> >> >> destroy
> > >> >> >> >> all copies of the original message.  Thank you.
> > >> >> >> >>
> > >> >> >>
> > >> >> >
> > >> >>
> > >> >>
> > >>
> >
>
