Gavin, thanks for sharing. I'm not sure you'll find an alternative to
Substrait that isn't either even more nascent or tightly tied to a particular
language, so it might be better to get involved in Substrait and see if it
suits your needs. Convincing a team to try something new can be hard, though,
and Substrait is somewhat of a moving target - but Flight SQL is in a similar
spot, I think, as it's still getting enhancements.

Your project does sound interesting - basically, it sounds like a tabular data 
storage service with query pushdown?
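
To make the "structured object instead of a SQL string" idea concrete, here is
a rough Python sketch. Everything in it is hypothetical: the Query class and
parse_command helper are made up for illustration, the field names simply
mirror the JSON example further down the thread, and none of this is Substrait
or part of the Flight SQL API. It only shows that a typed object can be
serialized into the opaque command text and decoded again on the server side.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass(frozen=True)
class Query:
    """Hypothetical structured query; fields mirror the JSON example in the thread."""

    schema: str
    project: List[str] = field(default_factory=list)
    op: str = "query"

    def to_json(self) -> str:
        # Serialize to the text a client could send as an opaque command.
        return json.dumps(asdict(self), sort_keys=True)

    def to_sql(self) -> str:
        # Render the equivalent SQL string (no quoting or escaping; illustration only).
        columns = ", ".join(self.project) if self.project else "*"
        return f"SELECT {columns} FROM {self.schema}"


def parse_command(text: str) -> Query:
    """Hypothetical server-side step: decode the command text back into a Query."""
    raw = json.loads(text)
    if raw.get("op") != "query":
        raise ValueError(f"unsupported op: {raw.get('op')!r}")
    return Query(schema=raw["schema"], project=list(raw.get("project", [])))
```

A client would build Query(schema="user", project=["name"]) and send
to_json() where it would otherwise send "SELECT name FROM user"; a real
design would presumably swap the ad hoc JSON for a Substrait plan.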

On Thu, Mar 3, 2022, at 19:58, Jacques Nadeau wrote:
> James, I agree that you could use JSON but that feels a bit hacky (mis-use
> of the paradigm). Instead, I'd really like to do something like David is
> suggesting: support Substrait as an alternative to a SQL string.
> Something like this:
> https://github.com/jacques-n/arrow/commit/e22674fa882e77c2889cf95f69f6e3701db362bc
>
> It would be great if someone wanted to pick this up. It would be a nice
> enhancement to FlightSQL (and provide a structured way to express
> operations).
>
>
>
> On Thu, Mar 3, 2022 at 4:56 PM James Duong <jam...@bitquilltech.com.invalid>
> wrote:
>
>> In the same way that you could write an ODBC driver that takes in text
>> that's not SQL, you could write a Flight SQL server that takes in text
>> that's JSON.
>> Flight SQL doesn't parse the query, so you could create commands that are
>> just JSON text.
>>
>> Is that the only bit you need, Gavin?
>>
>> On Thu, Mar 3, 2022 at 4:26 PM Gavin Ray <ray.gavi...@gmail.com> wrote:
>>
>> > I am enthusiastic about Substrait and have followed its progress eagerly
>> > =D
>> >
>> > When I presented it as a tentative option, there were reservations because
>> > of the project/spec being young and the functionality still being
>> > fleshed out.
>> > I think if I were having this conversation in say, 8-16 months, it would
>> > have been an easy choice, no doubt.
>> >
>> > On a public mailing list (and I can share more details in private if
>> > you're curious), the gist of it is this:
>> >
>> > Some well-defined/backed-by-mature tech solution for expressing data
>> > compute operations between services would be a useful thing to have
>> > (Especially if it's language-agnostic)
>> >
>> > The goal is for an "implementing service" to have:
>> > - An introspectable schema (IE, "describe yourself to me")
>> > - A query/operation execution endpoint (IE: "perform this operation on
>> >   your data")
>> >
>> > With FlightSQL this is possible I believe, but it requires the operation
>> > to be expressed as a SQL string which isn't ideal.
>> >
>> > Working with some programmatic, structured object that has the same
>> > semantics ("Logical Plan", or whatnot) as a SQL query would have, would
>> > be a better experience
>> > (Jacques is on to something here!)
>> >
>> > This interface between services would be somewhat the equivalent of an
>> > "SDK", so it would be nice to have a strongly-typed library for
>> > expressing and building-up query/data-compute ops.
>> >
>> >
>> > On Thu, Mar 3, 2022 at 3:17 PM David Li <lidav...@apache.org> wrote:
>> >
>> > > You probably want Substrait: https://substrait.io/
>> > >
>> > > Which is being worked on by several people, including Arrow community
>> > > members.
>> > >
>> > > It might be interesting to generalize Flight SQL to include support for
>> > > Substrait. I'm curious what your application is, if you're able to share
>> > > more.
>> > >
>> > > -David
>> > >
>> > > On Thu, Mar 3, 2022, at 18:05, Gavin Ray wrote:
>> > > > Hiya,
>> > > >
>> > > > I am drafting a proposal for a way to enable services to express data
>> > > > compute operations to each other.
>> > > >
>> > > > However I think it'll be difficult to get buy-in if the only
>> > > > representation for queries is as SQL strings.
>> > > >
>> > > > Is there any kind of lower-level API that can be used to express
>> > > > operations?
>> > > >
>> > > > IE instead of "SELECT name FROM user"
>> > > >
>> > > > A structured representation like:
>> > > > {
>> > > >   "op": "query",
>> > > >   "schema": "user",
>> > > >   "project": ["name"]
>> > > > }
>> > > >
>> > > > Or maybe this is a bad idea/doesn't make sense?
>> > > >
>> > > > Thank you =)
>> > >
>> >
>>
>>
>> --
>>
>> *James Duong*
>> Lead Software Developer
>> Bit Quill Technologies Inc.
>> Direct: +1.604.562.6082 | jam...@bitquilltech.com
>> https://www.bitquilltech.com
