I am enthusiastic about Substrait and have followed its progress eagerly =D
When I presented it as a tentative option, there were reservations because the project/spec is young and the functionality is still being fleshed out. I think if I were having this conversation in, say, 8-16 months, it would have been an easy choice, no doubt.

Since this is a public mailing list, I'll stick to the gist (I can share more details in private if you're curious):

Some well-defined solution, backed by mature tech, for expressing data compute operations between services would be a useful thing to have (especially if it's language-agnostic).

The goal is for an "implementing service" to have:
- An introspectable schema (i.e., "describe yourself to me")
- A query/operation execution endpoint (i.e., "perform this operation on your data")

With FlightSQL this is possible, I believe, but it requires the operation to be expressed as a SQL string, which isn't ideal. Working with some programmatic, structured object that has the same semantics as a SQL query would have ("Logical Plan", or whatnot) would be a better experience (Jacques is on to something here!).

This interface between services would be somewhat the equivalent of an "SDK", so it would be nice to have a strongly-typed library for expressing and building up query/data-compute ops.

On Thu, Mar 3, 2022 at 3:17 PM David Li <lidav...@apache.org> wrote:

> You probably want Substrait: https://substrait.io/
>
> Which is being worked on by several people, including Arrow community
> members.
>
> It might be interesting to generalize Flight SQL to include support for
> Substrait. I'm curious what your application, if you're able to share more.
>
> -David
>
> On Thu, Mar 3, 2022, at 18:05, Gavin Ray wrote:
> > Hiya,
> >
> > I am drafting a proposal for a way to enable services to express data
> > compute operations to each other.
> >
> > However I think it'll be difficult to get buy-in if the only representation
> > for queries is as SQL strings.
> >
> > Is there any kind of lower-level API that can be used to express operations?
> >
> > IE instead of "SELECT name FROM user"
> >
> > A structured representation like:
> > {
> >   "op": "query",
> >   "schema": "user",
> >   "project": ["name"]
> > }
> >
> > Or maybe this is a bad idea/doesn't make sense?
> >
> > Thank you =)
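
To make the "structured object instead of a SQL string" idea a bit more concrete, here is a rough sketch in Python of what a typed wrapper around the JSON shape quoted above might look like. Everything in it (the `Query` class, its fields, `to_json`, `to_sql`) is hypothetical and only for illustration; it is not Substrait's or Flight SQL's API, and it covers nothing beyond a bare projection.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Query:
    """Hypothetical structured query op; fields mirror the JSON example above."""

    schema: str                                        # source table, e.g. "user"
    project: List[str] = field(default_factory=list)   # columns to select
    op: str = "query"                                  # operation kind

    def to_json(self) -> str:
        # Serialize to the same shape as the JSON example in the thread.
        return json.dumps(
            {"op": self.op, "schema": self.schema, "project": list(self.project)}
        )

    def to_sql(self) -> str:
        # Naive rendering for illustration only: no quoting or escaping.
        cols = ", ".join(self.project) if self.project else "*"
        return f"SELECT {cols} FROM {self.schema}"


q = Query(schema="user", project=["name"])
print(q.to_json())
print(q.to_sql())
```

The point is only that the sending service builds and type-checks an object, and the receiving service decides how to execute it; a real plan format would need filters, joins, types, and so on.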