I just took a look at https://issues.apache.org/jira/browse/CALCITE-2280, which introduced the Babel parser in Calcite to support varied dialects; this may be an easier way to support BigQuery syntax. @Rui, have you noticed any big differences between the Calcite engine and ZetaSQL, for example in parsing or optimization? If so, it makes sense to build the alternative switch on the Beam side.
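To make the "alternative switch" idea concrete, here is a minimal sketch, assuming the switch is just a registry keyed by dialect name. All names in it (DialectPlanner, PlannerSwitch, TaggingPlanner, withDefaults) are hypothetical and are not Beam's actual QueryPlanner API; the stand-in planners only tag the query string instead of producing a real plan.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical interface for illustration; not Beam's QueryPlanner.
interface DialectPlanner {
  String name();

  // A real planner would return a relational plan; a String stands in here.
  String plan(String sql);
}

// Hypothetical registry that selects a planner by dialect name.
final class PlannerSwitch {
  private final Map<String, DialectPlanner> planners = new LinkedHashMap<>();

  void register(DialectPlanner planner) {
    planners.put(planner.name(), planner);
  }

  DialectPlanner select(String dialect) {
    DialectPlanner planner = planners.get(dialect);
    if (planner == null) {
      throw new IllegalArgumentException(
          "Unknown dialect: " + dialect + ", known: " + planners.keySet());
    }
    return planner;
  }

  // Assumed defaults: one planner per engine discussed in this thread.
  static PlannerSwitch withDefaults() {
    PlannerSwitch s = new PlannerSwitch();
    s.register(new TaggingPlanner("calcite"));
    s.register(new TaggingPlanner("zetasql"));
    return s;
  }

  // Stand-in planner that only prefixes the query with its dialect name.
  private static final class TaggingPlanner implements DialectPlanner {
    private final String name;

    TaggingPlanner(String name) {
      this.name = name;
    }

    @Override
    public String name() {
      return name;
    }

    @Override
    public String plan(String sql) {
      return "[" + name + "] " + sql;
    }
  }
}
```

With this shape, PlannerSwitch.withDefaults().select("zetasql").plan("SELECT 1") would route the query to the ZetaSQL stand-in, and an unregistered dialect fails fast with an IllegalArgumentException.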
On Sun, Aug 4, 2019 at 4:47 PM Rui Wang <ruw...@google.com> wrote:

> Mingmin - it sounds like an awesome idea to translate from SparkSQL. It
> would be even more exciting if we could translate Spark Structured
> Streaming code in a similar way, which would enable existing Spark
> SQL/Structured Streaming pipelines to run on Beam.
>
> Reuven - Thanks for bringing it up. I tried to search dev@calcite and
> only found [1]. From that thread, I see that adding ZetaSQL to Calcite
> itself is still under discussion. I am also looking to see whether anyone
> knows of more progress on this work than the thread shows.
>
> [1]:
> http://mail-archives.apache.org/mod_mbox/calcite-dev/201905.mbox/%3CCAMj=j=-sPWgxzAgusnx8OYvYDYDcDY=dupe6poytrxhjri9...@mail.gmail.com%3E
>
> -Rui
>
> On Sun, Aug 4, 2019 at 3:54 PM Reuven Lax <re...@google.com> wrote:
>
>> I hear rumours that the Calcite project is planning on adding a zeta-SQL
>> compatible parser to Calcite itself, in which case there will be a Java
>> parser we can use as well. Does anyone know if this work is still going on?
>>
>> On Sat, Aug 3, 2019 at 8:41 PM Manu Zhang <owenzhang1...@gmail.com>
>> wrote:
>>
>>>> A question to the community, does the size of the change require any
>>>> process besides the usual PR reviews?
>>>>
>>>
>>> I think so. This is a big change and has come as kind of a surprise
>>> (sorry if I've missed previous discussions).
>>>
>>> Rui, could you explain more about how things will play out between
>>> BeamSQL and ZetaSQL? (A design doc including the pluggable interface
>>> would be perfect.) From GitHub, ZetaSQL is mainly in C++, so is what you
>>> are doing a port or a connector to ZetaSQL? Do we need to depend on
>>> https://github.com/google/zetasql ? ZetaSQL looks interesting but I
>>> could barely find any docs for end users.
>>>
>>> Also, I'd prefer the PR to be split into two, one for the pluggable
>>> interface and one for ZetaSQL.
>>>
>>> Thanks,
>>> Manu
>>>
>>> On Sat, Aug 3, 2019 at 10:06 AM Ahmet Altay <al...@google.com> wrote:
>>>
>>>> Thank you Rui for the heads up.
>>>>
>>>> A question to the community, does the size of the change require any
>>>> process besides the usual PR reviews?
>>>>
>>>> On Fri, Aug 2, 2019 at 10:23 AM Rui Wang <ruw...@google.com> wrote:
>>>>
>>>>> Hi community,
>>>>>
>>>>> I have been working on supporting ZetaSQL[1] as a SQL dialect in
>>>>> BeamSQL. ZetaSQL is a SQL analyzer open sourced by Google. Here is
>>>>> ZetaSQL's documentation[2].
>>>>>
>>>>> Briefly, the design for integrating ZetaSQL with BeamSQL is this: I
>>>>> made a pluggable query planner interface in BeamSQL, so we can easily
>>>>> plug in a new planner[3] (in my case, the ZetaSQL planner). Anyone can
>>>>> add new planners this way (e.g. for a PostgreSQL dialect).
>>>>>
>>>>> I want to contribute the ZetaSQL planner and its related code (~10k)
>>>>> to the Beam repo (#9210 <https://github.com/apache/beam/pull/9210>).
>>>>> This contribution barely touches existing Beam code (because the idea
>>>>> is a pluggable planner).
>>>>>
>>>>> *Acknowledgement*
>>>>> Thanks to all the people who provided help during Beam ZetaSQL
>>>>> development: Matthew Brown, Brian Hulette, Andrew Pilloud, Kenneth
>>>>> Knowles, Anton Kedin and Mikhail Gryzykhin. This list is not
>>>>> exhaustive; thanks also for the contributions which are not listed.
>>>>>
>>>>> [1]: https://github.com/google/zetasql
>>>>> [2]: https://github.com/google/zetasql/tree/master/docs
>>>>> [3]:
>>>>> https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/QueryPlanner.java
>>>>>
>>>>> -Rui
>>>>>
>>>>
--
----
Mingmin