Thanks for highlighting the types/operators/functions/... aspects; they do
make things more complicated. +1: as a short/mid-term solution, the
proposal is reasonable. We could follow up in the future to handle it in
Calcite Babel if possible.

Mingmin

On Tue, Aug 6, 2019 at 3:57 PM Rui Wang <ruw...@google.com> wrote:

> Hi Mingmin,
>
> Honestly, I don't have an answer to it: a SQL dialect is complicated and I
> don't understand Calcite well enough (Calcite has a big repo). Based
> on my reading of CALCITE-2280
> <https://issues.apache.org/jira/browse/CALCITE-2280>, the closer a dialect
> is to standard SQL, the fewer blockers we will have in supporting it in
> the Calcite Babel parser.
>
> However, this is a good question, and it raises an aspect that I have found
> people usually ignore: supporting a SQL dialect means more than supporting
> its syntax. It also includes data types, built-in SQL functions, operators,
> and many other things.
>
> In particular, I found the following incompatibilities between Calcite and
> ZetaSQL during development:
> 1. Calcite does not support the Struct/Row type well, because Calcite flattens
> Rows when reading from tables by adding an extra Projection on top of
> tables.
> 2. I had trouble supporting the DATETIME (or timestamp without time zone)
> type.
> 3. Huge incompatibilities in SQL functions. E.g. the return type is different
> for AVG(long), and there are many more.
> 4. I am not sure whether Calcite has the same set of type casting rules as
> BigQuery (my impression is that there are differences).
>
>
> I would say that in the short/mid term, it's much easier to use the logical
> plan as an IR to implement another SQL dialect for BeamSQL (LinkedIn has a
> similar practice, see their blog post
> <https://engineering.linkedin.com/blog/2019/01/bridging-offline-and-nearline-computations-with-apache-calcite>
> ).
>
> For the longer term, it would be interesting to see how we can add
> BigQuery syntax (plus its data types and SQL functions) to the Calcite Babel
> parser.
>
>
>
> -Rui
>
>
> On Tue, Aug 6, 2019 at 2:49 PM Mingmin Xu <mingm...@gmail.com> wrote:
>
>> I just took a look at https://issues.apache.org/jira/browse/CALCITE-2280,
>> which introduced the Babel parser in Calcite to support varied dialects; this
>> may be an easier way to support BigQuery syntax. @Rui, did you notice any big
>> differences between the Calcite engine and ZetaSQL, e.g. in parsing or
>> optimization? If that's the case, it makes sense to build the alternative
>> switch on the Beam side.
>>
>> On Sun, Aug 4, 2019 at 4:47 PM Rui Wang <ruw...@google.com> wrote:
>>
>>> Mingmin - it sounds like an awesome idea to translate from SparkSQL.
>>> It would be even more exciting if we could translate Spark
>>> Structured Streaming code in a similar way, which would enable existing
>>> Spark SQL/Structured Streaming pipelines to run on Beam.
>>>
>>> Reuven - Thanks for bringing it up. I searched dev@calcite and
>>> only found [1]. From that thread, I see that adding ZetaSQL to Calcite
>>> itself is still under discussion. I would also like to hear from anyone who
>>> knows of more progress on this work than the thread shows.
>>>
>>>
>>> [1]:
>>> http://mail-archives.apache.org/mod_mbox/calcite-dev/201905.mbox/%3CCAMj=j=-sPWgxzAgusnx8OYvYDYDcDY=dupe6poytrxhjri9...@mail.gmail.com%3E
>>>
>>> -Rui
>>>
>>> On Sun, Aug 4, 2019 at 3:54 PM Reuven Lax <re...@google.com> wrote:
>>>
>>>> I hear rumours that the Calcite project is planning on adding a
>>>> ZetaSQL-compatible parser to Calcite itself, in which case there will be a
>>>> Java parser we can use as well. Does anyone know if this work is still
>>>> going on?
>>>>
>>>> On Sat, Aug 3, 2019 at 8:41 PM Manu Zhang <owenzhang1...@gmail.com>
>>>> wrote:
>>>>
>>>>>> A question to the community, does the size of the change require any
>>>>>> process besides the usual PR reviews?
>>>>>>
>>>>>
>>>>> I think so. This is a big change and has come as kind of a surprise
>>>>> (sorry if I've missed previous discussions).
>>>>>
>>>>> Rui, could you explain more about how things will play out between
>>>>> BeamSQL and ZetaSQL? (A design doc including the pluggable interface would
>>>>> be perfect.) From GitHub, ZetaSQL is mainly in C++, so is what you are
>>>>> doing a port of or a connector to ZetaSQL? Do we need to depend on
>>>>> https://github.com/google/zetasql ? ZetaSQL looks interesting but I
>>>>> could barely find any docs for end users.
>>>>>
>>>>> Also, I'd prefer the PR to be split into two, one for the pluggable
>>>>> interface and one for ZetaSQL.
>>>>>
>>>>> Thanks,
>>>>> Manu
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Aug 3, 2019 at 10:06 AM Ahmet Altay <al...@google.com> wrote:
>>>>>
>>>>>> Thank you Rui for the heads up.
>>>>>>
>>>>>> A question to the community, does the size of the change require any
>>>>>> process besides the usual PR reviews?
>>>>>>
>>>>>> On Fri, Aug 2, 2019 at 10:23 AM Rui Wang <ruw...@google.com> wrote:
>>>>>>
>>>>>>> Hi community,
>>>>>>>
>>>>>>> I have been working on supporting ZetaSQL[1] as a SQL dialect in
>>>>>>> BeamSQL. ZetaSQL is a SQL analyzer open sourced by Google. Here is
>>>>>>> ZetaSQL's documentation[2].
>>>>>>>
>>>>>>> Briefly, the design for integrating ZetaSQL with BeamSQL is this: I made
>>>>>>> a pluggable query planner interface in BeamSQL, so we can easily plug in
>>>>>>> a new planner[3] (in my case, the ZetaSQL planner). In fact, anyone can
>>>>>>> add new planners this way (e.g. for a PostgreSQL dialect).
>>>>>>>
>>>>>>> I want to contribute the ZetaSQL planner and its related code (~10k) to
>>>>>>> the Beam repo (#9210 <https://github.com/apache/beam/pull/9210>). This
>>>>>>> contribution barely touches existing Beam code (because the idea is a
>>>>>>> pluggable planner).
>>>>>>>
>>>>>>>
>>>>>>> *Acknowledgement*
>>>>>>> Thanks to all the people who provided help during Beam ZetaSQL
>>>>>>> development: Matthew Brown, Brian Hulette, Andrew Pilloud, Kenneth
>>>>>>> Knowles, Anton Kedin and Mikhail Gryzykhin. This list is not exhaustive;
>>>>>>> thanks also to the contributors who are not listed.
>>>>>>>
>>>>>>>
>>>>>>> [1]: https://github.com/google/zetasql
>>>>>>> [2]: https://github.com/google/zetasql/tree/master/docs
>>>>>>> [3]:
>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/QueryPlanner.java
>>>>>>>
>>>>>>>
>>>>>>> -Rui
>>>>>>>
>>>>>>
>>
>> --
>> ----
>> Mingmin
>>
>

-- 
----
Mingmin
