That makes sense to me. I agree that it's probably not very useful to try to share anything in the parser between calcite-server and calcite-babel since calcite-babel will always be a moving target. However, given that calcite-babel is intended to be particularly permissive, it would be great to have a way to run calcite-server DDL tests against calcite-babel.
--
Michael Mior
[email protected]

On Wed, May 2, 2018 at 14:34, Shuyi Chen <[email protected]> wrote:

> Yes, that's what I have in mind as well. The server module is effectively Calcite's own DDL; people who use Calcite directly can just use the server module for their DDL needs. Other SQL dialects have their own DDL, and for them to leverage Calcite's relational algebra and query planning, the Babel parser needs to be able to parse both the DML and the DDL of each dialect. Is that clear?
>
> On Wed, May 2, 2018 at 11:23 AM, Julian Hyde <[email protected]> wrote:
>
> > The principles are as follows:
> > * Server should expose, as DDL, the concepts in Calcite’s framework, no more, no less. This includes the ability to define a type if supported by Calcite’s type system (RelDataTypeFactory), and the ability to define materialized views and lattices.
> > * Babel should expose anything in a supported SQL dialect (or rather, anything that someone has found time to support).
> >
> > Server’s specification is relatively fixed, whereas Babel’s specification is growing and changing all the time.
> >
> > Julian
> >
> > On May 2, 2018, at 10:06 AM, Michael Mior <[email protected]> wrote:
> >
> > > Seems logical to me, although I wonder if there's any way we could easily make the DDL part of the parser modular. At least before going too far down the road of implementing DDL in Babel, it would be good to set a clear scope of what will exist in calcite-babel vs. calcite-server.
> > >
> > > --
> > > Michael Mior
> > > [email protected]
> > >
> > > 2018-05-02 12:57 GMT-04:00 Julian Hyde <[email protected]>:
> > >
> > >> By the way, we should also figure out how this fits with the project to create a lenient parser that can handle any dialect of SQL. I am calling that parser “Babel”[1]. That parser will be able to handle BigQuery dialect, among others.
> > >>
> > >> Here’s my current thinking.
> > >>
> > >> I think that Babel should be a new module (a sibling to calcite-server, calcite-druid etc.) and its parser will extend the core parser. That means that calcite-babel will not inherit from the DDL parser in the calcite-server module, nor vice versa. We will probably end up with two parsers that are capable of handling DDL, and two sets of AST classes. But I think that is OK, or at least, better than the chaos of trying to reuse too much. At least, the parsers will share 99% of their DNA with the core parser. And we can easily share tests.
> > >>
> > >> Julian
> > >>
> > >> [1] https://issues.apache.org/jira/browse/CALCITE-2280
> > >>
> > >> On May 1, 2018, at 11:16 PM, Shuyi Chen <[email protected]> wrote:
> > >>
> > >>> Hi Anton, thanks a lot for the great questions.
> > >>>
> > >>> Yes, SqlDataTypeSpec currently only supports creating simple SQL types; row, array, and map types are not supported.
> > >>>
> > >>> CALCITE-2045 adds support for defining custom types, either simple or row types, through the type DDL, and you should be able to use the UDT in your table DDL for complex row types. I think this should be close to what you want.
> > >>>
> > >>> You can extend the type DDL in its current form in the Beam parser and add support for map and array types, or modify the grammar to suit your needs and make it BigQuery compatible. All the changes required to support UDTs in calcite-core should already be done by CALCITE-2045.
> > >>>
> > >>> As for the BigQuery syntax, I am not sure it's a good idea to adopt it in the core parser unless there is no SQL equivalent. But if you implement it in your extended Beam parser, it's up to you, and that's by design of Calcite DDL.
> > >>>
> > >>> Let me know if it helps.
> > >>>
> > >>> Thanks
> > >>> Shuyi
> > >>>
> > >>> On Tue, May 1, 2018 at 3:21 PM, Anton Kedin <[email protected]> wrote:
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>> We want to add support for non-primitive types (ROW, ARRAY, MAP) to Apache Beam SQL DDL (based on Calcite DDL extensions). What would be the best way to approach this?
> > >>>>
> > >>>> *Our Use Case:*
> > >>>> We want to be able to use DDL to define data sources and sinks for Beam pipelines, so that users don't have to wrap SQL into custom code which configures sources/sinks.
> > >>>>
> > >>>> *What we have already:*
> > >>>> We have a customized CREATE TABLE statement which allows users to specify the type of the data source, its schema, and data location. The implementation is based on Calcite DDL extensions.
> > >>>>
> > >>>> *What we're missing:*
> > >>>> We need to be able to define schemas with non-primitive types, e.g. arrays or rows, so that we can correctly describe data sources and sinks that support such types. For example, if we want to manipulate data in a stream of JSON objects, we want to be able to describe the JSON contents somehow, including arrays or nested objects. Or we would need similar types to interact with BigQuery, which supports arrays and nested struct types.
> > >>>>
> > >>>> *Problem:*
> > >>>> I tried to check if it is possible to extend the parser using the config.fmpp approach, so that we can hook into the Parser.TypeName() <https://github.com/apache/calcite/blob/a5d520df76602d25ed66627f08f5e0db4d048a77/core/src/main/codegen/templates/Parser.jj#L4439> method and parse the complex types ourselves. But Parser.DataType() <https://github.com/apache/calcite/blob/a5d520df76602d25ed66627f08f5e0db4d048a77/core/src/main/codegen/templates/Parser.jj#L4377> creates SqlDataTypeSpec only in two specific ways, without the ability to extend it, so even if we parse the typename ourselves, we would not be able to construct the SqlDataTypeSpec in a way that supports arrays/rows. But even if we could, looking at SqlDataTypeSpec <https://github.com/apache/calcite/blob/09be7e74a6a4d1b1c4f640c8e69b5ebdd467d811/core/src/main/java/org/apache/calcite/sql/SqlDataTypeSpec.java#L327> it seems that it does not support creating arrays or rows either: it calls typeFactory.createSqlType(typename) <https://github.com/apache/calcite/blob/09be7e74a6a4d1b1c4f640c8e69b5ebdd467d811/core/src/main/java/org/apache/calcite/sql/SqlDataTypeSpec.java#L350> which only <https://github.com/apache/calcite/blob/f47465236b7650f2280092b708fa39062fe79ffd/core/src/main/java/org/apache/calcite/sql/type/SqlTypeFactoryImpl.java#L49> creates basic types in this call.
> > >>>>
> > >>>> *Path forward:*
> > >>>> If the above is correct, then it appears that we would need to patch Calcite in a couple of places to support arrays, rows, and maps in DDL:
> > >>>> - update Parser.jj to support parsing the type definitions for the required types and constructing SqlDataTypeSpec correctly for those cases;
> > >>>> - update SqlDataTypeSpec.java to handle complex types and invoke the correct typeFactory interfaces;
> > >>>>
> > >>>> *Questions:*
> > >>>> - does the above sound sane/correct?
> > >>>> - is there similar work already tracked in Calcite somewhere? I saw something mentioned in CALCITE-2045 <https://issues.apache.org/jira/browse/CALCITE-2045?focusedCommentId=16351203&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16351203>, but didn't see any tracking Jiras specifically for this work yet;
> > >>>> - is there a known/recommended/working syntax for such DDL? If there is none, then would it make sense to adopt something similar to the BigQuery STRUCT/ARRAY definition <https://cloud.google.com/bigquery/docs/data-definition-language>?
> > >>>>
> > >>>> Thank you,
> > >>>> Anton
> > >>>
> > >>> --
> > >>> "So you have to trust that the dots will somehow connect in your future."
>
> --
> "So you have to trust that the dots will somehow connect in your future."
