FYI, I talked to Julian earlier this week and he will be adding Beam to the powered by page since he has a doc for generating the image with the logos.
-- Michael Mior mm...@apache.org 2018-04-12 4:06 GMT-04:00 Shuyi Chen <suez1...@gmail.com>: > @Andrew, great to see BEAM is also using Calcite streaming SQL. Maybe you > can help adding an entry in the Calcite powered_by page > <https://calcite.apache.org/docs/powered_by.html> for BEAM by editing > site/_docs/powered_by.md > <https://github.com/apache/calcite/pull/657#diff- > 1aa810bc92051b46555ee4caf24bd6c3>. > Also, > can you explain a bit more on how you plan to use streaming SQL to > transform > arbitrary JSON objects with and w/o the AS STRUCT syntax? > For adding DDL in BEAM, please take a look at the server module and the > TYPE DDL (https://issues.apache.org/jira/browse/CALCITE-2045) I am adding. > Let me know if you have any comments and need any help. > > @Rong, I think it's better to conform with the ROW SQL standard, and add > new grammar to handle named struct construction. > This should work with the TYPE DDL, and we should be able to CAST the > created STRUCT as a custom type defined using DDL. > > @Julian, thanks for the suggestions. I think the AS STRUCT addition should > do it. > > Shuyi > > On Mon, Apr 9, 2018 at 4:36 AM, Julian Hyde <jhyde.apa...@gmail.com> > wrote: > > > For what it’s worth, ROW is standard SQL. If it does what you need, we > > should use it. > > > > Reading your case quickly, I perceived that you needed a concise way to > > assign field names, and AS STRUCT seemed to do that. > > > > But staying within the standard is always preferred. BigQuery isn’t > always > > good at that. > > > > Julian > > > > > On Apr 5, 2018, at 10:18, Rong Rong <walter...@gmail.com> wrote: > > > > > > Thanks for the fantastic proposal @shuyi. I think the STRUCT idea is > > great > > > considering ROW is not standard SQL either. As a user of calcite I > have a > > > couple questions. > > > > > > Since ROW constructor is so similar with STRUCT, would it be a good > idea > > to > > > consolidate the two syntax? Or have a clear distinction between? > > > > > > Whats the relationship going forward with DDL, for example > CALCITE-2045. > > > DDL seems more flexible in terms of defining the structure not just on > > > field names but also field types. Maybe @andrew can share more on the > use > > > cases on calcite on beam steaming integration? > > > > > > Thanks, > > > Rong > > > > > > > > > On Thu, Apr 5, 2018, 10:05 AM Andrew Pilloud > <apill...@google.com.invalid > > > > > > wrote: > > > > > >> As a user of Calcite working on adding streaming SQL to Apache Beam > this > > >> sounds like a fantastic proposal. Our initial goal is to be able to > run > > SQL > > >> queries that transform arbitrary JSON objects. Without this syntax > > objects > > >> must be flattened when they pass through the transform. Is this > > something > > >> that might make it into 1.17? > > >> > > >> We have also had some discussion about adding DDL to Beam so a user > can > > >> describe the schema of a stream of JSON in pure SQL. Our current > though > > is > > >> to use Big Query compatible STRUCT and ARRAY syntax. Big Query is a > > popular > > >> sink for our users. Syntax compatible with Big Query would be a big > plus > > >> for us. > > >> > > >> Andrew > > >> > > >>> On Thu, Apr 5, 2018 at 12:43 AM Shuyi Chen <suez1...@gmail.com> > wrote: > > >>> > > >>> @Michael, @Albert, yes, I dont think it is SQL standard. But I think > > it's > > >>> very useful in the context of streaming SQL, e.g. Flink SQL, where > the > > >>> sinks can be a database or endpoints with defined protobuf/thrift > > schema. > > >>> They usually have complex structure. Supporting complex structure in > > SQL > > >>> output will make it much easier to write to different sinks with > > >> predefined > > >>> schemas in a unified way, > > >>> > > >>> @julian, that's great suggestion, I think instead of extending the > ROW > > >>> constructor, which is not SQL standard, adding a new extension might > be > > >> the > > >>> right way to go. Looking at the STRUCT big query syntax, we can > > implement > > >>> something like the following: > > >>> > > >>> SELECT STRUCT(a as first_name, b as last_name, STRUCT(c as zip code, > d > > as > > >>> street, e as state) as address) as record FROM example_table > > >>> > > >>> On Wed, Apr 4, 2018 at 5:51 PM, Julian Hyde <jhyde.apa...@gmail.com> > > >>> wrote: > > >>> > > >>>> If I recall correctly, Google BigQuery has SELECT AS STRUCT. It’s > not > > >>>> standard, but if it does what you need we could consider adopting > that > > >>>> syntax. > > >>>> > > >>>> Julian > > >>>> > > >>>>> On Apr 4, 2018, at 10:23 AM, Albert <zinki...@gmail.com> wrote: > > >>>>> > > >>>>> if it is not SQL standard, it's just a matter of categorizing it to > > >>> some > > >>>>> dialect ? > > >>>>> > > >>>>>> On Wed, Apr 4, 2018 at 10:19 AM, Michael Mior <mm...@uwaterloo.ca > > > > >>>> wrote: > > >>>>>> > > >>>>>> Apologies for my silence. I don't really have thoughts on the > matter > > >>> at > > >>>>>> this point. It might be helpful if you can give an example of what > > >>>> you're > > >>>>>> proposing. Unless I'm missing something (very possible), it's not > > >> part > > >>>> of > > >>>>>> the SQL standard. > > >>>>>> > > >>>>>> -- > > >>>>>> Michael Mior > > >>>>>> mm...@apache.org > > >>>>>> > > >>>>>> 2018-04-03 18:48 GMT-04:00 Shuyi Chen <suez1...@gmail.com>: > > >>>>>> > > >>>>>>> Friendly ping, any thoughts? Much appreciated. > > >>>>>>> > > >>>>>>> Shuyi > > >>>>>>> > > >>>>>>>> On Tue, Mar 27, 2018 at 11:59 PM, Shuyi Chen < > suez1...@gmail.com> > > >>>> wrote: > > >>>>>>>> > > >>>>>>>> Hi community, > > >>>>>>>> > > >>>>>>>> I am thinking of adding the following support in Calcite to > > >> support > > >>>>>> named > > >>>>>>>> row construction, e.g. > > >>>>>>>> > > >>>>>>>> SELECT (a as first_name, b as last_name, (c as zip code, d as > > >>> street, > > >>>> e > > >>>>>>> as > > >>>>>>>> state) as address) as record FROM example_table > > >>>>>>>> > > >>>>>>>> The output will be struct with field names specified in the SQL. > > >> The > > >>>>>>> usage > > >>>>>>>> scenario is that say, in streaming SQL, the downstream sink's > > >> schema > > >>>>>> can > > >>>>>>>> not be changed, so we will need to use SQL to construct a struct > > >>> with > > >>>>>> the > > >>>>>>>> proper naming according to the schema in order to write to the > > >>>>>> downstream > > >>>>>>>> sinks. Thanks a lot. > > >>>>>>>> > > >>>>>>>> Shuyi > > >>>>>>>> > > >>>>>>>> -- > > >>>>>>>> "So you have to trust that the dots will somehow connect in your > > >>>>>> future." > > >>>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> -- > > >>>>>>> "So you have to trust that the dots will somehow connect in your > > >>>> future." > > >>>>>>> > > >>>>>> > > >>>>> > > >>>>> > > >>>>> > > >>>>> -- > > >>>>> ~~~~~~~~~~~~~~~ > > >>>>> no mistakes > > >>>>> ~~~~~~~~~~~~~~~~~~ > > >>>> > > >>> > > >>> > > >>> > > >>> -- > > >>> "So you have to trust that the dots will somehow connect in your > > future." > > >>> > > >> > > > > > > -- > "So you have to trust that the dots will somehow connect in your future." >