Beam and HerdDB are now on “powered by”: https://calcite.apache.org/docs/powered_by.html
> On Apr 12, 2018, at 5:33 AM, Michael Mior <[email protected]> wrote:
>
> FYI, I talked to Julian earlier this week and he will be adding Beam to the
> powered by page since he has a doc for generating the image with the logos.
>
> --
> Michael Mior
> [email protected]
>
> 2018-04-12 4:06 GMT-04:00 Shuyi Chen <[email protected]>:
>
>> @Andrew, great to see BEAM is also using Calcite streaming SQL. Maybe you
>> can help adding an entry in the Calcite powered_by page
>> <https://calcite.apache.org/docs/powered_by.html> for BEAM by editing
>> site/_docs/powered_by.md
>> <https://github.com/apache/calcite/pull/657#diff-1aa810bc92051b46555ee4caf24bd6c3>.
>> Also, can you explain a bit more on how you plan to use streaming SQL to
>> transform arbitrary JSON objects with and w/o the AS STRUCT syntax?
>> For adding DDL in BEAM, please take a look at the server module and the
>> TYPE DDL (https://issues.apache.org/jira/browse/CALCITE-2045) I am adding.
>> Let me know if you have any comments and need any help.
>>
>> @Rong, I think it's better to conform with the ROW SQL standard, and add
>> new grammar to handle named struct construction.
>> This should work with the TYPE DDL, and we should be able to CAST the
>> created STRUCT as a custom type defined using DDL.
>>
>> @Julian, thanks for the suggestions. I think the AS STRUCT addition should
>> do it.
>>
>> Shuyi
>>
>> On Mon, Apr 9, 2018 at 4:36 AM, Julian Hyde <[email protected]> wrote:
>>
>>> For what it’s worth, ROW is standard SQL. If it does what you need, we
>>> should use it.
>>>
>>> Reading your case quickly, I perceived that you needed a concise way to
>>> assign field names, and AS STRUCT seemed to do that.
>>>
>>> But staying within the standard is always preferred. BigQuery isn’t
>>> always good at that.
>>>
>>> Julian
>>>
>>>> On Apr 5, 2018, at 10:18, Rong Rong <[email protected]> wrote:
>>>>
>>>> Thanks for the fantastic proposal @shuyi. I think the STRUCT idea is
>>>> great considering ROW is not standard SQL either. As a user of calcite
>>>> I have a couple questions.
>>>>
>>>> Since ROW constructor is so similar with STRUCT, would it be a good
>>>> idea to consolidate the two syntax? Or have a clear distinction
>>>> between?
>>>>
>>>> Whats the relationship going forward with DDL, for example
>>>> CALCITE-2045. DDL seems more flexible in terms of defining the
>>>> structure not just on field names but also field types. Maybe @andrew
>>>> can share more on the use cases on calcite on beam steaming
>>>> integration?
>>>>
>>>> Thanks,
>>>> Rong
>>>>
>>>> On Thu, Apr 5, 2018, 10:05 AM Andrew Pilloud <[email protected]>
>>>> wrote:
>>>>
>>>>> As a user of Calcite working on adding streaming SQL to Apache Beam
>>>>> this sounds like a fantastic proposal. Our initial goal is to be able
>>>>> to run SQL queries that transform arbitrary JSON objects. Without this
>>>>> syntax objects must be flattened when they pass through the transform.
>>>>> Is this something that might make it into 1.17?
>>>>>
>>>>> We have also had some discussion about adding DDL to Beam so a user
>>>>> can describe the schema of a stream of JSON in pure SQL. Our current
>>>>> though is to use Big Query compatible STRUCT and ARRAY syntax. Big
>>>>> Query is a popular sink for our users. Syntax compatible with Big
>>>>> Query would be a big plus for us.
>>>>>
>>>>> Andrew
>>>>>
>>>>>> On Thu, Apr 5, 2018 at 12:43 AM Shuyi Chen <[email protected]> wrote:
>>>>>>
>>>>>> @Michael, @Albert, yes, I dont think it is SQL standard. But I think
>>>>>> it's very useful in the context of streaming SQL, e.g. Flink SQL,
>>>>>> where the sinks can be a database or endpoints with defined
>>>>>> protobuf/thrift schema. They usually have complex structure.
>>>>>> Supporting complex structure in SQL output will make it much easier
>>>>>> to write to different sinks with predefined schemas in a unified way,
>>>>>>
>>>>>> @julian, that's great suggestion, I think instead of extending the
>>>>>> ROW constructor, which is not SQL standard, adding a new extension
>>>>>> might be the right way to go. Looking at the STRUCT big query syntax,
>>>>>> we can implement something like the following:
>>>>>>
>>>>>> SELECT STRUCT(a as first_name, b as last_name, STRUCT(c as zip code,
>>>>>> d as street, e as state) as address) as record FROM example_table
>>>>>>
>>>>>> On Wed, Apr 4, 2018 at 5:51 PM, Julian Hyde <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> If I recall correctly, Google BigQuery has SELECT AS STRUCT. It’s
>>>>>>> not standard, but if it does what you need we could consider
>>>>>>> adopting that syntax.
>>>>>>>
>>>>>>> Julian
>>>>>>>
>>>>>>>> On Apr 4, 2018, at 10:23 AM, Albert <[email protected]> wrote:
>>>>>>>>
>>>>>>>> if it is not SQL standard, it's just a matter of categorizing it to
>>>>>>>> some dialect ?
>>>>>>>>
>>>>>>>>> On Wed, Apr 4, 2018 at 10:19 AM, Michael Mior <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Apologies for my silence. I don't really have thoughts on the
>>>>>>>>> matter at this point. It might be helpful if you can give an
>>>>>>>>> example of what you're proposing. Unless I'm missing something
>>>>>>>>> (very possible), it's not part of the SQL standard.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Michael Mior
>>>>>>>>> [email protected]
>>>>>>>>>
>>>>>>>>> 2018-04-03 18:48 GMT-04:00 Shuyi Chen <[email protected]>:
>>>>>>>>>
>>>>>>>>>> Friendly ping, any thoughts? Much appreciated.
>>>>>>>>>>
>>>>>>>>>> Shuyi
>>>>>>>>>>
>>>>>>>>>>> On Tue, Mar 27, 2018 at 11:59 PM, Shuyi Chen <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi community,
>>>>>>>>>>>
>>>>>>>>>>> I am thinking of adding the following support in Calcite to
>>>>>>>>>>> support named row construction, e.g.
>>>>>>>>>>>
>>>>>>>>>>> SELECT (a as first_name, b as last_name, (c as zip code, d as
>>>>>>>>>>> street, e as state) as address) as record FROM example_table
>>>>>>>>>>>
>>>>>>>>>>> The output will be struct with field names specified in the SQL.
>>>>>>>>>>> The usage scenario is that say, in streaming SQL, the downstream
>>>>>>>>>>> sink's schema can not be changed, so we will need to use SQL to
>>>>>>>>>>> construct a struct with the proper naming according to the
>>>>>>>>>>> schema in order to write to the downstream sinks. Thanks a lot.
>>>>>>>>>>>
>>>>>>>>>>> Shuyi
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> "So you have to trust that the dots will somehow connect in your
>>>>>>>>>>> future."
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> "So you have to trust that the dots will somehow connect in your
>>>>>>>>>> future."
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> ~~~~~~~~~~~~~~~
>>>>>>>> no mistakes
>>>>>>>> ~~~~~~~~~~~~~~~~~~
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> "So you have to trust that the dots will somehow connect in your
>>>>>> future."
>>>>>>
>>>>>
>>>
>>
>> --
>> "So you have to trust that the dots will somehow connect in your future."
>>
