Re: Supporting named row construction in Calcite SQL

Julian Hyde Mon, 09 Apr 2018 04:37:12 -0700

For what it’s worth, ROW is standard SQL. If it does what you need, we should 
use it.


Reading your case quickly, I perceived that you needed a concise way to assign 
field names, and AS STRUCT seemed to do that.

But staying within the standard is always preferred. BigQuery isn’t always good 
at that. 

Julian

> On Apr 5, 2018, at 10:18, Rong Rong <[email protected]> wrote:
> 
> Thanks for the fantastic proposal @shuyi. I think the STRUCT idea is great
> considering ROW is not standard SQL either. As a user of calcite I have a
> couple questions.
> 
> Since ROW constructor is so similar with STRUCT, would it be a good idea to
> consolidate the two syntax? Or have a clear distinction between?
> 
> Whats the relationship going forward with DDL, for example CALCITE-2045.
> DDL seems more flexible in terms of defining the structure not just on
> field names but also field types. Maybe @andrew can share more on the use
> cases on calcite on beam steaming integration?
> 
> Thanks,
> Rong
> 
> 
> On Thu, Apr 5, 2018, 10:05 AM Andrew Pilloud <[email protected]>
> wrote:
> 
>> As a user of Calcite working on adding streaming SQL to Apache Beam this
>> sounds like a fantastic proposal. Our initial goal is to be able to run SQL
>> queries that transform arbitrary JSON objects. Without this syntax objects
>> must be flattened when they pass through the transform. Is this something
>> that might make it into 1.17?
>> 
>> We have also had some discussion about adding DDL to Beam so a user can
>> describe the schema of a stream of JSON in pure SQL. Our current though is
>> to use Big Query compatible STRUCT and ARRAY syntax. Big Query is a popular
>> sink for our users. Syntax compatible with Big Query would be a big plus
>> for us.
>> 
>> Andrew
>> 
>>> On Thu, Apr 5, 2018 at 12:43 AM Shuyi Chen <[email protected]> wrote:
>>> 
>>> @Michael, @Albert, yes, I dont think it is SQL standard. But I think it's
>>> very useful in the context of streaming SQL, e.g. Flink SQL, where the
>>> sinks can be a database or endpoints with defined protobuf/thrift schema.
>>> They usually have complex structure. Supporting complex structure in SQL
>>> output will make it much easier to write to different sinks with
>> predefined
>>> schemas in a unified way,
>>> 
>>> @julian, that's great suggestion, I think instead of extending the ROW
>>> constructor, which is not SQL standard, adding a new extension might be
>> the
>>> right way to go. Looking at the STRUCT big query syntax, we can implement
>>> something like the following:
>>> 
>>> SELECT STRUCT(a as first_name, b as last_name, STRUCT(c as zip code, d as
>>> street, e as state) as address) as record FROM example_table
>>> 
>>> On Wed, Apr 4, 2018 at 5:51 PM, Julian Hyde <[email protected]>
>>> wrote:
>>> 
>>>> If I recall correctly, Google BigQuery has SELECT AS STRUCT. It’s not
>>>> standard, but if it does what you need we could consider adopting that
>>>> syntax.
>>>> 
>>>> Julian
>>>> 
>>>>> On Apr 4, 2018, at 10:23 AM, Albert <[email protected]> wrote:
>>>>> 
>>>>> if it is not SQL standard, it's just a matter of categorizing it to
>>> some
>>>>> dialect ?
>>>>> 
>>>>>> On Wed, Apr 4, 2018 at 10:19 AM, Michael Mior <[email protected]>
>>>> wrote:
>>>>>> 
>>>>>> Apologies for my silence. I don't really have thoughts on the matter
>>> at
>>>>>> this point. It might be helpful if you can give an example of what
>>>> you're
>>>>>> proposing. Unless I'm missing something (very possible), it's not
>> part
>>>> of
>>>>>> the SQL standard.
>>>>>> 
>>>>>> --
>>>>>> Michael Mior
>>>>>> [email protected]
>>>>>> 
>>>>>> 2018-04-03 18:48 GMT-04:00 Shuyi Chen <[email protected]>:
>>>>>> 
>>>>>>> Friendly ping, any thoughts? Much appreciated.
>>>>>>> 
>>>>>>> Shuyi
>>>>>>> 
>>>>>>>> On Tue, Mar 27, 2018 at 11:59 PM, Shuyi Chen <[email protected]>
>>>> wrote:
>>>>>>>> 
>>>>>>>> Hi community,
>>>>>>>> 
>>>>>>>> I am thinking of adding the following support in Calcite to
>> support
>>>>>> named
>>>>>>>> row construction, e.g.
>>>>>>>> 
>>>>>>>> SELECT (a as first_name, b as last_name, (c as zip code, d as
>>> street,
>>>> e
>>>>>>> as
>>>>>>>> state) as address) as record FROM example_table
>>>>>>>> 
>>>>>>>> The output will be struct with field names specified in the SQL.
>> The
>>>>>>> usage
>>>>>>>> scenario is that say, in streaming SQL, the downstream sink's
>> schema
>>>>>> can
>>>>>>>> not be changed, so we will need to use SQL to construct a struct
>>> with
>>>>>> the
>>>>>>>> proper naming according to the schema in order to write to the
>>>>>> downstream
>>>>>>>> sinks. Thanks a lot.
>>>>>>>> 
>>>>>>>> Shuyi
>>>>>>>> 
>>>>>>>> --
>>>>>>>> "So you have to trust that the dots will somehow connect in your
>>>>>> future."
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> "So you have to trust that the dots will somehow connect in your
>>>> future."
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> ~~~~~~~~~~~~~~~
>>>>> no mistakes
>>>>> ~~~~~~~~~~~~~~~~~~
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> "So you have to trust that the dots will somehow connect in your future."
>>> 
>>

Re: Supporting named row construction in Calcite SQL

Reply via email to