Re: Supporting named row construction in Calcite SQL

Michael Mior Thu, 12 Apr 2018 05:33:59 -0700

FYI, I talked to Julian earlier this week and he will be adding Beam to the
powered by page since he has a doc for generating the image with the logos.


--
Michael Mior
mm...@apache.org

2018-04-12 4:06 GMT-04:00 Shuyi Chen <suez1...@gmail.com>:

> @Andrew, great to see Beam is also using Calcite streaming SQL. Maybe you
> can help add an entry for Beam to the Calcite powered_by page
> <https://calcite.apache.org/docs/powered_by.html> by editing
> site/_docs/powered_by.md
> <https://github.com/apache/calcite/pull/657#diff-1aa810bc92051b46555ee4caf24bd6c3>.
> Also, can you explain a bit more about how you plan to use streaming SQL
> to transform arbitrary JSON objects with and without the AS STRUCT syntax?
> For adding DDL in Beam, please take a look at the server module and the
> TYPE DDL (https://issues.apache.org/jira/browse/CALCITE-2045) I am adding.
> Let me know if you have any comments or need any help.
>
> @Rong, I think it's better to conform with the ROW SQL standard, and add
> new grammar to handle named struct construction.
> This should work with the TYPE DDL, and we should be able to CAST the
> created STRUCT as a custom type defined using DDL.
>
> @Julian, thanks for the suggestions. I think the AS STRUCT addition should
> do it.
>
> Shuyi
>
> On Mon, Apr 9, 2018 at 4:36 AM, Julian Hyde <jhyde.apa...@gmail.com>
> wrote:
>
> > For what it’s worth, ROW is standard SQL. If it does what you need, we
> > should use it.
> >
> > Reading your case quickly, I perceived that you needed a concise way to
> > assign field names, and AS STRUCT seemed to do that.
> >
> > But staying within the standard is always preferred. BigQuery isn’t always
> > good at that.
> >
> > Julian
> >
> > > On Apr 5, 2018, at 10:18, Rong Rong <walter...@gmail.com> wrote:
> > >
> > > Thanks for the fantastic proposal @shuyi. I think the STRUCT idea is
> > > great considering ROW is not standard SQL either. As a user of Calcite
> > > I have a couple of questions.
> > >
> > > Since the ROW constructor is so similar to STRUCT, would it be a good
> > > idea to consolidate the two syntaxes, or to draw a clear distinction
> > > between them?
> > >
> > > What's the relationship going forward with DDL, for example
> > > CALCITE-2045? DDL seems more flexible in that it defines not just the
> > > field names but also the field types. Maybe @andrew can share more about
> > > the use cases for the Calcite-on-Beam streaming integration?
> > >
> > > Thanks,
> > > Rong
> > >
> > >
> > > On Thu, Apr 5, 2018, 10:05 AM Andrew Pilloud
> > > <apill...@google.com.invalid> wrote:
> > >
> > >> As a user of Calcite working on adding streaming SQL to Apache Beam,
> > >> this sounds like a fantastic proposal. Our initial goal is to be able
> > >> to run SQL queries that transform arbitrary JSON objects. Without this
> > >> syntax, objects must be flattened when they pass through the transform.
> > >> Is this something that might make it into 1.17?
> > >>
> > >> We have also had some discussion about adding DDL to Beam so a user can
> > >> describe the schema of a stream of JSON in pure SQL. Our current thought
> > >> is to use BigQuery-compatible STRUCT and ARRAY syntax. BigQuery is a
> > >> popular sink for our users. Syntax compatible with BigQuery would be a
> > >> big plus for us.
> > >>
> > >> Andrew
> > >>
> > >>> On Thu, Apr 5, 2018 at 12:43 AM Shuyi Chen <suez1...@gmail.com> wrote:
> > >>>
> > >>> @Michael, @Albert, yes, I don't think it is in the SQL standard. But I
> > >>> think it's very useful in the context of streaming SQL, e.g. Flink SQL,
> > >>> where the sinks can be a database or endpoints with a defined
> > >>> protobuf/thrift schema. They usually have complex structure. Supporting
> > >>> complex structure in SQL output will make it much easier to write to
> > >>> different sinks with predefined schemas in a unified way.
> > >>>
> > >>> @julian, that's a great suggestion. I think instead of extending the
> > >>> ROW constructor, which is not SQL standard, adding a new extension
> > >>> might be the right way to go. Looking at the BigQuery STRUCT syntax, we
> > >>> can implement something like the following:
> > >>>
> > >>> SELECT STRUCT(a as first_name, b as last_name,
> > >>>               STRUCT(c as zip_code, d as street, e as state) as address) as record
> > >>> FROM example_table
> > >>>
> > >>> On Wed, Apr 4, 2018 at 5:51 PM, Julian Hyde <jhyde.apa...@gmail.com>
> > >>> wrote:
> > >>>
> > >>>> If I recall correctly, Google BigQuery has SELECT AS STRUCT. It’s not
> > >>>> standard, but if it does what you need we could consider adopting
> > >>>> that syntax.
> > >>>>
> > >>>> Julian
> > >>>>
> > >>>>> On Apr 4, 2018, at 10:23 AM, Albert <zinki...@gmail.com> wrote:
> > >>>>>
> > >>>>> If it is not in the SQL standard, is it just a matter of categorizing
> > >>>>> it under some dialect?
> > >>>>>
> > >>>>>> On Wed, Apr 4, 2018 at 10:19 AM, Michael Mior <mm...@uwaterloo.ca>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>> Apologies for my silence. I don't really have thoughts on the matter
> > >>>>>> at this point. It might be helpful if you can give an example of
> > >>>>>> what you're proposing. Unless I'm missing something (very possible),
> > >>>>>> it's not part of the SQL standard.
> > >>>>>>
> > >>>>>> --
> > >>>>>> Michael Mior
> > >>>>>> mm...@apache.org
> > >>>>>>
> > >>>>>> 2018-04-03 18:48 GMT-04:00 Shuyi Chen <suez1...@gmail.com>:
> > >>>>>>
> > >>>>>>> Friendly ping, any thoughts? Much appreciated.
> > >>>>>>>
> > >>>>>>> Shuyi
> > >>>>>>>
> > >>>>>>>> On Tue, Mar 27, 2018 at 11:59 PM, Shuyi Chen <suez1...@gmail.com>
> > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>> Hi community,
> > >>>>>>>>
> > >>>>>>>> I am thinking of adding support for named row construction to
> > >>>>>>>> Calcite, e.g.
> > >>>>>>>>
> > >>>>>>>> SELECT (a as first_name, b as last_name,
> > >>>>>>>>         (c as zip_code, d as street, e as state) as address) as record
> > >>>>>>>> FROM example_table
> > >>>>>>>>
> > >>>>>>>> The output will be a struct with the field names specified in the
> > >>>>>>>> SQL. The usage scenario is that, say, in streaming SQL, the
> > >>>>>>>> downstream sink's schema cannot be changed, so we will need to use
> > >>>>>>>> SQL to construct a struct with the proper naming according to the
> > >>>>>>>> schema in order to write to the downstream sinks. Thanks a lot.
> > >>>>>>>>
> > >>>>>>>> Shuyi
> > >>>>>>>>
> > >>>>>>>> --
> > >>>>>>>> "So you have to trust that the dots will somehow connect in your
> > >>>>>>>> future."
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> --
> > >>>>>>> "So you have to trust that the dots will somehow connect in your
> > >>>>>>> future."
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> --
> > >>>>> ~~~~~~~~~~~~~~~
> > >>>>> no mistakes
> > >>>>> ~~~~~~~~~~~~~~~~~~
> > >>>>
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> "So you have to trust that the dots will somehow connect in your
> > >>> future."
> > >>>
> > >>
> >
>
>
>
> --
> "So you have to trust that the dots will somehow connect in your future."
>
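
For readers following the thread: the named-struct construction discussed above takes flat columns (a, b, c, d, e) and produces a nested record whose field names come from the query. The sketch below only illustrates that intended output shape in plain Python. It is not Calcite, Beam, or BigQuery code; build_struct, RECORD_SPEC, and the sample values are hypothetical names made up for this illustration, and zip_code is written with an underscore on the assumption that a single identifier was meant.

```python
from typing import Any, Mapping, Union

# A spec maps each output field name to either a source column name (str)
# or a nested spec (dict), loosely mirroring
# STRUCT(col AS name, STRUCT(...) AS name) from the thread.
Spec = Mapping[str, Union[str, "Spec"]]


def build_struct(row: Mapping[str, Any], spec: Spec) -> dict:
    """Build a nested, named record from a flat row (illustrative only)."""
    out = {}
    for field_name, source in spec.items():
        if isinstance(source, str):
            # Leaf: copy the flat column under its new field name.
            out[field_name] = row[source]
        else:
            # Nested struct: recurse using the same flat row.
            out[field_name] = build_struct(row, source)
    return out


# Hypothetical spec corresponding to the example query in the thread.
RECORD_SPEC = {
    "first_name": "a",
    "last_name": "b",
    "address": {"zip_code": "c", "street": "d", "state": "e"},
}

# Made-up sample row standing in for one row of example_table.
row = {"a": "Ada", "b": "Lovelace", "c": "12345", "d": "1 Main St", "e": "CA"}
record = build_struct(row, RECORD_SPEC)
```

Without such a construct, the five columns would reach the sink as a flat row, which is the flattening problem Andrew describes.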
