Thanks for all your discussions, Timo, Jark Wu, Lin Li and Fabian.
I have filed a new design doc in [1], reorganized from the MVP doc [2] and
the initial doc [3], both proposed by Shuyi Chen.
The main difference is that I extend the CREATE TABLE DDL to support complex
SQL types (array, map and
> ... the DDL Draft doc later [1].
>
> 5. Schema declaration
> "if users want to declare computed
... other aspects:

7. Hive compatibility. Since Flink SQL will soon be able to operate on
Hive metadata and data, it's an add-on benefit if we can be compatible
with Hive syntax/semantics while following the ANSI standard. At least we
should be as close as possible. Hive DDL can be found at
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
> ... upfront if they can use an INSERT INTO here or not.
>
> 6. Partitioning and keys: @Lin: I would like to include this in the
> design given that Hive integration and Kafka ... match but this is true
> for both directions: table schema "derives" format schema and format
> schema "derives" table schema.
>
> 7. Hive compatibility: @Xuefu: I agree that Hive is popular but we
> should
> ... what does the syntax for marking look like? Also in case of timestamps
> that are nested in the schema?
>
> 4b. How can we write out a timestamp into the message header?
> I agree to simply ignore computed columns when writing out. This
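As a hypothetical illustration of the point above (ignoring computed columns on write): given a table with a virtual column, a write would only target the physical fields. The table and column names are made up, and the syntax follows the draft being discussed in this thread, not a finalized design.

```sql
-- Hypothetical: 'clicks' has physical columns (user_id, ts_string) plus
-- a computed/virtual column derived from ts_string. An INSERT INTO
-- supplies only the physical columns; the computed column is ignored
-- on write.
INSERT INTO clicks (user_id, ts_string)
SELECT user_id, ts_string FROM raw_clicks;
```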
t;>>> rowtime,
> > > >>>>> no twice defined. If it is not a Long/Timestamp, we use computed
> > > >> column
> > > >>>> to
> > > >>>>> get an expected timestamp column to be rowtime, is this what you
> > mean
>
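A hypothetical sketch of the idea above, assuming an illustrative TO_TIMESTAMP conversion function and made-up table/column names (the actual computed-column syntax was still under discussion in this thread):

```sql
-- Hypothetical: the raw field arrives as VARCHAR rather than
-- Long/Timestamp, so a computed (virtual) column derives a proper
-- timestamp to act as the rowtime column.
CREATE TABLE clicks (
  user_id BIGINT,
  ts_string VARCHAR,
  row_ts AS TO_TIMESTAMP(ts_string)  -- computed column used as rowtime
);
```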
> ... how can we write out a timestamp into the message header?
> That's a good point. I think a computed column is just a virtual
> column on the table which is only relevant to reading. If we want to
> write to a
> ... to the regular schema?
> Separating watermark into a special clause similar to PARTITIONED BY is
> also a good idea. Conceptually, it's fine to put watermark in schema
> 5. Schema declaration:
> I like the proposal to omit the schema if we can get the schema from
> external storage or some schema file. Actually, we have already
> encountered this requirement in our company.
>
> ... exes?view=sql-server-2017
> Best,
> Jark
>
> On Thu, 6 Dec 2018 at 12:09, Zhang, Xuefu wrote:
>
> > Hi Timo/Shuyi/Lin,
> >
> > Thanks for the discussions. It seems that we are converging to
> > something meaningful. Here are some of my thoughts:
> ... be okay if schema declaration is always needed. While there
> might be some duplication sometimes, it's not always true. For example,
> an external schema may not exactly match the Flink schema, for instance
> in data types. Even if so, a perfect match is not required. For
Hive DDL can be found at
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL

Thanks,
Xuefu

----------------------
Sender: Lin Li
Sent at: 2018 Dec 6 (Thu) 10:49
Recipient: dev
Subject: Re: [DISCUSS] Flink SQL DDL Design

Hi Timo and Shuyi,
thanks for your feedback.
1.
> ... other fields). The "AS" keyword defines the watermark strategy, such
> as BOUNDED WITH OFFSET (covers almost all the requirements) and
> ASCENDING.
> When the expected rowtime field does not exist in the schema,
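To make the two strategies above concrete, a hypothetical sketch (clause placement, offset units, and exact keywords were still under discussion in this thread; the column and watermark names are made up):

```sql
-- Hypothetical: bounded out-of-orderness, watermark trails the rowtime
-- column by a fixed offset.
WATERMARK wm FOR row_ts AS BOUNDED WITH OFFSET 5000

-- Hypothetical: strictly ascending rowtime, no out-of-order tolerance.
WATERMARK wm FOR row_ts AS ASCENDING
```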
----------------------
Sender: Timo Walther
Sent at: 2018 Nov 27 (Tue) 16:21
Recipient: dev
Subject: Re: [DISCUSS] Flink SQL DDL Design

Thanks for offering your help here, Xuefu. It would be great to move
these efforts forward. I agree that the DDL is somehow related to
> [ WATERMARK watermarkName FOR rowTimeColumn AS
>   withOffset(rowTimeColumn, offset) ] ) [ WITH ( tableOption [ ,
> ...
> | [ BOOLEAN ]
> | [ TINYINT ]
> | [ SMALLINT ]
> | [ INT ]
> | ...
> | [ DATE ]
> | [ TIME ]
> | [ TIMESTAMP ]
> | [ VARBINARY ]
> }
>
> computedColumnDefinition ::=
> tableIndex ::=
>   [ UNIQUE ] INDEX indexName
>   (columnName [, columnName]* )
>
> rowTimeColumn ::=
>   columnName
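Putting the grammar fragments above together, a hypothetical end-to-end example (the table name, columns, offset value, and WITH option key are all illustrative; this follows the draft grammar quoted in the thread, not a finalized syntax):

```sql
CREATE TABLE orders (
  order_id BIGINT,
  amount DOUBLE,
  ts TIMESTAMP,
  UNIQUE INDEX idx_order (order_id),
  WATERMARK wm FOR ts AS withOffset(ts, 5000)
) WITH (
  connector = 'kafka'
);
```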
> CREATE VIEW viewName
>   [ ( columnName [, columnName]* ) ]
>   AS queryStatement;
>
> CREATE FUNCTION functionName
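For illustration, hypothetical instances of the two statements above (all names are made up, and the AS 'className' shape for CREATE FUNCTION is an assumption based on the "fully qualified name" rule in the draft grammar):

```sql
-- Hypothetical view over a made-up 'orders' table.
CREATE VIEW big_orders (order_id, amount) AS
  SELECT order_id, amount FROM orders WHERE amount > 100;

-- Hypothetical UDF registration; the AS-clause form is an assumption.
CREATE FUNCTION my_udf AS 'com.example.udf.MyScalarFunction';
```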
> className ::=
>   fully qualified name

Shuyi Chen wrote on Wed, Nov 28, 2018 at 3:28 AM:

> Thanks a lot, Timo and Xuefu. Yes, I think we can finalize the design
> doc first and start implementation w/o th
wrote:

+1 Sounds great!

----------------------
Sender: Shuyi Chen
Sent at: 2018 Nov 29 (Thu) 06:56
Recipient: dev
Subject: Re: [DISCUSS] Flink SQL DDL Design

Thanks a lot, Shaoxuan, Jark and Lin. We should definitely collaborate
here; we also have our own DDL
> ... key-value pairs, so that it will make integration with Hive DDL (or
> others, e.g. Beam DDL) easier.
>
> I'll run a final pass over the design doc and finalize the design in the
> ... And we can start creating tasks and collaborate on the
> implementation. Thanks a lot for all the comments and inputs.
>
> Cheers!
> Shuyi
>
> On Tue, Nov 27, 2018 at 7:02 AM Zhang, Xuefu <xuef...@alibaba-inc.com>
> wrote:
> ... cked by connector API. We can leave the unknown out while defining
> the basic syntax.
>
> @Shuyi
>
> As commented in the doc, I think we can probably stick with simple
> syntax with general properties, without extending the syntax too much
> ... but we are definitely interested in moving this forward. I think
> once the unified connector API design [1] is done, we can finalize the
> DDL design as well and start creating concrete subtasks to collaborate
> on the
> ... then we can divide the tasks for better collaboration.
>
> Please let me know if there are any questions or suggestions.
>
> Thanks,
> Xuefu
>> We have some dedicated resource and would like to move this forward. We
>> can collaborate.
>>
>> Thanks,
>>
>> Xuefu
>>
>> ----------------------
>> Sender: wenlong.lwl
>> Date: 2018-11-05 11:15:35
>> Recipient:
>> Subject: Re: [DISCUSS] Flink SQL DDL Design
Hi Wenlong, thanks a lot for the comments.

1) I agree we can infer the table type from the queries if the Flink job
is static. However, for SQL Client cases, the query is ad hoc, dynamic,
and not known beforehand. In such cases, we might want to enforce the
table open mode at startup time, so users
Hi, Shuyi, thanks for the proposal.
I have two concerns about the table DDL:
1. How about removing the source/sink mark from the DDL? It is not
necessary: the framework can determine whether the referred table is a
source or a sink according to the context of the query using the table.
It will be more
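Wenlong's suggestion above can be illustrated with a hypothetical, unmarked table definition whose role is decided by the statement that uses it (all names and the WITH option are made up):

```sql
-- Hypothetical: no SOURCE/SINK keyword in the DDL.
CREATE TABLE t (id BIGINT, v DOUBLE) WITH (connector = 'kafka');

-- Used as a source when queried...
SELECT id, v FROM t;

-- ...and as a sink when written to.
INSERT INTO t SELECT id, v * 2 FROM other_table;
```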
+1. Thanks for putting the proposal together, Shuyi.

DDL has been brought up a couple of times previously [1, 2]. Utilizing
DDL will definitely be a great extension to the current Flink SQL to
systematically support some of the previously brought up features such as
[3]. And it will also be
Thanks Shuyi!
I left some comments there. I think the design of SQL DDL and Flink-Hive
integration/External catalog enhancements will work closely with each
other. Hope we are well aligned on the directions of the two designs, and I
look forward to working with you guys on both!
Bowen
On Thu,
Hi everyone,
SQL DDL support has been a long-time ask from the community. Currently
Flink SQL supports only DML (e.g. SELECT and INSERT statements). In its
current form, Flink SQL users still need to define/create table sources
and sinks programmatically in Java/Scala. Also, in SQL Client, without