Re: [DISCUSS] FLIP-66: Support time attribute in SQL DDL

2019-09-23 Thread Kurt Young
For a histogram-based watermark strategy, one possible solution is to keep
using the stateless scalar function interface and hold the stateful objects
directly in the function. By doing that we will lose some information after
the job gets restarted, but I think that might be acceptable because the
histogram-based approach is an approximate algorithm after all.

But I agree we will run into trouble if we want accurate watermark
computation logic. In that case, I would suggest creating a dedicated
upstream job that does the watermark calculation and saves the value into a
field. Then the current job can simply reference the calculated field and
specify it as this job's watermark.
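
A minimal sketch of that second suggestion, written in the DDL syntax
proposed in this thread. The table name, the column names and the
upstream_wm field are hypothetical. The upstream job is assumed to store its
computed watermark in upstream_wm for every row, and the released syntax may
differ from the proposal.

```sql
-- Sketch only: uses the WATERMARK clause as proposed in this thread.
-- All table and column names are hypothetical.
CREATE TABLE events_with_wm (
  user_id BIGINT,
  event_time TIMESTAMP(3),
  -- Watermark value computed and stored by the dedicated upstream job.
  upstream_wm TIMESTAMP(3),
  -- Reference the precomputed field as this job's watermark.
  WATERMARK FOR event_time AS upstream_wm
) WITH (
  -- connector properties elided, as in the proposal
  ...
)
```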

Best,
Kurt



Re: [DISCUSS] FLIP-66: Support time attribute in SQL DDL

2019-09-23 Thread Jark Wu
Hi,

Thanks Fabian for your reply. I agree with your point that the
histogram-based case needs the function to be stateful, which is not
supported currently or in this design.
Maybe we can support a stateful scalar function, like TableAggregateFunction.
We can discuss how to support this further in the future.
I added this limitation to the "Complex Watermark Strategies" section.

Btw, I also updated how the planner automatically applies the watermark
assigner, at the end of the "Implementation" section [1].
This avoids every TableSource having to extend DefinedProctimeAttribute to
carry time attribute information.

If there is no objection, I would like to update the cwiki FLIP page and
start a new voting process in the next few days.

Best,
Jark

[1]:
https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit#heading=h.qx7j56dotywd



Re: [DISCUSS] FLIP-66: Support time attribute in SQL DDL

2019-09-20 Thread Fabian Hueske
Hi Jark,

Thanks for the summary!
I like the proposal!

It makes it very clear that an event time attribute is an existing column
on which watermark metadata is defined whereas a processing time attribute
is a computed field.

I have one comment regarding the section on "Complex Watermark Strategies".
The proposal says that you can also use a scalar function.
I don't think that a "textbook" scalar function would be sufficient for
more advanced strategies.
For example, a histogram-based approach would need to remember the values of
the last x records.
The interface of a scalar function would still work for that, but it would
be a stateful function (which would not be OK for a scalar function).
I don't think it's a problem, but wanted to mention it here.

Best, Fabian


Re: [DISCUSS] FLIP-66: Support time attribute in SQL DDL

2019-09-19 Thread Jark Wu
Hi everyone,

Thanks all for the valuable suggestions and feedback so far.
Before starting the vote, I would like to summarize the proposed DDL syntax
on the mailing list.

## Rowtime Attribute (Watermark Syntax)

CREATE TABLE table_name (
  WATERMARK FOR rowtimeField AS watermarkStrategy
) WITH (
  ...
)

It marks an existing field rowtimeField as the rowtime attribute, and the
watermark is generated by the expression watermarkStrategy.
watermarkStrategy can be an arbitrary expression which returns a
nullable BIGINT or TIMESTAMP as the watermark value.

For common cases, users can use the following expressions to define a
strategy.
1. Bounded Out of Orderness, the strategy can be "rowtimeField - INTERVAL
'string' timeUnit".
2. Preserve Watermark From Source, the strategy can be "SYSTEM_WATERMARK()".
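
As a rough illustration of the two common strategies above, the following
sketch uses the syntax as proposed in this thread. The table names, column
names and the 5-second delay are hypothetical, and SYSTEM_WATERMARK() is the
function name used in this proposal, which may differ in a released version.

```sql
-- Sketch only: proposed syntax, hypothetical table and column names.

-- 1. Bounded out-of-orderness: the watermark lags the rowtime field by 5 seconds.
CREATE TABLE orders (
  order_id BIGINT,
  order_time TIMESTAMP(3),
  WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND
) WITH (
  ...
)

-- 2. Preserve the watermark already assigned by the source.
CREATE TABLE orders_from_source (
  order_id BIGINT,
  order_time TIMESTAMP(3),
  WATERMARK FOR order_time AS SYSTEM_WATERMARK()
) WITH (
  ...
)
```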

## Proctime Attribute

CREATE TABLE table_name (
  ...
  proc AS SYSTEM_PROCTIME()
) WITH (
  ...
)

It uses the computed column syntax to add an additional column with
proctime attribute. Here SYSTEM_PROCTIME() is a built-in function.
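
A small sketch of how such a proctime column might then be used in a window
query. It assumes the computed column syntax and the SYSTEM_PROCTIME()
function exactly as proposed in this thread. The table and column names are
hypothetical.

```sql
-- Sketch only: proposed syntax, hypothetical table and column names.
CREATE TABLE clicks (
  user_id BIGINT,
  page_id BIGINT,
  -- Computed column carrying the processing-time attribute.
  proc AS SYSTEM_PROCTIME()
) WITH (
  ...
)

-- Count clicks per user in one-minute processing-time tumbling windows.
SELECT
  user_id,
  TUMBLE_START(proc, INTERVAL '1' MINUTE) AS window_start,
  COUNT(*) AS click_count
FROM clicks
GROUP BY user_id, TUMBLE(proc, INTERVAL '1' MINUTE)
```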

For more details and the implementations, please refer to the design doc:
https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit?ts=5d822dba

Feel free to leave further feedback!

Thanks,
Jark



Re: [DISCUSS] FLIP-66: Support time attribute in SQL DDL

2019-09-18 Thread Kurt Young
+1 to start the voting process.

Best,
Kurt




Re: [DISCUSS] FLIP-66: Support time attribute in SQL DDL

2019-09-18 Thread Jark Wu
Hi everyone,

Thanks all for joining the discussion in the doc [1].
It seems that the discussion has converged and there is a consensus on the
current FLIP document.
If there is no objection, I would like to convert it into a cwiki FLIP page
and start the voting process.

For more details, please refer to the design doc (it has changed slightly
since the initial proposal).

Thanks,
Jark

[1]:
https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit?ts=5d8258cd



Re: [DISCUSS] FLIP-66: Support time attribute in SQL DDL

2019-09-16 Thread Kurt Young
After some review and discussion in the Google document, I think it's time
to convert this design to a cwiki FLIP page and start the voting process.

Best,
Kurt




Re: [DISCUSS] FLIP-66: Support time attribute in SQL DDL

2019-09-09 Thread Jark Wu
Hi all,

Thanks for all the feedback received in the doc so far.
I saw a general agreement on using computed columns to support the proctime
attribute and to extract timestamps.
So we will prepare a computed column FLIP and share it on the dev ML soon.

Feel free to leave more comments!

Best,
Jark





Re: [DISCUSS] FLIP-66: Support time attribute in SQL DDL

2019-09-05 Thread Dian Fu
Hi Jark,

Thanks for bringing up this discussion and the detailed design doc. This is 
definitely a critical feature for streaming SQL jobs. I have left a few 
comments in the design doc.

Thanks,
Dian

> On Sep 6, 2019, at 11:48 AM, Forward Xu wrote:
> 
> Thanks Jark for this topic. This will be very useful.
> 
> 
> Best,
> 
> ForwardXu
> 
> Danny Chan  于2019年9月6日周五 上午11:26写道:
> 
>> Thanks Jark for bring up this topic, this is definitely an import feature
>> for the SQL, especially the DDL users.
>> 
>> I would spend some time to review this design doc, really thanks.
>> 
>> Best,
>> Danny Chan
>> 在 2019年9月6日 +0800 AM11:19,Jark Wu ,写道:
>>> Hi everyone,
>>> 
>>> I would like to start a discussion about how to support time attributes
>>> in SQL DDL.
>>> In Flink 1.9, we already introduced a basic SQL DDL to create tables.
>>> However, it doesn't support defining time attributes, so users can't
>>> apply window operations on tables created by DDL, which is a bad
>>> experience.
>>>
>>> In FLIP-66, we propose a watermark syntax to define the rowtime
>>> attribute, and propose using computed column syntax to define the
>>> proctime attribute.
>>> However, computed columns are another big topic and deserve a separate
>>> FLIP.
>>> If we reach consensus on the computed column approach, we will start
>>> the computed column FLIP soon.
>>>
>>> FLIP-66:
>>> https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit#
>>> 
>>> Thanks for any feedback!
>>> 
>>> Best,
>>> Jark
>> 



Re: [DISCUSS] FLIP-66: Support time attribute in SQL DDL

2019-09-05 Thread Forward Xu
Thanks Jark for this topic. This will be very useful.


Best,

ForwardXu

On Fri, Sep 6, 2019 at 11:26 AM Danny Chan wrote:

> Thanks Jark for bringing up this topic. This is definitely an important
> feature for SQL, especially for DDL users.
>
> I will spend some time reviewing the design doc. Many thanks.
>
> Best,
> Danny Chan
> On Sep 6, 2019, 11:19 AM +0800, Jark Wu wrote:
> > Hi everyone,
> >
> > I would like to start a discussion about how to support time attributes
> > in SQL DDL.
> > In Flink 1.9, we already introduced a basic SQL DDL to create tables.
> > However, it doesn't support defining time attributes, so users can't
> > apply window operations on tables created by DDL, which is a bad
> > experience.
> >
> > In FLIP-66, we propose a watermark syntax to define the rowtime
> > attribute, and propose using computed column syntax to define the
> > proctime attribute.
> > However, computed columns are another big topic and deserve a separate
> > FLIP.
> > If we reach consensus on the computed column approach, we will start
> > the computed column FLIP soon.
> >
> > FLIP-66:
> > https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit#
> >
> > Thanks for any feedback!
> >
> > Best,
> > Jark
>


Re: [DISCUSS] FLIP-66: Support time attribute in SQL DDL

2019-09-05 Thread Danny Chan
Thanks Jark for bringing up this topic. This is definitely an important
feature for SQL, especially for DDL users.

I will spend some time reviewing the design doc. Many thanks.

Best,
Danny Chan
On Sep 6, 2019, 11:19 AM +0800, Jark Wu wrote:
> Hi everyone,
>
> I would like to start a discussion about how to support time attributes in
> SQL DDL.
> In Flink 1.9, we already introduced a basic SQL DDL to create tables.
> However, it doesn't support defining time attributes, so users can't
> apply window operations on tables created by DDL, which is a bad
> experience.
>
> In FLIP-66, we propose a watermark syntax to define the rowtime attribute,
> and propose using computed column syntax to define the proctime attribute.
> However, computed columns are another big topic and deserve a separate
> FLIP.
> If we reach consensus on the computed column approach, we will start
> the computed column FLIP soon.
>
> FLIP-66:
> https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit#
>
> Thanks for any feedback!
>
> Best,
> Jark


[DISCUSS] FLIP-66: Support time attribute in SQL DDL

2019-09-05 Thread Jark Wu
Hi everyone,

I would like to start a discussion about how to support time attributes in
SQL DDL.
In Flink 1.9, we already introduced a basic SQL DDL to create tables.
However, it doesn't support defining time attributes, so users can't
apply window operations on tables created by DDL, which is a bad
experience.

In FLIP-66, we propose a watermark syntax to define the rowtime attribute,
and propose using computed column syntax to define the proctime attribute.
However, computed columns are another big topic and deserve a separate
FLIP.
If we reach consensus on the computed column approach, we will start
the computed column FLIP soon.

FLIP-66:
https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit#

Thanks for any feedback!

Best,
Jark