Hi all,

I'm on vacation for about five days , sorry to have missed this great FLIP.

Yes, the non-windowed aggregates is a special case of row-window. And the 
proposal looks really good.  Can we have a simplified form for the special 
case? Such as : 
table.groupBy(‘a).rowWindow(SlideRows.unboundedPreceding).select(…)  can be 
simplified to  table.groupBy(‘a).select(…). The latter will actually call the 
former. 

Another question is about the rowtime. As the FLIP said, DataStream and 
StreamTableSource is responsible to assign timestamps and watermarks, 
furthermore “rowtime” and “systemtime” are not real column. IMO, it is 
different with Calcite’s rowtime, which is a real column in the table. In 
FLIP's way, we will lose some flexibility. Because the timestamp column may be 
created after some transformations or join operation, not created at beginning. 
So why do we have to define rowtime at beginning? (because of watermark?)     
Can we have a way to define rowtime after source table like TimestampAssinger?

Regarding to “allowLateness” method. I come up a trick that we can make 
‘rowtime and ‘system to be a Scala object, not a symbol expression. The API 
will looks like this : 

window(Tumble over 10.minutes on rowtime allowLateness as ‘w) 

The implementation will look like this:

class TumblingWindow(size: Expression) extends Window {
  def on(time: rowtime.type): TumblingEventTimeWindow = 
      new TumblingEventTimeWindow(alias, ‘rowtime, size)        // has 
allowLateness() method

  def on(time: systemtime.type): TumblingProcessingTimeWindow= 
     new TumblingProcessingTimeWindow(alias, ‘systemtime, size)         // 
hasn’t allowLateness() method
}
object rowtime
object systemtime

What do you think about this?

- Jark Wu 

> 在 2016年9月6日,下午11:00,Timo Walther <twal...@apache.org> 写道:
> 
> Hi all,
> 
> I thought about the API of the FLIP again. If we allow the "systemtime" 
> attribute, we cannot implement a nice method chaining where the user can 
> define a "allowLateness" only on event time. So even if the user expressed 
> that "systemtime" is used we have to offer a "allowLateness" method because 
> we have to assume that this attribute can also be the batch event time 
> column, which is not very nice.
> 
> class TumblingWindow(size: Expression) extends Window {
>  def on(timeField: Expression): TumblingEventTimeWindow =
>    new TumblingEventTimeWindow(alias, timeField, size) // has allowLateness() 
> method
> }
> 
> What do you think?
> 
> Timo
> 
> 
> Am 05/09/16 um 10:41 schrieb Fabian Hueske:
>> Hi Jark,
>> 
>> you had asked for non-windowed aggregates in the Table API a few times.
>> FLIP-11 proposes row-window aggregates which are a generalization of
>> running aggregates (SlideRow unboundedPreceding).
>> 
>> Can you have a look at the FLIP and give feedback whether this is what you
>> are looking for?
>> Improvement suggestions are very welcome as well.
>> 
>> Thank you,
>> Fabian
>> 
>> 2016-09-01 16:12 GMT+02:00 Timo Walther <twal...@apache.org>:
>> 
>>> Hi all!
>>> 
>>> Fabian and I worked on a FLIP for Stream Aggregations in the Table API.
>>> You can find the FLIP-11 here:
>>> 
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%
>>> 3A+Table+API+Stream+Aggregations
>>> 
>>> Motivation for the FLIP:
>>> 
>>> The Table API is a declarative API to define queries on static and
>>> streaming tables. So far, only projection, selection, and union are
>>> supported operations on streaming tables.
>>> 
>>> This FLIP proposes to add support for different types of aggregations on
>>> top of streaming tables. In particular, we seek to support:
>>> 
>>> - Group-window aggregates, i.e., aggregates which are computed for a group
>>> of elements. A (time or row-count) window is required to bound the infinite
>>> input stream into a finite group.
>>> 
>>> - Row-window aggregates, i.e., aggregates which are computed for each row,
>>> based on a window (range) of preceding and succeeding rows.
>>> Each type of aggregate shall be supported on keyed/grouped or
>>> non-keyed/grouped data streams for streaming tables as well as batch tables.
>>> 
>>> We are looking forward to your feedback.
>>> 
>>> Timo
>>> 
> 
> 
> -- 
> Freundliche Grüße / Kind Regards
> 
> Timo Walther
> 
> Follow me: @twalthr
> https://www.linkedin.com/in/twalthr

Reply via email to