[ https://issues.apache.org/jira/browse/FLINK-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16179836#comment-16179836 ]

Fabian Hueske commented on FLINK-7548:
--------------------------------------

Thanks for your thoughts [~xccui].
I added a few comments to your suggestions / questions.

Thanks, Fabian

bq. Considering that the data type should be preserved, it may bring extra 
logic if we do that internally. To keep the consistency, I wonder if it's 
possible to encapsulate the time into a new Rowtime<T> type. It exposes two 
methods, getTime(): Long for logical level use and getValue(): T for physical 
level use.

In fact, {{Long}} and {{Timestamp}} have the same internal representation, 
namely {{Long}}. The issue is more the type that is exposed to SQL or the Table 
API. We would need a new TimeIndicator type that exposes a timestamp as 
{{Long}}.
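
To illustrate the point about the shared internal representation, here is a
minimal sketch in plain Java. The class {{TimeIndicatorSketch}} and its method
names are hypothetical and not part of Flink; the sketch only assumes that the
internal value is epoch milliseconds stored as a long, and that the exposed
type is a matter of which accessor is used.

```java
import java.sql.Timestamp;

// Hypothetical sketch, not a Flink class: one internal representation
// (epoch milliseconds as a long), two externally visible types.
final class TimeIndicatorSketch {
    private final long epochMillis;

    private TimeIndicatorSketch(long epochMillis) {
        this.epochMillis = epochMillis;
    }

    // Build the indicator from a field declared as Long.
    static TimeIndicatorSketch fromLong(long epochMillis) {
        return new TimeIndicatorSketch(epochMillis);
    }

    // Build the indicator from a field declared as Timestamp.
    static TimeIndicatorSketch fromTimestamp(Timestamp ts) {
        return new TimeIndicatorSketch(ts.getTime());
    }

    // Expose the time attribute as a Long.
    long asLong() {
        return epochMillis;
    }

    // Expose the time attribute as a Timestamp.
    Timestamp asTimestamp() {
        return new Timestamp(epochMillis);
    }
}
```

Both factory methods end up with the same stored long, so the choice between
{{Long}} and {{Timestamp}} only affects what is handed back to the query.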

bq. Besides, I think the watermark generation should not be bound with rowtime 
extraction. Compared with implementing them in a single scan operator (not sure 
if I understood correctly), I prefer to generate watermarks in extra operators. 
That should be more flexible.

Timestamp extraction and watermark generation would not be tied together. 
First, we would compute timestamps (only necessary if we don't use an existing 
field). The next step would generate watermarks. However, both operations would 
happen in the logical scan operator because a single logical operator can be 
translated into multiple DataStream operations.
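
The two steps can be sketched as separate functions in plain Java. This is
only an illustration under stated assumptions: {{ScanSketch}} and its methods
are hypothetical names, not Flink API, and the watermark rule used here
(largest timestamp seen so far minus a fixed delay) is just one possible
strategy picked for the example.

```java
import java.util.List;
import java.util.function.ToLongFunction;

// Hypothetical sketch, not Flink API: two independent steps that one
// logical scan could be translated into.
final class ScanSketch {

    // Step 1: compute a timestamp for every record. This step is skipped
    // when an existing field already carries the timestamp.
    static <T> long[] extractTimestamps(List<T> records, ToLongFunction<T> extractor) {
        long[] timestamps = new long[records.size()];
        for (int i = 0; i < timestamps.length; i++) {
            timestamps[i] = extractor.applyAsLong(records.get(i));
        }
        return timestamps;
    }

    // Step 2: derive a watermark after each record. Assumed strategy:
    // largest timestamp seen so far minus a fixed maximum delay.
    static long[] generateWatermarks(long[] timestamps, long maxDelayMillis) {
        long[] watermarks = new long[timestamps.length];
        long maxSeen = Long.MIN_VALUE;
        for (int i = 0; i < timestamps.length; i++) {
            maxSeen = Math.max(maxSeen, timestamps[i]);
            watermarks[i] = maxSeen - maxDelayMillis;
        }
        return watermarks;
    }
}
```

Because the second step only consumes timestamps, the watermark strategy can
be swapped without touching the extraction step.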

bq. I am thinking of a new record number bounded out-of-order generation 
strategy. Do you think it will be useful in real applications?

How would this strategy work? IMO, built-in strategies should have a concrete 
use case in mind which is common enough to justify a built-in primitive.

bq. I still feel that the machine time is not compatible with the rowtime 
watermark generation. Shall we consider getting rid of it?

Machine time (assuming that you refer to processing time here) does not use 
watermarks. Watermarks are only used for event-time processing. 

> Support watermark generation for TableSource
> --------------------------------------------
>
>                 Key: FLINK-7548
>                 URL: https://issues.apache.org/jira/browse/FLINK-7548
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API & SQL
>            Reporter: Jark Wu
>
> As discussed in FLINK-7446, the TableSource currently only supports defining a 
> rowtime field, but does not support extracting watermarks from that field. 
> We can provide a new interface called {{DefinedWatermark}}, which has two 
> methods {{getRowtimeAttribute}} (can only be an existing field) and 
> {{getWatermarkGenerator}}. The {{DefinedRowtimeAttribute}} will be marked 
> deprecated.
> How to support periodic and punctuated watermarks and support some built-in 
> strategies needs further discussion.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
