[ 
https://issues.apache.org/jira/browse/FLINK-8482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16571848#comment-16571848
 ] 

Fabian Hueske commented on FLINK-8482:
--------------------------------------

In my opinion this is a necessary feature to be able to define the meaningful 
application semantics for certain use cases.
For example, left or right timestamp are required to specify which sides 
timestamp should be used when defining a a time-based operation such as a 
window on the result of an interval join.

> Implement and expose option to use min / max / left / right timestamp for 
> joined streamrecords
> ----------------------------------------------------------------------------------------------
>
>                 Key: FLINK-8482
>                 URL: https://issues.apache.org/jira/browse/FLINK-8482
>             Project: Flink
>          Issue Type: Sub-task
>            Reporter: Florian Schmidt
>            Assignee: Florian Schmidt
>            Priority: Major
>
> The idea: Expose the option of which timestamp to use for the result of a 
> join. The idea that is currently the floating around includes the options
>  * _left_: Use timestamp of the element in a join that came from the left 
> stream
>  * _right_: Use timestamp of the element in a join that came from the right 
> stream
>  * _max_: Use the max timestamp of both elements in a join
>  * _min_: Use the max timestamp of both elements in a join
> All options but _max_ require to introduce delaying watermarks in the 
> operator, which is something that we were hesitant to do until now. This 
> should probably under go discussion once more in order to see if / how we 
> want to add this now. We could even think of exposing this in a more general 
> way by adding a base operator that allows delayed watermarks.
> This will also be groundwork for supporting outer joins (FLINK-8483) for 
> which in any case we watermark delays to provide correctness. 
> Also the API for this needs some feedback in order to expose this in a 
> powerful, yet clear way. In my PoC at [1] I used the naming convention left / 
> right to refer to specific streams with currently is not something the api 
> exposes to the user, we should probably use something more clever here.
> Example
> {code:java}
> keyedStreamOne.
>    .intervalJoin(keyedStreamTwo)
>    .between(Time.milliseconds(0), Time.milliseconds(2))
>    .assignMinTimestamp() // alternative .assignMaxTimestamp() 
> .assignLeftTimestamp() .assignRightTimestamp()
>    .process(new ProcessJoinFunction() { /* impl */ })
> {code}
>  
> Any feedback is highly appreciated!
> [1] 
> https://github.com/florianschmidt1994/flink/tree/flink-8482-add-option-for-different-timestamp-strategies-to-interval-join-operator
> cc [~StephanEwen] [~kkl0u]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to