[
https://issues.apache.org/jira/browse/FLINK-8482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Florian Schmidt updated FLINK-8482:
-----------------------------------
Fix Version/s: 1.7.0
> Implement and expose option to use min / max / left / right timestamp for
> joined streamrecords
> ----------------------------------------------------------------------------------------------
>
> Key: FLINK-8482
> URL: https://issues.apache.org/jira/browse/FLINK-8482
> Project: Flink
> Issue Type: Sub-task
> Reporter: Florian Schmidt
> Assignee: Florian Schmidt
> Priority: Major
> Fix For: 1.7.0
>
>
> The idea: Expose the option of which timestamp to use for the result of a
> join. The idea that is currently the floating around includes the options
> * _left_: Use timestamp of the element in a join that came from the left
> stream
> * _right_: Use timestamp of the element in a join that came from the right
> stream
> * _max_: Use the max timestamp of both elements in a join
> * _min_: Use the max timestamp of both elements in a join
> All options but _max_ require to introduce delaying watermarks in the
> operator, which is something that we were hesitant to do until now. This
> should probably under go discussion once more in order to see if / how we
> want to add this now. We could even think of exposing this in a more general
> way by adding a base operator that allows delayed watermarks.
> This will also be groundwork for supporting outer joins (FLINK-8483) for
> which in any case we watermark delays to provide correctness.
> Also the API for this needs some feedback in order to expose this in a
> powerful, yet clear way. In my PoC at [1] I used the naming convention left /
> right to refer to specific streams with currently is not something the api
> exposes to the user, we should probably use something more clever here.
> Example
> {code:java}
> keyedStreamOne.
> .intervalJoin(keyedStreamTwo)
> .between(Time.milliseconds(0), Time.milliseconds(2))
> .assignMinTimestamp() // alternative .assignMaxTimestamp()
> .assignLeftTimestamp() .assignRightTimestamp()
> .process(new ProcessJoinFunction() { /* impl */ })
> {code}
>
> Any feedback is highly appreciated!
> [1]
> https://github.com/florianschmidt1994/flink/tree/flink-8482-add-option-for-different-timestamp-strategies-to-interval-join-operator
> cc [~StephanEwen] [~kkl0u]
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)