[
https://issues.apache.org/jira/browse/FLINK-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191368#comment-15191368
]
ASF GitHub Bot commented on FLINK-3109:
---------------------------------------
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/1527#issuecomment-195495184
I just saw that you updated this pull request (actually a few weeks ago
already).
A lot of it looks very good, but some things need a closer look
(such as how triggers actually behave on the two separate windows and how
windows are matched).
Can you give a high-level summary of how this should behave?
In particular, given that you allow custom triggers and window assigners
here, how are windows matched against each other (to determine that their
elements should be joined/co-grouped)?
For tumbling time windows, the behavior is well defined and as discussed
in the JIRA issue, but for generic windows and triggers, how is it defined?
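To make the tumbling-window case concrete, here is a minimal sketch of how two
elements could be matched, assuming windows are aligned to the epoch with no
offset and timestamps are non-negative milliseconds. The class and method names
are hypothetical and are not taken from the pull request.

```java
public class TumblingWindowMatchSketch {

    // Start of the tumbling window that contains the timestamp
    // (assumes epoch-aligned windows and a non-negative timestamp).
    static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - (timestampMs % windowSizeMs);
    }

    // Two elements with the same key are candidates for a join/co-group
    // only if they fall into the same tumbling window.
    static boolean sameWindow(long leftTimestampMs, long rightTimestampMs, long windowSizeMs) {
        return windowStart(leftTimestampMs, windowSizeMs)
                == windowStart(rightTimestampMs, windowSizeMs);
    }

    public static void main(String[] args) {
        long oneMinuteMs = 60_000L;
        System.out.println(sameWindow(61_000L, 119_000L, oneMinuteMs));
        System.out.println(sameWindow(59_000L, 61_000L, oneMinuteMs));
    }
}
```

With generic window assigners and triggers there is no such single window-start
value to compare, which is the open question above.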
> Join two streams with two different buffer time
> -----------------------------------------------
>
> Key: FLINK-3109
> URL: https://issues.apache.org/jira/browse/FLINK-3109
> Project: Flink
> Issue Type: Improvement
> Components: Streaming
> Affects Versions: 0.10.1
> Reporter: Wang Yangjun
> Labels: easyfix, patch
> Fix For: 0.10.2
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> Currently, Flink streaming only supports joining two streams on the same window.
> How can this be solved?
> For example, there are two streams. One is advertisements shown to users,
> whose tuples can be described as (id, shown timestamp). The other
> one is the click stream -- (id, clicked timestamp). We want to get a joined
> stream that includes all advertisements clicked by a user within 20 minutes
> after being shown.
> It is possible that after an advertisement is shown, some user clicks it
> immediately. It is also possible that the "click" message arrives at the
> server earlier than the "show" message because of Internet delay. We assume
> that the maximum delay is one minute.
> So we need to always keep a buffer (20 mins) of the "show" stream
> and another buffer (1 min) of the "click" stream.
> It would be great if there were an API like this:
> showStream.join(clickStream)
>     .where(keySelector)
>     .buffer(Time.of(20, TimeUnit.MINUTES))
>     .equalTo(keySelector)
>     .buffer(Time.of(1, TimeUnit.MINUTES))
>     .apply(JoinFunction)
> http://stackoverflow.com/questions/33849462/how-to-avoid-repeated-tuples-in-flink-slide-window-join/34024149#34024149
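The buffering requirement in the issue description can be read as an interval
condition on the two timestamps. The following is only a sketch of that
reading, not the pull request's implementation: it assumes a click matches a
show with the same id when the click timestamp lies between one minute before
and 20 minutes after the show timestamp. The Event class, the join method, and
the in-memory lists are hypothetical stand-ins for the streams and operator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BufferedJoinSketch {

    /** Hypothetical event type: (id, timestamp in milliseconds). */
    static final class Event {
        final String id;
        final long timestampMs;

        Event(String id, long timestampMs) {
            this.id = id;
            this.timestampMs = timestampMs;
        }
    }

    // Joins shows and clicks with equal ids when the click falls into
    // [show - clickBufferMs, show + showBufferMs]. A real streaming operator
    // would evict buffered elements over time; this sketch scans finite lists.
    static List<String> join(List<Event> shows, List<Event> clicks,
                             long showBufferMs, long clickBufferMs) {
        List<String> joined = new ArrayList<>();
        for (Event show : shows) {
            for (Event click : clicks) {
                long delta = click.timestampMs - show.timestampMs;
                if (show.id.equals(click.id)
                        && delta >= -clickBufferMs
                        && delta <= showBufferMs) {
                    joined.add(show.id + ":" + show.timestampMs + ":" + click.timestampMs);
                }
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        long minuteMs = 60_000L;
        List<Event> shows = Arrays.asList(new Event("ad1", 10 * minuteMs));
        List<Event> clicks = Arrays.asList(
                new Event("ad1", 10 * minuteMs - 30_000L), // 30 s before the show
                new Event("ad1", 25 * minuteMs),           // 15 min after the show
                new Event("ad1", 31 * minuteMs),           // 21 min after the show
                new Event("ad2", 12 * minuteMs));          // different id
        // Only the first two clicks satisfy the interval condition.
        System.out.println(join(shows, clicks, 20 * minuteMs, 1 * minuteMs));
    }
}
```

Under this reading, the two buffer sizes bound how long each side must be
retained: shows for 20 minutes, clicks for one minute.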
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)