[ 
https://issues.apache.org/jira/browse/STORM-2761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181944#comment-16181944
 ] 

Arun Mahadevan commented on STORM-2761:
---------------------------------------

As I understand it, tuples from both streams are buffered and joined only 
once, when the window triggers.

E.g., with a 1 min window, all tuples that arrived in the last 1 min in 
"stream1" are joined with all the tuples that arrived in the last 1 min in 
"stream2" when the 1 min completes. If it does not work that way, there might 
be a bug.
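
To make the expected semantics concrete, here is a minimal sketch in plain 
Python. It is not the JoinBolt code and does not use any Storm API. It assumes 
a tumbling 1 min window, and the (timestamp, key, value) tuple layout and all 
names in it are made up for illustration. Tuples are only buffered on arrival, 
and each window is joined exactly once, when it completes.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # the 1 min window from the example above


def windowed_join(stream1, stream2, window=WINDOW_SECONDS):
    """Join two streams of (timestamp, key, value) tuples per tumbling window.

    Hypothetical helper: tuples are only buffered as they arrive, and the
    join runs once per window, when that window completes.
    """
    # window index -> (buffer for stream1, buffer for stream2)
    buffers = defaultdict(lambda: ([], []))
    for ts, key, value in stream1:
        buffers[ts // window][0].append((key, value))
    for ts, key, value in stream2:
        buffers[ts // window][1].append((key, value))

    results = []
    for index in sorted(buffers):  # each window "triggers" exactly once
        left, right = buffers[index]
        for lkey, lval in left:
            for rkey, rval in right:
                if lkey == rkey:
                    results.append((index, lkey, lval, rval))
    return results


# Timestamps are seconds; the first two tuples of each stream fall in window 0.
stream1 = [(5, "a", "s1-a"), (30, "b", "s1-b"), (70, "a", "s1-a2")]
stream2 = [(10, "a", "s2-a"), (50, "b", "s2-b"), (65, "b", "s2-b2")]
joined = windowed_join(stream1, stream2)
```

Under these assumptions every matching pair is emitted once, because a tuple 
belongs to exactly one window and each window is evaluated a single time.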

cc [~roshan_naik]

> JoinBolt.java 's paradigm is new model of stream join?
> ------------------------------------------------------
>
>                 Key: STORM-2761
>                 URL: https://issues.apache.org/jira/browse/STORM-2761
>             Project: Apache Storm
>          Issue Type: Question
>          Components: storm-client
>            Reporter: Fei Pan
>            Priority: Critical
>
> Hi, I am a researcher from the University of Toronto studying acceleration 
> of stream processing platforms. I have a question about the window-based 
> stream join model used in JoinBolt.java. From my understanding, when a new 
> tuple arrives, we join this new tuple with all the tuples in the window of 
> the opposite stream. However, in JoinBolt.java, not only the new tuple but 
> all the tuples in the entire local window are joined with the window of the 
> opposite stream. This produces a lot of duplicated results, since most of 
> the old tuples in the local window have already been joined. I don't know 
> whether this is a new paradigm or the Storm team misunderstood the model of 
> stream join. Can someone help me clarify this?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
