Fei Pan created STORM-2761:
------------------------------
Summary: Is JoinBolt.java's paradigm a new model of stream join?
Key: STORM-2761
URL: https://issues.apache.org/jira/browse/STORM-2761
Project: Apache Storm
Issue Type: Question
Components: storm-client
Reporter: Fei Pan
Priority: Critical
Hi, I am a researcher at the University of Toronto studying acceleration of
stream processing platforms. I have a question about the window-based stream
join model used in JoinBolt.java. From my understanding, when a new tuple
arrives, it is joined with all the tuples in the window of the opposite
stream. In JoinBolt.java, however, not only the new tuple but every tuple in
the local window is joined with the window of the opposite stream. This
produces many duplicate results, since most of the old tuples in the local
window have already been joined. I don't know whether this is a new paradigm
or whether the Storm team misunderstood the stream join model. Can someone
help clarify this?
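To make the difference concrete, here is a minimal Java sketch of the two
models as described above. It does not use Storm's API and is not taken from
JoinBolt.java: Event, joinNewTuple, joinWholeWindow and countResults are
hypothetical names chosen for illustration, and the local window is assumed
never to expire while the opposite window stays fixed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WindowJoinSketch {

    // Hypothetical tuple type for illustration; this is not Storm's Tuple.
    static final class Event {
        final int key;
        final String payload;

        Event(int key, String payload) {
            this.key = key;
            this.payload = payload;
        }
    }

    // Model 1: only the newly arrived tuple probes the opposite window.
    static List<String> joinNewTuple(Event arrived, List<Event> oppositeWindow) {
        List<String> out = new ArrayList<>();
        for (Event other : oppositeWindow) {
            if (arrived.key == other.key) {
                out.add(arrived.payload + "+" + other.payload);
            }
        }
        return out;
    }

    // Model 2: every tuple in the local window is joined with the opposite window.
    static List<String> joinWholeWindow(List<Event> localWindow, List<Event> oppositeWindow) {
        List<String> out = new ArrayList<>();
        for (Event local : localWindow) {
            out.addAll(joinNewTuple(local, oppositeWindow));
        }
        return out;
    }

    // Feeds arrivals one at a time into a local window that never expires
    // (an assumption of this sketch) and returns the total number of results
    // emitted under each model: {newTupleOnly, wholeWindow}.
    static int[] countResults(List<Event> arrivals, List<Event> oppositeWindow) {
        List<Event> localWindow = new ArrayList<>();
        int newTupleOnly = 0;
        int wholeWindow = 0;
        for (Event e : arrivals) {
            localWindow.add(e);
            newTupleOnly += joinNewTuple(e, oppositeWindow).size();
            wholeWindow += joinWholeWindow(localWindow, oppositeWindow).size();
        }
        return new int[] {newTupleOnly, wholeWindow};
    }

    public static void main(String[] args) {
        List<Event> right = Arrays.asList(new Event(1, "r1"));
        List<Event> left = Arrays.asList(
                new Event(1, "l1"), new Event(1, "l2"), new Event(1, "l3"));
        int[] counts = countResults(left, right);
        // prints "newTupleOnly=3 wholeWindow=6"
        System.out.println("newTupleOnly=" + counts[0] + " wholeWindow=" + counts[1]);
    }
}
```

With three matching arrivals on one side and one tuple on the other, the first
model emits 3 results and the second emits 6, because l1+r1 is produced three
times and l2+r1 twice. Whether this matches what JoinBolt.java actually does
depends on how its window is configured, which this sketch does not model.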