Hi,
I’m not sure this is a problem. If a user specifies sliding windows, then one
element can (and will) end up in several windows. If these windows are joined,
there will be multiple results. If the user does not want an element to appear
in multiple windows, then tumbling windows should be used.
IMHO, this is quite
Stephan is right. A tumbling window does not help. The last tuple of
window n and the first tuple of window n+1 are "close" to each other and
should be joined, for example.
From a SQL-like point of view this is a very common case expressed as:
SELECT * FROM s1,s2 WHERE s1.key = s2.key AND |s1.ts
Since sessions are built per key, they have groups of keys that are close
enough together in time. They will, however, treat the closeness
transitively...
On Tue, Nov 24, 2015 at 11:33 AM, Matthias J. Sax wrote:
> Stephan is right. A tumbling window does not help. The last
I understand Matthias' point. You want to join elements that occur within a
time range of each other.
With tumbling windows, the boundaries are strict: if a pair of elements
arrives such that one element falls before a boundary and the other after
it, the two will not join. Hence the sliding windows.
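
To make the boundary problem concrete, here is a minimal sketch in plain Python (not the streaming API). The window size of 10, the allowed gap of 2, and the helper names are assumptions chosen only for illustration. It compares a join restricted to tumbling windows with a join on a time-range predicate.

```python
# Illustrative sketch only: window_size and max_gap are assumed values,
# and the function names are hypothetical helpers, not library calls.

def tumbling_window(ts, window_size):
    """Index of the tumbling window that a timestamp falls into."""
    return ts // window_size


def tumbling_join(left, right, window_size):
    """Join (key, ts) pairs that share a key and the same tumbling window."""
    return [
        (a, b)
        for a in left
        for b in right
        if a[0] == b[0]
        and tumbling_window(a[1], window_size) == tumbling_window(b[1], window_size)
    ]


def time_range_join(left, right, max_gap):
    """Join (key, ts) pairs that share a key and lie within max_gap of each other."""
    return [
        (a, b)
        for a in left
        for b in right
        if a[0] == b[0] and abs(a[1] - b[1]) <= max_gap
    ]


s1 = [("a", 9)]   # last element of window 0 (timestamps 0..9)
s2 = [("a", 10)]  # first element of window 1 (timestamps 10..19)

print(tumbling_join(s1, s2, window_size=10))  # [] : the pair straddles the boundary
print(time_range_join(s1, s2, max_gap=2))     # [(('a', 9), ('a', 10))]
```

The two elements are only one time unit apart, yet the tumbling join drops them because they land in different windows.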
Hi,
it seems that a join on the data streams with an overlapping sliding
window produces duplicates in the output. The default implementation
internally just uses two nested loops over both windows to compute the
result.
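
A small sketch of the effect, in plain Python rather than the actual implementation. The window size of 10, the slide of 5, and the helper names are assumptions for illustration. Each window is joined on its own with two nested loops, so a pair that falls into two overlapping windows is emitted twice.

```python
# Illustrative sketch only: size and slide are assumed values, and these
# helpers are hypothetical, not part of any streaming library.

def sliding_windows(ts, size, slide):
    """Start timestamps of all windows [start, start + size) that contain ts."""
    starts = []
    start = ts - (ts % slide)  # latest window start that is <= ts
    while start > ts - size:
        starts.append(start)
        start -= slide
    return starts


def sliding_join(left, right, size, slide):
    """Join (key, ts) pairs window by window with two nested loops."""
    starts = sorted(
        {s for _, ts in left + right for s in sliding_windows(ts, size, slide)}
    )
    results = []
    for start in starts:
        in_left = [e for e in left if start <= e[1] < start + size]
        in_right = [e for e in right if start <= e[1] < start + size]
        for a in in_left:
            for b in in_right:
                if a[0] == b[0]:
                    results.append((a, b))
    return results


s1 = [("a", 12)]
s2 = [("a", 13)]

joined = sliding_join(s1, s2, size=10, slide=5)
print(joined)
print(len(joined))  # 2: the pair is in windows [5, 15) and [10, 20)
```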
How can duplicates be avoided? Is there currently any way to do this? If
not,