I understand Matthias' point. You want to join elements that occur within a
time range of each other.

With tumbling windows, the boundaries are strict: if a pair of elements
arrives such that one falls before a boundary and the other after it, the
two will not join. Hence the sliding windows.
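
To make the boundary effect concrete, here is a minimal Python sketch. It is
not Flink code: the helper names (tumbling_windows, sliding_windows,
window_join) are made up for illustration, timestamps are plain integers, and
windows are half-open intervals [start, start + size). It only mimics the
nested-loop-per-window behavior described in the quoted mail below.

```python
from itertools import product


def tumbling_windows(ts, size):
    """Start of the single tumbling window [start, start + size) containing ts."""
    return [ts - ts % size]


def sliding_windows(ts, size, slide):
    """Starts of all sliding windows [start, start + size) containing ts."""
    last_start = ts - ts % slide
    # Walk backwards in steps of `slide` while the window still covers ts.
    return list(range(last_start, ts - size, -slide))


def window_join(left, right, assign):
    """Hypothetical nested-loop join: pair all left/right elements per window."""
    windows = {}
    for side, elems in (("L", left), ("R", right)):
        for ts in elems:
            for w in assign(ts):
                windows.setdefault(w, {"L": [], "R": []})[side].append(ts)
    results = []
    for w in sorted(windows):
        # One nested loop over both sides of each window.
        results.extend(product(windows[w]["L"], windows[w]["R"]))
    return results


# Elements 9 and 11 straddle the tumbling boundary at 10, so they never meet.
missed = window_join([9], [11], lambda ts: tumbling_windows(ts, 10))

# With size 10 and slide 5 they share the window starting at 5.
joined = window_join([9], [11], lambda ts: sliding_windows(ts, 10, 5))

# Elements 11 and 13 share two windows (starts 5 and 10): a duplicate result.
duplicated = window_join([11], [13], lambda ts: sliding_windows(ts, 10, 5))

print(missed, joined, duplicated)
```

So the sliding windows fix the missed pair, at the price of emitting the same
pair once per shared window.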

What may be a solution here is a "session window" join...

On Tue, Nov 24, 2015 at 10:33 AM, Aljoscha Krettek <aljos...@apache.org>
wrote:

> Hi,
> I’m not sure this is a problem. If a user specifies sliding windows then
> one element can (and will) end up in several windows. If these are joined
> then there will be multiple results. If the user does not want multiple
> windows then tumbling windows should be used.
>
> IMHO, this is quite straightforward. But let’s see what others have to say.
>
> Cheers,
> Aljoscha
> > On 23 Nov 2015, at 20:36, Matthias J. Sax <mj...@apache.org> wrote:
> >
> > Hi,
> >
> > it seems that a join on the data streams with an overlapping sliding
> > window produces duplicates in the output. The default implementation
> > internally just uses two nested loops over both windows to compute the
> > result.
> >
> > How can duplicates be avoided? Is there any way to do so right now? If
> > not, should we add this?
> >
> >
> > -Matthias
> >
>
>
