Hi Matt, the combination of a tumbling time window and a count window is one way to define a sliding window. In your example of a 30 secs tumbling window and a (3,1) count window results in a time sliding window of 90 secs width and 30 secs slide.
You could define a time sliding window of 90 secs width and 1 secs slide on stream A to get a stream C with faster updates. If you still need stream B with the 30 secs tumbling window, you can have both windows defined on stream A. Hope this helps, Fabian 2016-12-16 12:58 GMT+01:00 Matt <dromitl...@gmail.com>: > I have reduced the problem to a simple image [1]. > > Those shown on the image are the streams I have, and the problem now is > how to create a custom window assigner such that objects in B that *don't > share* elements in A, are put together in the same window. > > Why? Because in order to create elements in C (triangles), I have to > process n *independent* elements of B (n=2 in the example). > > Maybe there's a better or simpler way to do this. Any idea is appreciated! > > Regards, > Matt > > [1] http://i.imgur.com/dG5AkJy.png > > On Thu, Dec 15, 2016 at 3:22 AM, Matt <dromitl...@gmail.com> wrote: > >> Hello, >> >> I have a rather simple problem with a difficult explanation... >> >> I have 3 streams, one of objects of class A (stream A), one of class B >> (stream B) and one of class C (stream C). The elements of A are generated >> at a rate of about 3 times every second. Elements of type B encapsulates >> some key features of the stream A (like the number of elements of A in the >> window) during the last 30 seconds (tumbling window 30s). Finally, the >> elements of type C contains statistics (for simplicity let's say the >> average of elements processed by each element in B) of the last 3 elements >> in B and are produced on every new element of B (count window 3, 1). >> >> Illustrative example, () and [] denotes windows: >> >> ... [a10 a9 a8] [a7 a6] [a5 a4] [a3 a2 a1] >> ... (b4 [b3 b2) b1] >> ... [c2] [c1] >> >> This works fine, except for a dashboard that depends on the elements of C >> to be updated, and 30s is way too big of a delay. I thought I could change >> the tumbling window for a sliding window of size 30s and a slide of 1s, but >> this doesn't work. >> >> If I use a sliding window to create elements of B as mentioned, each >> count window would contain 3 elements of B, and I would get one element of >> C every second as intended, but those elements in B encapsulates almost the >> same elements of A. This results in stats that are wrong. >> >> For instance, c1 may have the average of b1, b2 and b3. But b1, b2 and b3 >> share most of the elements from stream A. >> >> Question: is there any way to create a count window with the last 3 >> elements of B that would have gone into the same tumbling window, not with >> the last 3 consecutive elements? >> >> I hope the problem is clear, don't hesitate to ask for further >> clarification! >> >> Regards, >> Matt >> >> >