Hello,

In chapter 4 of the book Streaming Systems by Tyler, Slava and Reuven,
Example 4-6 on page 111 shows custom windowing with
UnalignedFixedWindows:
https://www.oreilly.com/library/view/streaming-systems/9781491983867/ch04.html

Unfortunately, that example is abbreviated and the full source code is not
published in this repo:
https://github.com/takidau/streamingbook

I am joining two Kafka streams and currently window them by fixed time
intervals. However, the elements in stream one ("articles") are published
first, and the assets for those articles are published afterwards in the
"assets" topic. Article event timestamps are therefore slightly earlier
than those of their assets.

When doing a CoGroupByKey, this can lead to a situation where an article is
not processed together with its assets, because the two timestamps fall on
opposite sides of a window boundary. For example:

- the article has a timestamp of 2020-10-02T00:30:29.997Z
- the assets have a timestamp of 2020-10-02T00:30:30.001Z
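
To make the boundary problem concrete, here is a minimal sketch in plain
Java (no Beam dependency) that computes the epoch-aligned fixed window start
for both timestamps. The 30-second window size and the class and method
names are assumptions for illustration only, not taken from the pipeline.

```java
import java.time.Duration;
import java.time.Instant;

public class FixedWindowBoundary {

    // Start of the epoch-aligned fixed window that contains the timestamp.
    static Instant windowStart(Instant timestamp, Duration size) {
        long millis = timestamp.toEpochMilli();
        long start = millis - Math.floorMod(millis, size.toMillis());
        return Instant.ofEpochMilli(start);
    }

    public static void main(String[] args) {
        // Assumed window size; the real pipeline may use a different one.
        Duration size = Duration.ofSeconds(30);

        Instant article = Instant.parse("2020-10-02T00:30:29.997Z");
        Instant asset = Instant.parse("2020-10-02T00:30:30.001Z");

        Instant articleWindow = windowStart(article, size);
        Instant assetWindow = windowStart(asset, size);

        System.out.println("article window starts at " + articleWindow);
        System.out.println("asset window starts at   " + assetWindow);
        System.out.println("same window: " + articleWindow.equals(assetWindow));
    }
}
```

Under that assumption the article lands in the window starting at 00:30:00
and the assets in the one starting at 00:30:30, although the two events are
only 4 ms apart.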

Processing them together is a must in my pipeline. Otherwise I end up
publishing an article without its assets.

My idea was therefore to apply UnalignedFixedWindows to the streams instead
of fixed windows to circumvent this. What I am currently missing is the
mergeWindows() implementation, or the full source code to understand it. At
the moment I am getting a java.lang.IllegalStateException:

TimestampCombiner moved element from 2020-10-02T09:32:36.079Z to earlier
time 2020-10-02T09:32:03.365Z for window
[2020-10-02T09:31:03.366Z..2020-10-02T09:32:03.366Z)

This gives me the impression that I am doing something wrong or have not
fully understood the custom windowing topic.
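
One thing that stands out in the message is that the element timestamp
(09:32:36.079) lies after the end of the window it was assigned to
(09:32:03.366). Below is a minimal sketch, in plain Java without Beam, of
how I understand a per-key shifted window assignment should behave. The
hash-based shift, the key "article-1", and all class and method names are
hypothetical; this is not the book's code and not a Beam WindowFn.

```java
import java.time.Duration;
import java.time.Instant;

public class UnalignedWindowSketch {

    // Hypothetical per-key shift in [0, size), derived from the key's hash.
    static long shiftMillis(String key, Duration size) {
        return Math.floorMod((long) key.hashCode(), size.toMillis());
    }

    // Start of the shifted window for this key, chosen so that
    // start <= timestamp < start + size always holds.
    static Instant windowStart(String key, Instant timestamp, Duration size) {
        long t = timestamp.toEpochMilli();
        long shift = shiftMillis(key, size);
        long start = t - Math.floorMod(t - shift, size.toMillis());
        return Instant.ofEpochMilli(start);
    }

    // True if the timestamp lies in the half-open window [start, start + size).
    static boolean contains(Instant start, Duration size, Instant timestamp) {
        return !timestamp.isBefore(start) && timestamp.isBefore(start.plus(size));
    }

    public static void main(String[] args) {
        Duration size = Duration.ofMinutes(1);
        Instant element = Instant.parse("2020-10-02T09:32:36.079Z");

        Instant start = windowStart("article-1", element, size);
        System.out.println("window: [" + start + ".." + start.plus(size) + ")");
        System.out.println("contains element: " + contains(start, size, element));

        // The window from the exception does not contain the element.
        Instant reported = Instant.parse("2020-10-02T09:31:03.366Z");
        System.out.println("reported window contains element: "
                + contains(reported, size, element));
    }
}
```

The sketch only checks the invariant that an element falls inside its
assigned window, which seems to be what my current implementation violates.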

Am I on the wrong track here?
