Hi Liam, I took a quick look. On the output side, it looks like you’re adding the new count to the prior outbound count. Should that instead just set the outbound value to the new count? Maybe I misunderstood the situation.
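To make the question concrete, here is a minimal Java sketch of the two update rules. The class, method, and key names are hypothetical and not taken from the gist; it models a single window with plain maps, not the actual windowed aggregation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: names are illustrative and not taken from the gist.
class OutboundCountSketch {

    // Replays `events` events for one window and returns the final outbound value.
    // addToPrior = true  : outbound := outbound + count  (what the code appears to do)
    // addToPrior = false : outbound := count             (the proposed change)
    static long replay(int events, boolean addToPrior) {
        String window = "w1"; // placeholder window key, assumed for illustration
        Map<String, Long> inbound = new HashMap<>();
        Map<String, Long> outbound = new HashMap<>();
        for (int i = 0; i < events; i++) {
            // Each event bumps the running count for the window by one.
            long count = inbound.merge(window, 1L, Long::sum);
            if (addToPrior) {
                // Adds the running count on top of what was already emitted.
                outbound.merge(window, count, Long::sum);
            } else {
                // Overwrites the emitted value with the latest running count.
                outbound.put(window, count);
            }
        }
        return outbound.getOrDefault(window, 0L);
    }

    public static void main(String[] args) {
        System.out.println("add to prior: " + replay(2, true));  // prints "add to prior: 3"
        System.out.println("set to count: " + replay(2, false)); // prints "set to count: 2"
    }
}
```

If the count is already a running total, adding it to the prior outbound value counts earlier events again on every update, which would make the outbound figure exceed the inbound one.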
What I mean is, suppose you get two events for the same window:

Inbound map := 0+1 = 1
Count = 1
Outbound map := 0+1 = 1
(Proposed outbound := 1)

Then,

Inbound map := 1+1 = 2
Count = 2
Outbound map := 1+2 = 3
(Proposed outbound := 2)

Does that make sense?
-John

On Sun, Apr 19, 2020, at 03:08, Liam Clarke wrote:
> Hello all,
>
> I have been running this code against production data, and I'm emitting
> counts/sums for a sentinel record id to stdout so I can observe the
> behaviour:
>
> https://gist.github.com/LiamClarkeNZ/b101ce6a42a2e5e1efddfe3a98c5805f
>
> When this code is run, the window duration is 2 minutes, grace period is 20
> seconds, and retention time is 20 minutes.
>
> I am endeavouring to use event time as the timestamp basis for this process:
> https://gist.github.com/LiamClarkeNZ/8265cec02e21f5969e0fedb8281a2180
>
> So, my sentinel debugging output shows a surprising behaviour in that the
> outbound counts for the key always sum higher than the inbound count. For
> example:
>
> Sample: 2020-04-19T07:31:37.492Z
>
> Inbound
> {
>   2020-04-19T03:00:00Z=4563,
>   2020-04-19T04:00:00Z=5629,
>   2020-04-19T05:00:00Z=8489,
>   2020-04-19T06:00:00Z=13599
> }
>
> Outbound
> {
>   2020-04-19T03:00:00Z=4717,
>   2020-04-19T04:00:00Z=5890,
>   2020-04-19T05:00:00Z=8826,
>   2020-04-19T06:00:00Z=13951
> }
>
> This makes me suspect that either I'm not using the window I thought I was
> (e.g., I'm somehow using a sliding window instead of a tumbling window) or
> that I have made a rookie error somewhere in my aggregations, or I've just
> misunderstood something about this. Does it matter that the window size in
> the persistent window store doesn't match the windowing time + grace time
> in the windowing clause?
>
> Any pointers gratefully welcome.
>
> Kind regards,
>
> Liam Clarke-Hutchinson
>