My simplified scenario is that I want to count how many A and B events I am
receiving each minute. I also need to support late data.

Let's assume I am receiving 2 A and 2 B events in the same window
(00:00:00-00:00:001)
So the expected result is two A events , and two B events

Then I receive 2 late B events for the same (00:00:00-00:00:001) window

In the final pane I expect two A events and four B events.
But actually I am getting some accumulated results from previous panes

Below I outlined the actual and expected result.

/**
 *
 * Panes created by job
 * ./00:00-00:01-pane-0-on_time-first
 KV{A, 2}
 KV{B, 2}
 ./00:00-00:01-pane-1-late
 KV{A, 2}
 KV{B, 2}
 KV{B, 3}
 ./00:00-00:01-pane-2-late
 KV{A, 2}
 KV{B, 2}
 KV{B, 3}
 KV{B, 4}
 *
 *
 */

/**
 *
 * The expected result
 * ./00:00-00:01-pane-0-on_time-first
 KV{A, 2}
 KV{B, 2}
 ./00:00-00:01-pane-1-late
 KV{A, 2}
 KV{B, 3}
 ./00:00-00:01-pane-2-late
 KV{A, 2}
 KV{B, 4}
 *
 *
 */


How do I need to change the job to get the expected result?


You can found the test code here https://gist.github.com/pbartoszek/
795452a5015e60dcf8c2112ff941250e

I am using Beam 2.0.0 and TestRunner with TestStream

Thanks,
Pawel

Reply via email to