We had a team try to use the "slowly updating global window side inputs" pattern (on Dataflow) to refresh some metadata in their pipeline every minute. Surprisingly, they ran into errors saying that the side input PCollection contained more than one element [1], although this only manifested intermittently.
My theory on why this breaks is as follows; can someone check my logic?

GenerateSequence operates on processing time (although this might not actually matter), so if processing of the source is delayed for whatever reason, the source may emit multiple elements at once in a single bundle. For example, if I configure the source to generate an element every 10 seconds and evaluation of the source is delayed by 30 seconds, I'd get a bundle with 3 elements in it (or so it seems).

All elements are then windowed into the global window, so they all end up in the same window. If a bundle with 3 elements enters the AfterProcessingTime.pastFirstElementInPane() state machine, all 3 elements will be emitted in the same pane. That pane then propagates downstream and breaks the singleton view combiner.

Is my thought process here correct? Is the example here just buggy?

[1] "pcollection view being accessed as a singleton despite having more than one input."
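
To make the theory concrete, here is a minimal sketch in plain Python. It is not Beam code and does not use the Beam API. The function names (emit_bundles, fire_panes, as_singleton, run) are hypothetical, and the trigger behaviour it models (one pane per bundle, containing the whole bundle) is exactly the assumption I am asking about, not confirmed runner behaviour.

```python
# Toy model of the theory above. This is NOT Beam code: every name here is
# hypothetical and the trigger/view behaviour is a simplifying assumption.

def emit_bundles(period_s, wakeups_s):
    """Simulate a processing-time source that owes one element per period.

    wakeups_s lists the times (in seconds) at which the runner actually
    evaluates the source. Every element that became due since the previous
    evaluation is emitted together, as a single bundle.
    """
    bundles = []
    next_index = 0
    for now in wakeups_s:
        due = now // period_s  # number of elements due by time `now`
        bundle = list(range(next_index, due))
        if bundle:
            bundles.append(bundle)
        next_index = max(next_index, due)
    return bundles


def fire_panes(bundles):
    """Assumed trigger model: the pane fires once per bundle, after the whole
    bundle has been buffered, and fired panes are then discarded."""
    return [list(bundle) for bundle in bundles]


def as_singleton(pane):
    """Stand-in for the singleton view: reject any pane that does not hold
    exactly one element."""
    if len(pane) != 1:
        raise ValueError(f"singleton view got {len(pane)} elements: {pane}")
    return pane[0]


def run(period_s, wakeups_s):
    """Push every pane through the singleton check and record the outcome."""
    results = []
    for pane in fire_panes(emit_bundles(period_s, wakeups_s)):
        try:
            results.append(("ok", as_singleton(pane)))
        except ValueError:
            results.append(("error", len(pane)))
    return results


if __name__ == "__main__":
    print(run(10, [10, 20, 30]))  # source evaluated on schedule
    print(run(10, [10, 20, 50]))  # third evaluation delayed by 30 seconds
```

Under these assumptions, the on-schedule run yields only single-element panes, while the run with the 30-second delay ends with a 3-element pane that fails the singleton check. That would match the intermittent failures we saw. Whether a real runner actually bundles and fires this way is the part I would like confirmed.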
