Does "Repeatedly.forever(AfterPane.elementCountAtLeast(1)" solve this? At least in my tests it seems like this correctly only emits a single element per pane, but I'm not sure how much of a guarantee there actually is that there will never be more than N elements in a pane when elementCountAtLeast(N) is set.
On Tue, Feb 22, 2022 at 2:06 PM Luke Cwik <[email protected]> wrote: > I'm not certain that Latest would work since the processing time trigger > would still cause multiple firings to occur each producing the "latest" at > that point in time. All these firings would effectively be output to the > PCollection that the view is over. The PCollection would effectively be a > concatenation of all these firings. > > > > On Tue, Feb 22, 2022 at 10:57 AM Pavel Solomin <[email protected]> > wrote: > >> I also did not succeed in making this pattern work some time ago. In the >> link below there's my mail thread with code example - do you have a similar >> use-case? >> >> https://lists.apache.org/thread/9l74o4vqbtfgc5vkj9qq0xofffmtxswc >> >> Will keep watching this thread for insights. >> >> Best Regards, >> Pavel Solomin >> >> Tel: +351 962 950 692 <+351%20962%20950%20692> | Skype: pavel_solomin | >> Linkedin <https://www.linkedin.com/in/pavelsolomin> >> >> >> >> >> >> On Tue, 22 Feb 2022 at 18:46, Steve Niemitz <[email protected]> wrote: >> >>> We had a team try to use the "slowly updating global window side inputs" >>> pattern (on dataflow) to update some metadata in their pipeline every >>> minute, but surprisingly ran into errors that the side input PCollection >>> contained more than one element, [1] although this only manifested >>> intermittently. >>> >>> My theory on why this breaks is as follows, can someone check my logic? >>> >>> Given that GenerateSequence operates on processing time, (although this >>> might not actually matter) it's possible that if processing the source is >>> delayed for whatever reason, the source may emit multiple elements at once >>> in a single bundle. For example, if I configure the source to generate an >>> element every 10 seconds, and the evaluation of the source is delayed for >>> 30 seconds, I'd get a bundle with 3 elements in it. (or so it seems) All >>> elements are then windowed into the global window, so they all end up in >>> the same window. >>> >>> If a bundle with 3 elements enters >>> the AfterProcessingTime.pastFirstElementInPane() state machine, all 3 >>> elements will be emitted in that pane. This will then propagate down and >>> break on the singleton view combiner. >>> >>> Is my thought process here correct? Is the example here just buggy? >>> >>> [1] "pcollection view being accessed as a singleton despite having more >>> than one input." >>> >>
