Does "Repeatedly.forever(AfterPane.elementCountAtLeast(1)" solve this?  At
least in my tests it seems like this correctly only emits a single element
per pane, but I'm not sure how much of a guarantee there actually is that
there will never be more than N elements in a pane when
elementCountAtLeast(N) is set.

On Tue, Feb 22, 2022 at 2:06 PM Luke Cwik <[email protected]> wrote:

> I'm not certain that Latest would work since the processing time trigger
> would still cause multiple firings to occur each producing the "latest" at
> that point in time. All these firings would effectively be output to the
> PCollection that the view is over. The PCollection would effectively be a
> concatenation of all these firings.
>
>
>
> On Tue, Feb 22, 2022 at 10:57 AM Pavel Solomin <[email protected]>
> wrote:
>
>> I also did not succeed in making this pattern work some time ago. In the
>> link below there's my mail thread with code example - do you have a similar
>> use-case?
>>
>> https://lists.apache.org/thread/9l74o4vqbtfgc5vkj9qq0xofffmtxswc
>>
>> Will keep watching this thread for insights.
>>
>> Best Regards,
>> Pavel Solomin
>>
>> Tel: +351 962 950 692 <+351%20962%20950%20692> | Skype: pavel_solomin |
>> Linkedin <https://www.linkedin.com/in/pavelsolomin>
>>
>>
>>
>>
>>
>> On Tue, 22 Feb 2022 at 18:46, Steve Niemitz <[email protected]> wrote:
>>
>>> We had a team try to use the "slowly updating global window side inputs"
>>> pattern (on dataflow) to update some metadata in their pipeline every
>>> minute, but surprisingly ran into errors that the side input PCollection
>>> contained more than one element, [1] although this only manifested
>>> intermittently.
>>>
>>> My theory on why this breaks is as follows, can someone check my logic?
>>>
>>> Given that GenerateSequence operates on processing time, (although this
>>> might not actually matter) it's possible that if processing the source is
>>> delayed for whatever reason, the source may emit multiple elements at once
>>> in a single bundle.  For example, if I configure the source to generate an
>>> element every 10 seconds, and the evaluation of the source is delayed for
>>> 30 seconds, I'd get a bundle with 3 elements in it. (or so it seems)  All
>>> elements are then windowed into the global window, so they all end up in
>>> the same window.
>>>
>>> If a bundle with 3 elements enters
>>> the AfterProcessingTime.pastFirstElementInPane() state machine, all 3
>>> elements will be emitted in that pane.  This will then propagate down and
>>> break on the singleton view combiner.
>>>
>>> Is my thought process here correct?  Is the example here just buggy?
>>>
>>> [1] "pcollection view being accessed as a singleton despite having more
>>> than one input."
>>>
>>

Reply via email to