I have been preparing a PR to add the timeout option. I have a dumb question: it seems to me that the timeout should be set in processing time, while the existing timer that fires at window expiration is in event time. Is there a way to have timers in different time domains?

--
Best regards,
Siyuan

On Wed, Aug 26, 2020 at 9:15 AM Reuven Lax <[email protected]> wrote:

> Seems reasonable to add an optional timeout to GroupIntoBatches to flush
> records.
>
> On Wed, Aug 26, 2020 at 9:04 AM Robert Bradshaw <[email protected]>
> wrote:
>
>> GroupIntoBatches sets a timer to flush the batches at the end of the
>> window [1] no matter how many elements there are. This could cause a
>> problem for the GlobalWindow if no more data ever comes in.
>>
>> [1]
>> https://github.com/apache/beam/blob/release-2.23.0/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/GroupIntoBatches.java#L116
>>
>> On Wed, Aug 26, 2020 at 8:55 AM Alex Amato <[email protected]> wrote:
>> >
>> > How does GroupIntoBatches behave when there are too few elements for a
>> > key (fewer than the provided batch size)?
>> >
>> > Based on how it's described, it's not clear to me that the elements will
>> > ever be emitted. Can this cause stuckness in this case?
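
For illustration, the behavior discussed in this thread can be sketched as a toy model in plain Java. This is not Beam code and not the GroupIntoBatches implementation: the class `BatchBufferSketch`, its methods, and the `maxBufferingMillis` parameter are all hypothetical names made up for this sketch. It models one key's buffer with two independent deadlines, an event-time one (end of window) and a processing-time one (the proposed timeout). In the Beam Java SDK itself, a timer's domain is chosen when it is declared, e.g. `TimerSpecs.timer(TimeDomain.EVENT_TIME)` or `TimerSpecs.timer(TimeDomain.PROCESSING_TIME)`, and a DoFn may declare more than one timer.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: hypothetical names, not Beam's API or implementation. */
class BatchBufferSketch {
    private final int batchSize;
    private final long windowEndEventTime;   // event-time deadline (end of window)
    private final long maxBufferingMillis;   // hypothetical processing-time timeout
    private final List<String> buffer = new ArrayList<>();
    private final List<List<String>> emitted = new ArrayList<>();
    private long processingDeadline = Long.MAX_VALUE; // unset while the buffer is empty

    BatchBufferSketch(int batchSize, long windowEndEventTime, long maxBufferingMillis) {
        this.batchSize = batchSize;
        this.windowEndEventTime = windowEndEventTime;
        this.maxBufferingMillis = maxBufferingMillis;
    }

    /** Buffers an element; the first element of a batch arms the processing-time deadline. */
    void add(String element, long processingTimeNow) {
        if (buffer.isEmpty()) {
            processingDeadline = processingTimeNow + maxBufferingMillis;
        }
        buffer.add(element);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Event-time "timer": fires once the watermark reaches the end of the window. */
    void advanceWatermark(long watermark) {
        if (watermark >= windowEndEventTime) {
            flush();
        }
    }

    /** Processing-time "timer": fires once the wall clock reaches the deadline. */
    void advanceProcessingTime(long now) {
        if (now >= processingDeadline) {
            flush();
        }
    }

    private void flush() {
        if (!buffer.isEmpty()) {
            emitted.add(new ArrayList<>(buffer));
            buffer.clear();
        }
        processingDeadline = Long.MAX_VALUE;
    }

    List<List<String>> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        // Long.MAX_VALUE stands in for a global window whose end is never reached.
        BatchBufferSketch sketch = new BatchBufferSketch(3, Long.MAX_VALUE, 1000);
        sketch.add("a", 0);
        sketch.add("b", 10);
        sketch.advanceWatermark(5_000_000L);     // event time alone never flushes here
        System.out.println(sketch.emitted());
        sketch.advanceProcessingTime(1000);      // the timeout flushes the partial batch
        System.out.println(sketch.emitted());
    }
}
```

In this model the two deadlines are checked independently, which is why a partial batch in a never-ending window is only released by the processing-time path.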
