Seems reasonable to add an optional timeout to GroupIntoBatches to flush records.
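
To make the idea concrete, here is a rough Python simulation of size-based batching with an optional timeout. This is not Beam code and not how GroupIntoBatches is implemented; `batch_with_timeout`, `batch_size`, and `max_wait` are hypothetical names for illustration only, and time is modelled as plain numbers. It assumes that without a timeout nothing else ever flushes a partial batch, which is roughly the GlobalWindow situation Robert describes below.

```python
from collections import defaultdict


def batch_with_timeout(events, batch_size, max_wait=None):
    """Simulate per-key batching over (time, key, value) events sorted by time.

    A batch is emitted when it reaches batch_size or, if max_wait is set,
    max_wait time units after its first element was buffered.
    Returns (emitted, stuck): emitted is a list of (emit_time, key, batch),
    stuck maps each key to the elements that were never emitted.
    """
    buffers = defaultdict(list)  # key -> values buffered so far
    first_seen = {}              # key -> time the current buffer was started
    emitted = []

    def flush(key, when):
        emitted.append((when, key, buffers.pop(key)))
        first_seen.pop(key)

    for now, key, value in events:
        if max_wait is not None:
            # Fire every timeout that expired before this event arrived.
            expired = [k for k, t in first_seen.items() if t + max_wait <= now]
            for k in expired:
                flush(k, first_seen[k] + max_wait)
        first_seen.setdefault(key, now)
        buffers[key].append(value)
        if len(buffers[key]) >= batch_size:
            flush(key, now)

    if max_wait is not None:
        # No more input: the remaining timeouts still fire eventually.
        for k in list(first_seen):
            flush(k, first_seen[k] + max_wait)

    return emitted, dict(buffers)


events = [(0, "a", 1), (1, "a", 2), (2, "b", 3)]

# Without a timeout, key "b" never reaches the batch size and stays buffered.
print(batch_with_timeout(events, batch_size=2))
# ([(1, 'a', [1, 2])], {'b': [3]})

# With a timeout, the partial batch for "b" is flushed at time 2 + 10 = 12.
print(batch_with_timeout(events, batch_size=2, max_wait=10))
# ([(1, 'a', [1, 2]), (12, 'b', [3])], {})
```

The second call shows what the optional timeout buys: a key with too few elements is emitted as a short batch instead of waiting indefinitely.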

On Wed, Aug 26, 2020 at 9:04 AM Robert Bradshaw <[email protected]> wrote:

> GroupIntoBatches sets a timer to flush the batches at the end of the
> window [1] no matter how many elements there are. This could cause a
> problem for the GlobalWindow if no more data ever comes in.
>
> [1]
> https://github.com/apache/beam/blob/release-2.23.0/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/GroupIntoBatches.java#L116
>
> On Wed, Aug 26, 2020 at 8:55 AM Alex Amato <[email protected]> wrote:
> >
> > How does GroupIntoBatches behave when there are too few elements for a
> > key (fewer than the provided batch size)?
> >
> > Based on how it's described, it's not clear to me that the elements will
> > ever be emitted. Can this cause stuckness in this case?
