Window assignment happens at the point in the pipeline where the WindowInto
transform is applied, so in this case the windows are assigned using the
original timestamps. Changing the timestamp afterwards does not reassign
the window.

Grouping is by key and window.
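
To make that concrete, here is a rough plain-Python sketch of those
semantics. It is not Beam code and does not reflect how any runner
implements this; the helpers fixed_window and simulate, the sample
elements, and the replacement timestamp 1000 are all made up for
illustration.

```python
from collections import defaultdict

WINDOW_SIZE = 5  # seconds, mirroring FixedWindows(5) in the quoted pipeline


def fixed_window(timestamp, size=WINDOW_SIZE):
    """Return the [start, end) fixed window that contains the timestamp."""
    start = timestamp - (timestamp % size)
    return (start, start + size)


def simulate(elements, new_timestamp):
    """Window, then re-timestamp, then group (key, value, timestamp) tuples."""
    # Step 1: window assignment, based on the timestamp each element has now.
    windowed = [(k, v, ts, fixed_window(ts)) for k, v, ts in elements]
    # Step 2: timestamp change; the window assigned in step 1 is kept as-is.
    retimed = [(k, v, new_timestamp, w) for k, v, _, w in windowed]
    # Step 3: grouping by (key, window), not by the current timestamp.
    groups = defaultdict(list)
    for k, v, _, w in retimed:
        groups[(k, w)].append(v)
    return dict(groups)


# Hypothetical input: (key, value, original timestamp in seconds).
elements = [("a", 1, 0), ("a", 2, 3), ("a", 3, 7), ("b", 4, 8)]
result = simulate(elements, new_timestamp=1000)
for (key, win), values in sorted(result.items()):
    print(key, win, values)
```

Even though every element ends up with timestamp 1000, the groups are
still keyed by the windows derived from the original timestamps.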

On Tue, Jan 14, 2020 at 7:30 PM Ankur Goenka <goe...@google.com> wrote:

> Hi,
>
> I am not sure what effect the order of changing element timestamps and
> assigning windows has on a group by key.
> More specifically, what would be the behavior if we apply window -> change
> element timestamp -> Group By Key?
> I think we should always apply the window function after changing the
> timestamp of elements, though this is neither checked nor a recommended
> practice in Beam.
>
> Example pipeline would look like this:
>
>       import time
>       import apache_beam as beam
>       from apache_beam import window
>
>       def applyTimestamp(value):
>           # key is assumed to be defined elsewhere.
>           return window.TimestampedValue((key, value), int(time.time()))
>
>       # Timestamp is changed after windowing and before GBK.
>       p \
>           | 'Create' >> beam.Create(range(0, 10)) \
>           | 'Fixed Window' >> beam.WindowInto(window.FixedWindows(5)) \
>           | 'Apply Timestamp' >> beam.Map(applyTimestamp) \
>           | 'Group By Key' >> beam.GroupByKey() \
>           | 'Print' >> beam.Map(print)
>
> Thanks,
> Ankur
>
