If an element is emitted with a timestamp, the window assignment is
re-applied at that time. At least that's how it is in Python. You can
emit the full windowed value (accepted without checking...), a
timestamped value (in which case the window will be computed), or a
plain old element (in which case the window and timestamp will be
computed (really, propagated)).

On Wed, Jan 15, 2020 at 3:51 PM Ankur Goenka <goe...@google.com> wrote:
>
> Yup, This might result in unintended behavior as timestamp is changed after 
> the window assignment as elements in windows do not have timestamp in the 
> window time range.
>
> Shall we start validating atleast one window assignment between timestamp 
> assignment and GBK/triggers to avoid unintended behaviors mentioned above?
>
> On Wed, Jan 15, 2020 at 1:24 PM Luke Cwik <lc...@google.com> wrote:
>>
>> Window assignment happens at the point in the pipeline the WindowInto 
>> transform was applied. So in this case the window would have been assigned 
>> using the original timestamp.
>>
>> Grouping is by key and window.
>>
>> On Tue, Jan 14, 2020 at 7:30 PM Ankur Goenka <goe...@google.com> wrote:
>>>
>>> Hi,
>>>
>>> I am not sure about the effect of the order of element timestamp change and 
>>> window association has on a group by key.
>>> More specifically, what would be the behavior if we apply window -> change 
>>> element timestamp -> Group By key.
>>> I think we should always apply window function after changing the timestamp 
>>> of elements. Though this is neither checked nor a recommended practice in 
>>> Beam.
>>>
>>> Example pipeline would look like this:
>>>
>>>       def applyTimestamp(value):
>>>             return window.TimestampedValue((key, value), int(time.time())
>>>
>>>         p \
>>>             | 'Create' >> beam.Create(range(0, 10)) \
>>>             | 'Fixed Window' >> beam.WindowInto(window.FixedWindows(5)) \
>>>             | 'Apply Timestamp' >> beam.Map(applyTimestamp) \ # Timestamp 
>>> is changed after windowing and before GBK
>>>             | 'Group By Key' >> beam.GroupByKey() \
>>>             | 'Print' >> beam.Map(print)
>>>
>>> Thanks,
>>> Ankur

Reply via email to