Though lateness isn't tied to a window. You could be in the global window,
where the watermark never advances past the end of the window, yet still get
late data.
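
A toy sketch of the difference, in plain Python rather than the Beam API. The
function names and numbers are made up for illustration, and it assumes that
"late" means the element's timestamp is behind the current watermark:

```python
import math

# Toy model, not Beam code: timestamps and watermarks are plain numbers.
GLOBAL_WINDOW_END = math.inf  # assumption: the global window never ends


def watermark_passed_window(watermark, window_end):
    """Window-based check: has the watermark moved past the window's end?"""
    return watermark >= window_end


def behind_watermark(event_ts, watermark):
    """Watermark-based check: is the element's timestamp behind the watermark?"""
    return event_ts < watermark


watermark = 75
event_ts = 50

# Fixed window [0, 60): both checks flag the element.
fixed = (watermark_passed_window(watermark, 60),
         behind_watermark(event_ts, watermark))

# Global window: the window never closes, but the element is still behind.
global_ = (watermark_passed_window(watermark, GLOBAL_WINDOW_END),
           behind_watermark(event_ts, watermark))

print(fixed)    # (True, True)
print(global_)  # (False, True)
```

In the global-window case the window-based wording would never call the
element late, while the watermark-based check still does.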

On Thu, Jan 17, 2019, 11:14 AM Jeff Klukas <jklu...@mozilla.com> wrote:

> How about: "Once the watermark progresses past the end of a window, any
> further elements that arrive with a timestamp in that window are considered
> late data."
>
> On Thu, Jan 17, 2019 at 1:43 PM Rui Wang <ruw...@google.com> wrote:
>
>> Hi Community,
>>
>> In the Beam programming guide [1], there is a sentence: "Data that arrives
>> with a timestamp after the watermark is considered *late data*."
>>
>> It seems people get confused by it. For example, see the Stack Overflow
>> comment [2]. Basically, it makes people think that an event timestamp that
>> is greater than the watermark is considered late (due to that "after").
>>
>> Although there is an example right after this sentence to explain late
>> data, it seems to me that this sentence is incomplete. A complete version
>> could be: "The watermark consistently advances from -inf to +inf. Data
>> that arrives with a timestamp after the watermark is considered late data."
>>
>> Am I understanding this correctly? Is there a better description of the
>> ordering of late data and the watermark? I would be happy to send a PR to
>> update the Beam documentation.
>>
>> -Rui
>>
>> [1]: https://beam.apache.org/documentation/programming-guide/#windowing
>> [2]:
>> https://stackoverflow.com/questions/54141352/dataflow-to-process-late-and-out-of-order-data-for-batch-and-stream-messages/54188971?noredirect=1#comment95302476_54188971
>>
>>
>>
