Thanks for the clarification, Ben! That makes a lot of sense to me.

On Thu, Jun 15, 2017 at 2:50 PM, Ben Chambers <[email protected]> wrote:

> Your understanding seems roughly correct. When the watermark is talked
> about as a timestamp or "one dimensional" concept it is because we're
> implicitly talking about the watermark *at the current processing time*. As
> the current processing time moves forward, the value of the watermark
> changes too.
>
> There is also a requirement that the watermark only moves forward.
>
> On Thu, Jun 15, 2017 at 2:39 PM Haibo Chen <[email protected]> wrote:
>
>> Hi all,
>>
>> While I was going over The Beam Model [model evolution]
>> <https://docs.google.com/presentation/d/1SHie3nwe-pqmjGum_QDznPr-B_zXCjJ2VBDGdafZme8/edit#slide=id.g12846a6162_0_5>
>> to learn the basics of the model, I found the explanation of the watermark
>> (in slide 27), "No timestamp earlier than the watermark will be seen" and
>> "It declares that no event times earlier than this point are expected to
>> appear in the future", hard to understand.
>>
>> From the graph in the slide, the watermark seems to be a two-dimensional
>> concept, whereas a timestamp (whether event time or processing time) is
>> one-dimensional. Hence my confusion about the explanation. It seems to me
>> that we can only talk about the watermark in the context of processing time.
>> My own interpretation of the watermark, based on the graph, is:
>>
>> Given a point (e, p) on the watermark curve, at processing time p,
>> the system is confident (since the watermark is just a heuristic) that no
>> events that happened earlier than e are expected to be seen.
>>
>> Is this understanding roughly correct? I plan to read the Dataflow paper
>> to get a more precise understanding, but would also like to hear
>> explanations in less formal terms. Any help is greatly appreciated.
>>
>> Best,
>> Haibo Chen
>>
>
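
For anyone reading along, the two points above (the watermark is a value
observed *at a given processing time*, and it only moves forward) can be
sketched in a few lines of Python. This is only a toy model under stated
assumptions, not how Beam or any runner implements watermarks: the names
WatermarkTracker, advance, and is_late are hypothetical, and times are plain
numbers rather than real timestamps.

```python
from dataclasses import dataclass, field


@dataclass
class WatermarkTracker:
    """Toy model (hypothetical, not Beam's API): watermark as a function of processing time."""

    # Recorded (processing_time, watermark) pairs, i.e. points on the watermark curve.
    history: list = field(default_factory=list)

    def current(self):
        # Watermark at the latest processing time seen; -inf before any update.
        return self.history[-1][1] if self.history else float("-inf")

    def advance(self, processing_time, estimate):
        # The watermark only moves forward: a lower estimate is clamped
        # to the previous value instead of moving the watermark back.
        watermark = max(self.current(), estimate)
        self.history.append((processing_time, watermark))
        return watermark

    def is_late(self, event_time):
        # An element is "late" if its event time is earlier than the
        # watermark at the current processing time.
        return event_time < self.current()


tracker = WatermarkTracker()
tracker.advance(processing_time=10, estimate=5)
tracker.advance(processing_time=11, estimate=4)  # estimate regressed; watermark stays at 5
tracker.advance(processing_time=12, estimate=8)
```

Each entry in history is one point on the curve from the slide; asking "what
is the watermark?" without naming a processing time implicitly means
tracker.current().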
