Thanks for the clarification, Ben! That makes a lot of sense to me.

On Thu, Jun 15, 2017 at 2:50 PM, Ben Chambers <[email protected]> wrote:
> Your understanding seems roughly correct. When the watermark is talked
> about as a timestamp or "one dimensional" concept it is because we're
> implicitly talking about the watermark *at the current processing time*. As
> the current processing time moves forward, the value of the watermark
> changes too.
>
> There is also a requirement that the watermark only moves forward.
>
> On Thu, Jun 15, 2017 at 2:39 PM Haibo Chen <[email protected]> wrote:
>
>> Hi all,
>>
>> While I was going over The Beam Model [model evolution]
>> <https://docs.google.com/presentation/d/1SHie3nwe-pqmjGum_QDznPr-B_zXCjJ2VBDGdafZme8/edit#slide=id.g12846a6162_0_5>
>> to learn the basics of the model, I found the explanation of watermark (in
>> slide 27), "No timestamp earlier than the watermark will be seen" and "It
>> declares that no event times earlier than this point are expected to appear
>> in the future", hard to understand.
>>
>> From the graph in the slide, the watermark seems to be a two-dimensional
>> concept, whereas a timestamp (regardless of event or processing time) is
>> one-dimensional. Hence my confusion around the explanation. It seems to me
>> that we can only talk about the watermark in the context of processing time.
>> My own interpretation of the watermark, based on the graph, is:
>>
>> Given a point (e, p) on the watermark curve, at processing time p,
>> the system is confident (since the watermark is just a heuristic) that no
>> events that happened earlier than e are expected to be seen.
>>
>> Is the understanding roughly correct? I plan to read the Dataflow paper
>> to get a more precise understanding, but would also like to hear
>> explanations in less formal terms. Any help is greatly appreciated.
>>
>> Best,
>> Haibo Chen
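
For anyone who finds code easier to follow, the two points in the quoted exchange (the watermark is an event time read off *at the current processing time*, and it only moves forward) can be sketched as a toy model. This is not the Beam API: `WatermarkTracker`, `advance`, `is_late`, and the idea of feeding in a raw heuristic `estimate` are all hypothetical names and assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class WatermarkTracker:
    """Toy model of a watermark; hypothetical, not part of Beam."""

    # Current watermark, expressed as an event time.
    current: float = float("-inf")
    # Recorded (processing_time, watermark) points, i.e. the "curve" in the slide.
    history: list = field(default_factory=list)

    def advance(self, processing_time: float, estimate: float) -> float:
        """Record the watermark observed at the given processing time.

        `estimate` stands in for whatever heuristic the system uses (assumed).
        Taking the max enforces the requirement that the watermark only
        moves forward, even if the raw estimate regresses.
        """
        self.current = max(self.current, estimate)
        self.history.append((processing_time, self.current))
        return self.current

    def is_late(self, event_time: float) -> bool:
        """True if an element's event time is earlier than the current watermark."""
        return event_time < self.current
```

Calling `advance(10, 4)`, then `advance(11, 3)`, then `advance(12, 7)` yields watermarks 4, 4, and 7: the regressed estimate at processing time 11 is held at 4. Each entry in `history` is one (p, e) point on the curve, which is why the watermark looks two-dimensional on the slide but is a single timestamp at any given moment.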
