When state & timers and the new DoFn were designed, it was an explicit decision not to allow direct observation of the watermark, only to set a timer in event time. Is there a design doc I can read to catch up?
Kenn

On Tue, Apr 9, 2019 at 1:44 PM Lukasz Cwik <lc...@google.com> wrote:

> WatermarkReporterParam is about reporting the watermark. The main usecase
> is for SplittableDoFns to be able to report the data watermark.
>
> The watermark is per input and output of a DoFn. Also each bundle being
> processed has its local watermarks while the runner computes the global
> watermark. The runners watermark could be per key, or key range or global
> across all keys.
>
> There is no runner agnostic way to read the watermark today. Is there a
> usecase you are targeting that would help from having access to the
> watermark (also, which watermark?)?
>
>
> On Tue, Apr 9, 2019 at 1:28 PM Pablo Estrada <pabl...@google.com> wrote:
>
>> I am experimenting with state / timers in Python. As I look at the
>> DoFnProcessParams[1], I see that it's possible for a DoFn to receive
>> several arguments (e.g. Timers, Side Inputs, etc). Also the Watermark via
>> WatermarkReporterParam.
>>
>> I see that this parameter is not handled by runners when filling up the
>> arguments for a DoFn[2][3]. So, as far as I can tell, DoFns are not
>> currently able to get the watermark.
>>
>> Is this a bug, or is it intentional? Perhaps there's another way to find
>> out the watermark for a DoFn?
>>
>> Best
>> -P.
>>
>> [1]
>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/core.py#L381-L390
>>
>> [2]
>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/common.py#L477-L488
>> [3]
>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/common.py#L605-L620
>>
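
For anyone catching up on the thread, here is a rough sketch of the distinction drawn in the reply at the top: user code registers a timer in event time and is called back once the watermark reaches it, but never reads the watermark itself. This is a toy model in plain Python, not the Beam API. `ToyRunner`, `set_event_time_timer` and `advance_watermark` are hypothetical names made up for illustration, and firing when the watermark is greater than or equal to the timer timestamp is a simplifying assumption.

```python
import heapq


class ToyRunner:
    """Hypothetical stand-in for a runner (NOT a Beam class).

    The runner owns the watermark. User code can only register event-time
    timers; it has no method for reading the watermark directly.
    """

    def __init__(self):
        self._watermark = float("-inf")
        self._timers = []  # min-heap of (fire_timestamp, seq, callback)
        self._seq = 0      # tie-breaker so callbacks are never compared

    def set_event_time_timer(self, timestamp, callback):
        """Ask to be called back once the watermark reaches `timestamp`."""
        heapq.heappush(self._timers, (timestamp, self._seq, callback))
        self._seq += 1

    def advance_watermark(self, new_watermark):
        """Runner-side only: move the watermark forward and fire due timers."""
        # Watermarks never move backwards in this toy model.
        self._watermark = max(self._watermark, new_watermark)
        while self._timers and self._timers[0][0] <= self._watermark:
            timestamp, _, callback = heapq.heappop(self._timers)
            callback(timestamp)


# "User code": register two timers and record when each one fires.
fired = []
runner = ToyRunner()
runner.set_event_time_timer(10, fired.append)
runner.set_event_time_timer(20, fired.append)

# "Runner code": the watermark advances as input is processed.
runner.advance_watermark(5)   # no timer is due yet
runner.advance_watermark(15)  # the timer set for 10 fires
```

In the real Python SDK the corresponding pieces would be the timer specs and callbacks in `apache_beam.transforms.userstate`; the sketch above only mirrors the shape of that contract.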