Great idea.

Are any of the methods optional or useful on their own? It seems like maybe
not? So then a single annotation to return an object that returns all the
methods might be more clear. Per Boyuan's work - WatermarkEstimatorProvider?

Kenn

On Thu, Feb 27, 2020 at 2:43 PM Luke Cwik <[email protected]> wrote:

> See this doc[1] and blog[2] for some context about SplittableDoFns.
>
> To support watermark reporting within the Java SDK for SplittableDoFns, we
> need a way to have SDF authors to report watermark estimates over the
> element and restriction pair that they are processing.
>
> For UnboundedSources, it was found to be a pain point to ask each SDF
> author to write their own watermark estimation which typically prevented
> re-use. Therefore we would like to have a "library" of watermark estimators
> that help SDF authors perform this estimation similar to how there is a
> "library" of restrictions and restriction trackers that SDF authors can
> use. For SDF authors where the existing library doesn't work, they can add
> additional ones that observe timestamps of elements or choose to directly
> report the watermark through a "ManualWatermarkEstimator" parameter that
> can be supplied to @ProcessElement methods.
>
> The public facing portion of the DoFn changes adds three new annotations
> for new DoFn style methods:
> GetInitialWatermarkEstimatorState: Returns the initial watermark state,
> similar to GetInitialRestriction
> GetWatermarkEstimatorStateCoder: Returns a coder compatible with watermark
> state type, similar to GetRestrictionCoder for restrictions returned by
> GetInitialRestriction.
> NewWatermarkEstimator: Returns a watermark estimator that either the
> framework invokes allowing it to observe the timestamps of output records
> or a manual watermark estimator that can be explicitly invoked to update
> the watermark.
>
> See [3] for an initial PR with the public facing additions to the core
> Java API related to SplittableDoFn.
>
> This mirrors a bunch of work that was done by Boyuan within the Pyhon SDK
> [4, 5] but in the style of new DoFn parameter/method invocation we have in
> the Java SDK.
>
> 1: https://s.apache.org/splittable-do-fn
> 2: https://beam.apache.org/blog/2017/08/16/splittable-do-fn.html
> 3: https://github.com/apache/beam/pull/10992
> 4: https://github.com/apache/beam/pull/9794
> 5: https://github.com/apache/beam/pull/10375
>

Reply via email to