See this doc [1] and blog post [2] for some context about SplittableDoFns. To support watermark reporting within the Java SDK for SplittableDoFns, we need a way for SDF authors to report watermark estimates over the element and restriction pair that they are processing.
For UnboundedSources, asking each author to write their own watermark estimation was found to be a pain point, and it typically prevented re-use. Therefore we would like to have a "library" of watermark estimators that help SDF authors perform this estimation, similar to the existing "library" of restrictions and restriction trackers that SDF authors can use. SDF authors for whom the existing library doesn't work can add new estimators that observe the timestamps of elements, or can report the watermark directly through a "ManualWatermarkEstimator" parameter that can be supplied to @ProcessElement methods.

The public facing portion of the DoFn changes adds three new annotations for new DoFn style methods:

- GetInitialWatermarkEstimatorState: Returns the initial watermark state, similar to GetInitialRestriction.
- GetWatermarkEstimatorStateCoder: Returns a coder compatible with the watermark state type, similar to GetRestrictionCoder for restrictions returned by GetInitialRestriction.
- NewWatermarkEstimator: Returns a watermark estimator. This is either an estimator that the framework invokes so that it can observe the timestamps of output records, or a manual watermark estimator that can be explicitly invoked to update the watermark.

See [3] for an initial PR with the public facing additions to the core Java API related to SplittableDoFn.

This mirrors a bunch of work that was done by Boyuan within the Python SDK [4, 5], but in the style of the new DoFn parameter/method invocation we have in the Java SDK.

1: https://s.apache.org/splittable-do-fn
2: https://beam.apache.org/blog/2017/08/16/splittable-do-fn.html
3: https://github.com/apache/beam/pull/10992
4: https://github.com/apache/beam/pull/9794
5: https://github.com/apache/beam/pull/10375
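To make the two estimator flavors described above a bit more concrete, here is a minimal, self-contained sketch in plain Java. It does not use the Beam API: the names Estimator, TimestampObserving, Manual, observeTimestamp and setWatermark are hypothetical stand-ins chosen for illustration, timestamps are plain epoch milliseconds, and taking the maximum observed timestamp as the watermark is an assumed heuristic, not necessarily what the PR in [3] does.

```java
public class WatermarkEstimatorSketch {

    /** Hypothetical contract: report a watermark plus a state that can be checkpointed. */
    public interface Estimator {
        long currentWatermark(); // epoch millis
        long getState();         // what an "initial watermark state" would be restored from
    }

    /** Flavor 1: the framework calls observeTimestamp for every output record. */
    public static final class TimestampObserving implements Estimator {
        private long watermark;

        public TimestampObserving(long initialState) {
            this.watermark = initialState;
        }

        public void observeTimestamp(long timestamp) {
            // Assumed heuristic: track the largest timestamp seen, never moving backwards.
            watermark = Math.max(watermark, timestamp);
        }

        @Override
        public long currentWatermark() {
            return watermark;
        }

        @Override
        public long getState() {
            return watermark;
        }
    }

    /** Flavor 2: the SDF author updates the watermark explicitly. */
    public static final class Manual implements Estimator {
        private long watermark;

        public Manual(long initialState) {
            this.watermark = initialState;
        }

        public void setWatermark(long newWatermark) {
            if (newWatermark < watermark) {
                throw new IllegalArgumentException("Watermark must not move backwards");
            }
            watermark = newWatermark;
        }

        @Override
        public long currentWatermark() {
            return watermark;
        }

        @Override
        public long getState() {
            return watermark;
        }
    }

    public static void main(String[] args) {
        TimestampObserving observing = new TimestampObserving(0L);
        for (long ts : new long[] {10L, 50L, 30L}) {
            observing.observeTimestamp(ts); // a framework would do this per output
        }
        System.out.println(observing.currentWatermark()); // prints 50

        // Resume from the saved state and report the watermark by hand.
        Manual manual = new Manual(observing.getState());
        manual.setWatermark(75L);
        System.out.println(manual.currentWatermark()); // prints 75
    }
}
```

In this sketch the state is just the watermark itself, which is why a single long is enough. An estimator with richer state would need its own state type and a matching coder, which is the role GetWatermarkEstimatorStateCoder plays in the proposal.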
