mxm commented on a change in pull request #10992: [BEAM-2939] Provide user 
facing API for reporting the watermark in SplittableDoFns.
URL: https://github.com/apache/beam/pull/10992#discussion_r392275988
 
 

 ##########
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
 ##########
 @@ -1070,6 +1091,120 @@ public Duration getAllowedTimestampSkew() {
   @Experimental(Kind.SPLITTABLE_DO_FN)
   public @interface NewTracker {}
 
+  /**
+   * Annotation for the method that maps an element and restriction to initial 
watermark estimator
+   * state for a <a href="https://s.apache.org/splittable-do-fn">splittable</a> {@link DoFn}.
+   *
+   * <p>Signature: {@code WatermarkEstimatorStateT 
getInitialWatermarkState(<arguments>);}
+   *
+   * <p>This method must satisfy the following constraints:
+   *
+   * <ul>
+   *   <li>The return type {@code WatermarkEstimatorStateT} defines the 
watermark state type used
+   *       within this splittable DoFn. All other methods that use a {@link
+   *       WatermarkEstimatorState @WatermarkEstimatorState} parameter must 
use the same type that
+   *       is used here. Prefer as narrow a return type as possible (for example, a square
+   *       type over a shape type, since a square is a kind of shape).
+   *   <li>If one of its arguments is tagged with the {@link Element} 
annotation, then it will be
+   *       passed the current element being processed; the argument must be of 
type {@code InputT}.
+   *       Note that automatic conversion of {@link Row}s and {@link FieldAccess} parameters is
+   *       currently unsupported.
+   *   <li>If one of its arguments is tagged with the {@link Restriction} 
annotation, then it will
+   *       be passed the current restriction being processed; the argument 
must be of type {@code
+   *       RestrictionT}.
+   *   <li>If one of its arguments is tagged with the {@link Timestamp} 
annotation, then it will be
+   *       passed the timestamp of the current element being processed; the 
argument must be of type
+   *       {@link Instant}.
+   *   <li>If one of its arguments is a subtype of {@link BoundedWindow}, then 
it will be passed the
+   *       window of the current element. When applied by {@link ParDo} the 
subtype of {@link
+   *       BoundedWindow} must match the type of windows on the input {@link 
PCollection}. If the
+   *       window is not accessed a runner may perform additional 
optimizations.
+   *   <li>If one of its arguments is of type {@link PaneInfo}, then it will 
be passed information
+   *       about the current triggering pane.
+   *   <li>If one of the parameters is of type {@link PipelineOptions}, then 
it will be passed the
+   *       options for the current pipeline.
+   * </ul>
+   */
+  @Documented
+  @Retention(RetentionPolicy.RUNTIME)
+  @Target(ElementType.METHOD)
+  @Experimental(Kind.SPLITTABLE_DO_FN)
+  public @interface GetInitialWatermarkEstimatorState {}
 
 Review comment:
   Note that this terminology deviates from the current `getCheckpointMark()`. 
I like it though.
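
For reference, here is a rough sketch of what a method following the Javadoc above might look like from the user side. It is not Beam code: the three annotations are local stand-ins declared inside the example (not the real `org.apache.beam.sdk.transforms.DoFn` ones), `ReadFn` and `invokeInitialState` are hypothetical names, and the reflective dispatcher is only a toy that covers the `@Element` and `@Restriction` cases from the list, not how the SDK actually invokes the method.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;

public class InitialWatermarkStateSketch {

  // Local stand-ins for the annotations discussed above (assumption: NOT the Beam classes).
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface GetInitialWatermarkEstimatorState {}

  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.PARAMETER)
  @interface Element {}

  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.PARAMETER)
  @interface Restriction {}

  // Hypothetical user class: the watermark state type is Long, and the initial
  // state is taken from the start of the restriction (a [from, to) pair here).
  static class ReadFn {
    @GetInitialWatermarkEstimatorState
    public Long initialState(@Element String element, @Restriction long[] restriction) {
      return restriction[0];
    }
  }

  // Toy dispatcher: finds the annotated method and fills each argument
  // according to its parameter annotation, as the Javadoc list describes.
  static Object invokeInitialState(Object fn, Object element, Object restriction)
      throws ReflectiveOperationException {
    for (Method m : fn.getClass().getDeclaredMethods()) {
      if (!m.isAnnotationPresent(GetInitialWatermarkEstimatorState.class)) {
        continue;
      }
      Parameter[] params = m.getParameters();
      Object[] args = new Object[params.length];
      for (int i = 0; i < params.length; i++) {
        if (params[i].isAnnotationPresent(Element.class)) {
          args[i] = element;
        } else if (params[i].isAnnotationPresent(Restriction.class)) {
          args[i] = restriction;
        } else {
          throw new IllegalArgumentException("Unsupported parameter: " + params[i]);
        }
      }
      m.setAccessible(true);
      return m.invoke(fn, args);
    }
    throw new IllegalArgumentException("No annotated method found");
  }

  public static void main(String[] args) throws Exception {
    Object state = invokeInitialState(new ReadFn(), "file-a", new long[] {100L, 200L});
    System.out.println(state);
  }
}
```

The return type of the annotated method (`Long` in this sketch) plays the role of `WatermarkEstimatorStateT`, which is why the Javadoc asks every other method taking the state to use the same type.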
