[ 
https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=167868&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-167868
 ]

ASF GitHub Bot logged work on BEAM-2939:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Nov/18 17:30
            Start Date: 20/Nov/18 17:30
    Worklog Time Spent: 10m 
      Work Description: swegner commented on a change in pull request #6969: 
[BEAM-2939] SplittableDoFn Java SDK API Changes
URL: https://github.com/apache/beam/pull/6969#discussion_r235097184
 
 

 ##########
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
 ##########
 @@ -838,4 +884,50 @@ public final void prepareForProcessing() {}
    */
   @Override
   public void populateDisplayData(DisplayData.Builder builder) {}
+
+  /**
+   * A parameter that is accessible during {@link StartBundle @StartBundle}, 
{@link
+   * ProcessElement @ProcessElement} and {@link FinishBundle @FinishBundle} 
that allows the caller
+   * to register a callback that will be invoked after the bundle has been 
successfully completed
+   * and the runner has commit the output.
+   *
+   * <p>A common usage would be to perform any acknowledgements required by an 
external system such
+   * as acking messages from a message queue since this callback is only 
invoked after the output of
+   * the bundle has been durably persisted by the runner.
+   *
+   * <p>Note that a runner may make the output of the bundle available 
immediately to downstream
+   * consumers without waiting for finalization to succeed. For pipelines that 
are sensitive to
+   * duplicate messages, they must perform output deduplication in the 
pipeline.
+   */
+  // TODO: Add support for a deduplication PTransform.
+  @Experimental(Kind.SPLITTABLE_DO_FN)
+  public interface BundleFinalizer {
+    /**
+     * The provided function will be called after the runner successfully 
commits the output of a
+     * successful bundle. Throwing during finalization represents that bundle 
finalization may have
+     * failed and the runner may choose to attempt finalization again. The 
provided duration
+     * controls how long the finalization is valid for before it is garbage 
collected and no longer
+     * able to be invoked.
+     *
+     * <p>Note that finalization is best effort and it is expected that the 
external system will
+     * self recover state if finalization never happens or consistently fails. 
For example, a queue
+     * based system that requires message acknowledgement would replay 
messages if that
+     * acknowledgement was never received within the provided time bound.
+     *
+     * <p>See <a href="https://s.apache.org/beam-finalizing-bundles";>Apache 
Beam Portability API:
+     * How to Finalize Bundles</a> for further details.
+     */
+    void afterBundleCommit(Duration finalizationDelayLimit, Callback callback);
+
+    /**
+     * An instance of a function that will be invoked after bundle 
finalization.
+     *
+     * <p>Note that this function should maintain all state necessary outside 
of a DoFn's context to
+     * be able to perform bundle finalization and should not rely on mutable 
state stored within a
+     * DoFn instance.
+     */
+    interface Callback {
 
 Review comment:
   Is there sufficient value for defining our own `Callback` interface rather 
than reusing Java's `Callable`? I see some pros/cons:
   
   Pros:
   
   - Can more explicitly specify contract / constraints. In this case, 
documenting that the callback should not rely on external state.
   - More control over method signature. We're not using it now, but we could 
specify a specialized `Exception` subclass.
   
   Cons:
   
   - Harder to reuse existing `Callable` instances
   - Cannot take advantage of ecosystem of existing utility methods / helpers 
for working with `Callable`s
   - More code, more maintenance

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 167868)
    Time Spent: 8h 20m  (was: 8h 10m)

> Fn API streaming SDF support
> ----------------------------
>
>                 Key: BEAM-2939
>                 URL: https://issues.apache.org/jira/browse/BEAM-2939
>             Project: Beam
>          Issue Type: Improvement
>          Components: beam-model
>            Reporter: Henning Rohde
>            Assignee: Luke Cwik
>            Priority: Major
>              Labels: portability
>          Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> The Fn API should support streaming SDF. Detailed design TBD.
> Once design is ready, expand subtasks similarly to BEAM-2822.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to