[
https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=167827&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-167827
]
ASF GitHub Bot logged work on BEAM-2939:
----------------------------------------
Author: ASF GitHub Bot
Created on: 20/Nov/18 16:43
Start Date: 20/Nov/18 16:43
Worklog Time Spent: 10m
Work Description: lukecwik commented on a change in pull request #6969:
[BEAM-2939] SplittableDoFn Java SDK API Changes
URL: https://github.com/apache/beam/pull/6969#discussion_r234821917
##########
File path:
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
##########
@@ -838,4 +872,40 @@ public final void prepareForProcessing() {}
*/
@Override
public void populateDisplayData(DisplayData.Builder builder) {}
+
+ /**
+ * A parameter that is accessible during {@link StartBundle @StartBundle},
{@link
Review comment:
1. Finalization can only be invoked during the lifetime of a bundle, whereas
DoFn setup/teardown span bundle boundaries.
2. All callbacks will be registered. Note that the implementation just
stores these in memory, and the SDK can only request that the Runner finalize
the bundle, which is a single field set on the proto.
3. I specifically chose a function registration scheme because I didn't
want the user to have access to the DoFn's state, so that the DoFn instance
could be re-used instead of re-executed. Also, a `@FinalizeBundle` method
wouldn't allow the user to optionally register a callback and would force
every bundle to have finalization. Note that this is just something I'm
proposing and could be changed if we feel that:
* A DoFn instance should not be invoked for another bundle until after
bundle finalization succeeds or the finalization is garbage collected.
* Finalization is likely to be either always needed or never needed. The
scenario I'm thinking of is when you poll an external system and there is no
data to process, so there is no need for finalization.
Filled out more of the javadoc.
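
To make the registration scheme in points 2 and 3 concrete, here is a minimal sketch in plain Java. It is not the API from this pull request: `BundleFinalizationSketch`, `InMemoryFinalizer`, `register`, `requiresFinalization`, `finalizeBundle`, and `processBundle` are all hypothetical names chosen for illustration. It assumes callbacks are held in memory for one bundle, that a boolean stands in for the single proto field, and that a bundle which polled no data registers nothing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; none of these names come from the Beam SDK. */
public class BundleFinalizationSketch {

  /** Holds the callbacks registered while one bundle is being processed. */
  static final class InMemoryFinalizer {
    private final List<Runnable> callbacks = new ArrayList<>();

    /** Registration is optional; a bundle that never calls this needs no finalization. */
    void register(Runnable callback) {
      callbacks.add(callback);
    }

    /** Stands in for the single field the SDK would set when asking the runner to finalize. */
    boolean requiresFinalization() {
      return !callbacks.isEmpty();
    }

    /** Runs and clears the registered callbacks; returns how many were run. */
    int finalizeBundle() {
      int ran = 0;
      for (Runnable callback : callbacks) {
        callback.run();
        ran++;
      }
      callbacks.clear();
      return ran;
    }
  }

  /** Simulates one bundle that polled records from an external system. */
  static InMemoryFinalizer processBundle(List<String> polled, List<String> acked) {
    InMemoryFinalizer finalizer = new InMemoryFinalizer();
    if (!polled.isEmpty()) {
      // Capture only the data the callback needs, not the DoFn instance or its state.
      List<String> toAck = new ArrayList<>(polled);
      finalizer.register(() -> acked.addAll(toAck));
    }
    return finalizer;
  }

  public static void main(String[] args) {
    List<String> acked = new ArrayList<>();

    // Nothing was polled, so no callback is registered.
    InMemoryFinalizer empty = processBundle(new ArrayList<>(), acked);
    System.out.println("empty bundle needs finalization: " + empty.requiresFinalization());

    // Records were polled, so a callback is registered and later run.
    InMemoryFinalizer nonEmpty = processBundle(Arrays.asList("a", "b"), acked);
    System.out.println("non-empty bundle needs finalization: " + nonEmpty.requiresFinalization());
    nonEmpty.finalizeBundle();
    System.out.println("acked after finalization: " + acked);
  }
}
```

Because the callback closes over a copy of the polled records rather than the DoFn, the instance is free to process another bundle before finalization runs, which is the trade-off raised in the first bullet above.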
Issue Time Tracking
-------------------
Worklog Id: (was: 167827)
Time Spent: 7h 40m (was: 7.5h)
> Fn API streaming SDF support
> ----------------------------
>
> Key: BEAM-2939
> URL: https://issues.apache.org/jira/browse/BEAM-2939
> Project: Beam
> Issue Type: Improvement
> Components: beam-model
> Reporter: Henning Rohde
> Assignee: Luke Cwik
> Priority: Major
> Labels: portability
> Time Spent: 7h 40m
> Remaining Estimate: 0h
>
> The Fn API should support streaming SDF. Detailed design TBD.
> Once design is ready, expand subtasks similarly to BEAM-2822.