[ 
https://issues.apache.org/jira/browse/BEAM-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864666#comment-15864666
 ] 

Thomas Groh commented on BEAM-1461:
-----------------------------------

Sure;

{{prepareForProcessing}} exists purely to support Aggregators - it sets up some 
finalization details within the base DoFn class. However, we plan on removing 
Aggregators, tracked in [BEAM-775]. Once that's done, we should remove the 
parts of the SDK that exist to support Aggregators, which includes 
{{prepareForProcessing}}. We should signal this now, and I believe making the 
method final and deprecated is the most effective way to signal that it's an 
implementation detail of the DoFn internals rather than the user-visible 
processing method.

Additionally, overriding {{prepareForProcessing}} could lead to a lack of 
precondition enforcement within DoFn, so it is not generally safe to override.
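
To make that concrete, here is a minimal sketch of the pattern. It is not the Beam API: {{SketchDoFn}}, {{UpperCaseFn}}, {{run}} and the {{prepared}} flag are hypothetical names invented for illustration, and the real bookkeeping in DoFn is assumed to be more involved. It only shows why a final, deprecated setup method keeps the precondition intact while leaving the start-of-bundle hook to users.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a DoFn-like base class; NOT the Beam API. */
abstract class SketchDoFn {
    // Internal state that the setup method is responsible for establishing.
    private boolean prepared = false;

    /**
     * Internal setup hook. Final so a subclass cannot override it and skip
     * the bookkeeping; deprecated to signal that it is slated for removal.
     */
    @Deprecated
    public final void prepareForProcessing() {
        prepared = true;
    }

    /** User-visible hook, playing the role of a start-of-bundle method. */
    public void startBundle() {}

    /** User-visible per-element processing method. */
    public abstract void processElement(String element);

    /** Hypothetical driver: checks the precondition, then runs user code. */
    public final void run(List<String> bundle) {
        if (!prepared) {
            throw new IllegalStateException("prepareForProcessing was not called");
        }
        startBundle();
        for (String element : bundle) {
            processElement(element);
        }
    }
}

/** Example subclass: it customizes only the user-visible hooks. */
class UpperCaseFn extends SketchDoFn {
    final List<String> out = new ArrayList<>();

    @Override
    public void startBundle() {
        out.clear(); // per-bundle setup belongs here, not in prepareForProcessing
    }

    @Override
    public void processElement(String element) {
        out.add(element.toUpperCase());
    }
}
```

In this sketch a subclass that tried to override {{prepareForProcessing}} would fail to compile, so the {{prepared}} flag is always set by the base class before {{run}} checks it.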

On an additional note, without this Jira we probably would have missed both 
the naming duplication in {{prepareForProcessing}} and the fact that it's 
actually an Aggregator method, so thank you for posting it.

> duplication with StartBundle and prepareForProcessing in DoFn
> -------------------------------------------------------------
>
>                 Key: BEAM-1461
>                 URL: https://issues.apache.org/jira/browse/BEAM-1461
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>            Reporter: Xu Mingmin
>            Assignee: Davor Bonaci
>
> There is one annotation, `StartBundle`, and one public function, 
> `prepareForProcessing`, in DoFn; both are called before 
> `ProcessElement`. It is unclear which one should be implemented in a subclass.
> The call sequence appears to be:
> prepareForProcessing -> StartBundle -> processElement



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
