[
https://issues.apache.org/jira/browse/BEAM-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17197948#comment-17197948
]
Tim Clemons commented on BEAM-10481:
------------------------------------
The workaround appears to be removing the {{beam-checkpoint}} sub-directory
before each launch of the application. In the meantime, I've also submitted a
pull request that should ensure registration of the accumulators occur.
> MetricsAccumulator is not registering when resuming from a checkpoint
> ---------------------------------------------------------------------
>
> Key: BEAM-10481
> URL: https://issues.apache.org/jira/browse/BEAM-10481
> Project: Beam
> Issue Type: Bug
> Components: runner-spark
> Affects Versions: 2.22.0
> Reporter: Mani Kolbe
> Priority: P2
> Attachments: image-2020-07-14-10-55-11-855.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
>
> I am running in an beam application in streaming mode with jobName and
> checkPointDir configured. When I recover application from a planned stop, I
> am getting failure.
>
> I did some investigation and noticed that the accumulator is not getting
> registered on recovering checkpoint scenario.
>
> Correct me if I am wrong. If you see the code in the screenshot below from
> MetricsAccumulator class on beam v2.22.0, you can see new instance of
> MetricsContainerStepMapAccumulator is getting registered on line#64. But if a
> recovered value is present, it constructs a new instance with the recovered
> value (Line#78). But this new accumulator instance is not getting registered.
> This is forcing Spark Driver to throw exception:
> _java.lang.UnsupportedOperationException: Accumulator must be registered
> before send to executor_
> !image-2020-07-14-10-55-11-855.png|width=706,height=351!
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)