[
https://issues.apache.org/jira/browse/BEAM-9065?focusedWorklogId=376253&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-376253
]
ASF GitHub Bot logged work on BEAM-9065:
----------------------------------------
Author: ASF GitHub Bot
Created on: 23/Jan/20 13:57
Start Date: 23/Jan/20 13:57
Worklog Time Spent: 10m
Work Description: echauchot commented on issue #10530: [BEAM-9065] Reset
MetricsContainerStepMapAccumulator upon initialization of MetricsAccumulator
singleton
URL: https://github.com/apache/beam/pull/10530#issuecomment-577692249
@iemejia I did not manage to reproduce the failing behavior outside of a
Spark cluster. Indeed, the test below passes even without the fix:
` public void testMetricsAreResetBetweenRuns() {
    PipelineResult result1 = runPipelineWithMetrics(1);
    PipelineResult result2 = runPipelineWithMetrics(2);
    MetricQueryResults metrics1 = queryTestMetrics(result1);
    MetricQueryResults metrics2 = queryTestMetrics(result2);
    assertCounterMetrics(metrics1, false);
    assertCounterMetrics(metrics2, false);
  }`
However, the fix was tested inside a Spark cluster, where it resolves the
failing behavior. ValidatesRunner tests pass for both the Spark and Spark
Structured Streaming runners, and their coverage seems correct: it includes
user tests of all metric types plus low-level tests (such as cell manipulation).
Can we merge this PR?
Also: I have removed the last commit, which added the above test, because it
passes even without the fix.
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 376253)
Time Spent: 1h (was: 50m)
> Spark runner accumulates metrics (incorrectly) between runs
> -----------------------------------------------------------
>
> Key: BEAM-9065
> URL: https://issues.apache.org/jira/browse/BEAM-9065
> Project: Beam
> Issue Type: Bug
> Components: runner-spark
> Reporter: Etienne Chauchot
> Assignee: Etienne Chauchot
> Priority: Major
> Fix For: 2.20.0
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> When pipeline.run() is called, MetricsAccumulator (a wrapper around the
> MetricsContainerStepMap Spark accumulator) is initialized. Spark needs this
> class to be a singleton for failover. The problem is that when several
> pipelines are run inside the same JVM, the initialization of the
> MetricsAccumulator singleton does not reset the underlying Spark accumulator,
> causing metrics to be accumulated between runs.
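
The behavior described above can be illustrated with a minimal, self-contained
sketch. It does not use Beam or Spark classes: AccumulatorResetSketch, its
"counter" key, and the methods initWithoutReset, initWithReset, runPipeline and
clearSingleton are hypothetical names chosen for illustration, and a plain
HashMap stands in for the Spark accumulator. It only shows the general pattern
of a JVM-wide singleton that keeps stale values unless its initialization
resets them; it is not the actual change made in the PR.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a singleton wrapping a shared accumulator (not Beam code). */
final class AccumulatorResetSketch {
  // JVM-wide singleton state; a plain map stands in for the Spark accumulator.
  private static Map<String, Long> instance;

  /** Initialization without reset: an existing accumulator is reused as-is. */
  public static void initWithoutReset() {
    if (instance == null) {
      instance = new HashMap<>();
    }
  }

  /** Initialization with reset: values left over from a previous run are cleared. */
  public static void initWithReset() {
    if (instance == null) {
      instance = new HashMap<>();
    } else {
      instance.clear();
    }
  }

  /** Simulates one pipeline run that adds {@code increment} to a single counter. */
  public static long runPipeline(boolean reset, long increment) {
    if (reset) {
      initWithReset();
    } else {
      initWithoutReset();
    }
    instance.merge("counter", increment, Long::sum);
    return instance.get("counter");
  }

  /** Drops the singleton, as if a fresh JVM had been started. */
  public static void clearSingleton() {
    instance = null;
  }

  public static void main(String[] args) {
    clearSingleton();
    long first = runPipeline(false, 1);
    long second = runPipeline(false, 2);
    System.out.println("without reset: " + first + ", " + second); // without reset: 1, 3

    clearSingleton();
    first = runPipeline(true, 1);
    second = runPipeline(true, 2);
    System.out.println("with reset: " + first + ", " + second); // with reset: 1, 2
  }
}
```

In this sketch the second run reports 3 instead of 2 when the singleton is not
reset, which mirrors the "metrics accumulated between runs" symptom.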