zhuzhurk commented on a change in pull request #10082: [FLINK-14164][runtime]
Add a counter ‘numberOfRestarts’ to show number of restarts
URL: https://github.com/apache/flink/pull/10082#discussion_r343562150
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/SchedulerBase.java
##########
@@ -193,6 +197,11 @@ public SchedulerBase(
this.failoverTopology = executionGraph.getFailoverTopology();
this.inputsLocationsRetriever = new ExecutionGraphToInputsLocationsRetrieverAdapter(executionGraph);
+
+ // Use the counter from execution graph to avoid modifying execution graph interfaces
+ // Can be a new SimpleCounter created here after the legacy scheduler is removed.
+ this.numberOfRestartsCounter = executionGraph.getNumberOfRestartsCounter();
+ jobManagerJobMetricGroup.meter(NUMBER_OF_RESTARTS, new MeterView(numberOfRestartsCounter));
Review comment:
Thanks @zentol for the detailed explanation!
I agree that exposing such a low-frequency and inaccurate meter is not good.
But I'm not sure whether it's good to introduce HourMeter since, as you
said, it violates the meter interface. There may also be a requirement for a
MinuteMeter if we go this way.
Maybe we should revive the
[discussion](https://lists.apache.org/thread.html/6ed95eb6a91168dba09901e158bc1b6f4b08f1e176db4641f79de765@%3Cdev.flink.apache.org%3E).
The previous conclusion was that a meter should be added. We need to
update that thread with our latest thoughts to see how we can reach a consensus.
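To illustrate the concern about a low-frequency meter, here is a minimal standalone sketch of the per-second averaging a meter performs. It is not Flink code: the class and method names are hypothetical, and the 60-second window is an assumption made for the example.

```java
// Standalone sketch; does not depend on Flink. All names here are hypothetical.
class RestartRateSketch {

    /** Average events per second over a window, the unit a per-second meter reports. */
    static double perSecondRate(long eventsInWindow, long windowSeconds) {
        if (windowSeconds <= 0) {
            throw new IllegalArgumentException("windowSeconds must be positive");
        }
        return (double) eventsInWindow / windowSeconds;
    }

    public static void main(String[] args) {
        // One restart observed within the assumed 60-second window.
        System.out.println(perSecondRate(1, 60));
        // The same restart, once it has fallen out of the window, reads as zero.
        System.out.println(perSecondRate(0, 60));
    }
}
```

A single restart shows up as a small fraction per second and then drops back to zero once it leaves the window, which is hard to read or alert on. That is the problem with exposing restarts as a per-second meter.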