[ 
https://issues.apache.org/jira/browse/FLINK-10135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642335#comment-16642335
 ] 

ASF GitHub Bot commented on FLINK-10135:
----------------------------------------

zentol commented on a change in pull request #6702: [FLINK-10135] The 
JobManager does not report the cluster-level metrics
URL: https://github.com/apache/flink/pull/6702#discussion_r223466964
 
 

 ##########
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java
 ##########
 @@ -909,6 +926,70 @@ private void clearDispatcherState() {
                terminateJobManagerRunners();
        }
 
+       private void instantiateJobManagerOverviewMetrics(MetricGroup jobManagerMetricGroup) {
+               long defaultTimeoutValue = configuration.getLong(RestOptions.CONNECTION_TIMEOUT);
+               Time defaultRequestTimeout = Time.milliseconds(defaultTimeoutValue);
+
+               jobManagerMetricGroup.gauge("taskSlotsAvailable", () -> {
+                       try {
+                               return (long) resourceManagerGateway
+                                       .requestResourceOverview(defaultRequestTimeout)
+                                       .get()
+                                       .getNumberFreeSlots();
+                       } catch (InterruptedException e) {
+                               Thread.currentThread().interrupt();
+                       } catch (ExecutionException e) {
+                               log.error("Request resource overview occurs exception.", e);
+                       }
+
+                       return 0L;
+               });
+
+               jobManagerMetricGroup.gauge("taskSlotsTotal", () -> {
+                       try {
+                               return (long) resourceManagerGateway
+                                       .requestResourceOverview(defaultRequestTimeout)
+                                       .get()
+                                       .getNumberRegisteredSlots();
+                       } catch (InterruptedException e) {
+                               Thread.currentThread().interrupt();
+                       } catch (ExecutionException e) {
+                               log.error("Request resource overview occurs exception.", e);
+                       }
+
+                       return 0L;
+               });
+
+               jobManagerMetricGroup.gauge("numRegisteredTaskManagers", () -> {
+                       try {
+                               return (long) resourceManagerGateway
+                                       .requestResourceOverview(defaultRequestTimeout)
+                                       .get()
+                                       .getNumberTaskManagers();
+                       } catch (InterruptedException e) {
+                               Thread.currentThread().interrupt();
+                       } catch (ExecutionException e) {
+                               log.error("Request resource overview occurs exception.", e);
+                       }
+
+                       return 0L;
+               });
+
+               jobManagerMetricGroup.gauge("numRunningJobs", () -> {
+                       try {
+                               return (long) requestOverviewForAllJobs(defaultRequestTimeout)
 
 Review comment:
   This results in a wave of RPC calls to every single JM. Would it not be 
feasible to simply count the number of job IDs contained in 
`jobManagerRunnerFutures`?
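
   To illustrate the suggestion, here is a minimal sketch of a gauge that only
reads local state. It is not the Dispatcher's actual code: the class name, the
`registerJob`/`removeJob` helpers, the `String` job IDs, and the use of a
`ConcurrentHashMap` are assumptions made for the example. Only the field name
`jobManagerRunnerFutures` comes from the comment above, and its real type may
differ.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical stand-in for the Dispatcher; not Flink's actual class.
class RunningJobsGaugeSketch {

    // Assumed shape: job ID -> future of that job's runner. The key and
    // value types here are placeholders chosen for the example.
    private final Map<String, CompletableFuture<Object>> jobManagerRunnerFutures =
            new ConcurrentHashMap<>();

    // Hypothetical helper: called when a job is submitted.
    void registerJob(String jobId) {
        jobManagerRunnerFutures.put(jobId, new CompletableFuture<>());
    }

    // Hypothetical helper: called when a job reaches a terminal state.
    void removeJob(String jobId) {
        jobManagerRunnerFutures.remove(jobId);
    }

    // Gauge body: counts locally tracked job IDs, so reading the metric
    // issues no RPC and never blocks on a remote future.
    Supplier<Long> numRunningJobsGauge() {
        return () -> (long) jobManagerRunnerFutures.size();
    }

    public static void main(String[] args) {
        RunningJobsGaugeSketch dispatcher = new RunningJobsGaugeSketch();
        Supplier<Long> gauge = dispatcher.numRunningJobsGauge();

        dispatcher.registerJob("job-a");
        dispatcher.registerJob("job-b");
        System.out.println(gauge.get()); // prints 2

        dispatcher.removeJob("job-a");
        System.out.println(gauge.get()); // prints 1
    }
}
```

   In the real Dispatcher the supplier would be passed to
`jobManagerMetricGroup.gauge("numRunningJobs", ...)` as in the diff above.
Whether the map can be read safely from the metrics reporter thread would
still need checking.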

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> The JobManager doesn't report the cluster-level metrics
> -------------------------------------------------------
>
>                 Key: FLINK-10135
>                 URL: https://issues.apache.org/jira/browse/FLINK-10135
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager, Metrics
>    Affects Versions: 1.5.0, 1.6.0, 1.7.0
>            Reporter: Joey Echeverria
>            Assignee: vinoyang
>            Priority: Critical
>              Labels: pull-request-available
>
> In [the documentation for 
> metrics|https://ci.apache.org/projects/flink/flink-docs-release-1.5/monitoring/metrics.html#cluster]
>  in the Flink 1.5.0 release, it says that the following metrics are reported 
> by the JobManager:
> {noformat}
> numRegisteredTaskManagers
> numRunningJobs
> taskSlotsAvailable
> taskSlotsTotal
> {noformat}
> In the job manager REST endpoint 
> ({{http://<job-manager>:8081/jobmanager/metrics}}), those metrics don't 
> appear.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
