[GitHub] [incubator-yunikorn-core] chenya-zhang commented on a change in pull request #352: [YUNIKORN-721]Improve YuniKorn core's queue-level and scheduler metrics

GitBox Fri, 24 Dec 2021 09:21:30 -0800


chenya-zhang commented on a change in pull request #352:
URL: https://github.com/apache/incubator-yunikorn-core/pull/352#discussion_r775057744




##########
File path: pkg/metrics/queue.go
##########
@@ -28,60 +28,38 @@ import (
        "github.com/apache/incubator-yunikorn-core/pkg/log"
 )
 
+// QueueMetrics to declare queue metrics
 type QueueMetrics struct {
-       // metrics related to app
-       appMetrics *prometheus.CounterVec
-
-       // metrics related to resource
-       usedResourceMetrics      *prometheus.GaugeVec
-       pendingResourceMetrics   *prometheus.GaugeVec
-       availableResourceMetrics *prometheus.GaugeVec
+       appMetrics      *prometheus.GaugeVec
+       ResourceMetrics *prometheus.GaugeVec
 }
 
-func forQueue(name string) CoreQueueMetrics {
+// InitQueueMetrics to initialize queue metrics
+func InitQueueMetrics(name string) CoreQueueMetrics {
        q := &QueueMetrics{}
 
-       // Queue Metrics
-       q.appMetrics = prometheus.NewCounterVec(
-               prometheus.CounterOpts{
-                       Namespace: Namespace,
-                       Subsystem: substituteQueueName(name),
-                       Name:      "app_metrics",
-                       Help:      "Application Metrics",
-               }, []string{"state"})
-
-       q.usedResourceMetrics = prometheus.NewGaugeVec(
-               prometheus.GaugeOpts{
-                       Namespace: Namespace,
-                       Subsystem: substituteQueueName(name),
-                       Name:      "used_resource",
-                       Help:      "Queue used resource",
-               }, []string{"resource"})
-
-       q.pendingResourceMetrics = prometheus.NewGaugeVec(
+       q.appMetrics = prometheus.NewGaugeVec(
                prometheus.GaugeOpts{
                        Namespace: Namespace,
                        Subsystem: substituteQueueName(name),
-                       Name:      "pending_resource",
-                       Help:      "Queue pending resource",
-               }, []string{"resource"})
+                       Name:      "queue_app",
+                       Help:      "Queue application metrics. State of the application includes `running`.",
+               }, []string{"state"})

Review comment:
       Similar to the above, I think it is more meaningful to count the apps 
currently running in a queue, rather than all the apps that have run or been 
accepted, rejected, or completed in that queue since the scheduler last 
restarted. I am not sure what the business value of the cumulative count is.
   
   At the scheduler/cluster level, we already have the metrics 
`applicationSubmission` and `application`, which count apps that have run or 
been accepted, rejected, or completed. I think these satisfy our operational 
needs, since all queues in a cluster share the same scheduler.
   
   One thing to note: at the queue level, these metrics were never 
implemented, so this is not a backward-incompatible change. I will help 
implement them in future PRs.
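
To illustrate the distinction drawn above between "currently running" and 
"everything since the last restart", here is a minimal, stdlib-only sketch. It 
does not use the Prometheus client and is not the code in this PR; the type 
`queueAppStats` and its methods are hypothetical names chosen for illustration. 
The `running` field behaves like a gauge (it can go up and down), while 
`totalStarted` behaves like a counter (it only grows until the process 
restarts).

```go
package main

import "sync"

// queueAppStats is a hypothetical stand-in for per-queue app metrics.
// It is an illustration only, not part of yunikorn-core.
type queueAppStats struct {
	mu           sync.Mutex
	running      int // gauge-like: apps running in the queue right now
	totalStarted int // counter-like: apps started since the process began
}

// appStarted records that an app began running in the queue.
func (s *queueAppStats) appStarted() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.running++
	s.totalStarted++
}

// appCompleted records that a running app finished.
// Only the gauge-like value drops; the cumulative count is unchanged.
func (s *queueAppStats) appCompleted() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.running > 0 {
		s.running--
	}
}

// snapshot returns both values under the lock.
func (s *queueAppStats) snapshot() (running, totalStarted int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.running, s.totalStarted
}
```

After two calls to `appStarted` and one to `appCompleted`, `snapshot` reports 
one running app and two started in total. The first number is what the 
`running` state of a queue-level gauge would expose; the second is the kind of 
cumulative figure that the scheduler-level metrics already cover.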




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

