Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-21 Thread via GitHub


zhuqi-lucas commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724776861


##
pkg/metrics/queue.go:
##
@@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() {
m.incQueueApplications(AppRejected)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsRejected() {
+   m.decQueueApplications(AppRejected)
+}

Review Comment:
   Got it @chenyulin0719 , remove those unused code in latest PR!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-21 Thread via GitHub


chenyulin0719 commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724762707


##
pkg/metrics/queue.go:
##
@@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() {
m.incQueueApplications(AppRejected)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsRejected() {
+   m.decQueueApplications(AppRejected)
+}

Review Comment:
   I mean removing the below functions: 
   - DecQueueApplicationsRejected()
   - DecQueueApplicationsFailed()
   - DecQueueApplicationsCompleted()
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-21 Thread via GitHub


chenyulin0719 commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724762707


##
pkg/metrics/queue.go:
##
@@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() {
m.incQueueApplications(AppRejected)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsRejected() {
+   m.decQueueApplications(AppRejected)
+}

Review Comment:
   I mean remove below functions: 
   - DecQueueApplicationsRejected()
   - DecQueueApplicationsFailed()
   - DecQueueApplicationsCompleted()
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-21 Thread via GitHub


chenyulin0719 commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724753735


##
pkg/metrics/queue.go:
##
@@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() {
m.incQueueApplications(AppRejected)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsRejected() {
+   m.decQueueApplications(AppRejected)
+}

Review Comment:
   @wilfred-s Understood, thanks for the explanation.
   
   @zhuqi-lucas Not tracking expired app metrics make sense to me. 
   To avoid any confusion, I think we should remove all the decrement function 
for Rejected/Failed/Completed.  WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-21 Thread via GitHub


chenyulin0719 commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724753735


##
pkg/metrics/queue.go:
##
@@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() {
m.incQueueApplications(AppRejected)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsRejected() {
+   m.decQueueApplications(AppRejected)
+}

Review Comment:
   @wilfred-s Understood, thanks for the explanation.
   @zhuqi-lucas Not tracking expired metrics make sense to me. 
   To avoid any confusion, I think we should remove all the decrement function 
for Rejected/Failed/Completed.  WDYT?



##
pkg/metrics/queue.go:
##
@@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() {
m.incQueueApplications(AppRejected)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsRejected() {
+   m.decQueueApplications(AppRejected)
+}

Review Comment:
   @wilfred-s Understood, thanks for the explanation.
   
   @zhuqi-lucas Not tracking expired metrics make sense to me. 
   To avoid any confusion, I think we should remove all the decrement function 
for Rejected/Failed/Completed.  WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-21 Thread via GitHub


zhuqi-lucas commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724635665


##
pkg/metrics/queue.go:
##
@@ -186,10 +216,48 @@ func (m *QueueMetrics) GetQueueApplicationsRejected() 
(int, error) {
return -1, err
 }
 
+func (m *QueueMetrics) IncQueueApplicationsResuming() {
+   m.incQueueApplications(AppResuming)
+}
+
+func (m *QueueMetrics) DecQueueApplicationsResuming() {
+   m.decQueueApplications(AppResuming)
+}
+
+func (m *QueueMetrics) GetQueueApplicationsResuming() (int, error) {
+   metricDto := &dto.Metric{}
+   err := m.appMetricsLabel.WithLabelValues(AppResuming).Write(metricDto)
+   if err == nil {
+   return int(*metricDto.Gauge.Value), nil
+   }
+   return -1, err
+}
+
+func (m *QueueMetrics) IncQueueApplicationsFailing() {
+   m.incQueueApplications(AppFailing)
+}
+
+func (m *QueueMetrics) DecQueueApplicationsFailing() {
+   m.decQueueApplications(AppFailing)
+}
+
+func (m *QueueMetrics) GetQueueApplicationsFailing() (int, error) {
+   metricDto := &dto.Metric{}
+   err := m.appMetricsLabel.WithLabelValues(AppFailing).Write(metricDto)
+   if err == nil {
+   return int(*metricDto.Gauge.Value), nil
+   }
+   return -1, err
+}
+
 func (m *QueueMetrics) IncQueueApplicationsFailed() {
m.incQueueApplications(AppFailed)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsFailed() {
+   m.decQueueApplications(AppFailed)
+}

Review Comment:
   Addressed in latest PR.



##
pkg/metrics/queue.go:
##
@@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() {
m.incQueueApplications(AppRejected)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsRejected() {
+   m.decQueueApplications(AppRejected)
+}

Review Comment:
   Addressed in latest PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-21 Thread via GitHub


zhuqi-lucas commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724635324


##
pkg/metrics/scheduler.go:
##
@@ -287,16 +305,49 @@ func (m *SchedulerMetrics) GetTotalApplicationsRunning() 
(int, error) {
return -1, err
 }
 
+func (m *SchedulerMetrics) IncTotalApplicationsFailing() {
+   m.application.WithLabelValues(AppFailing).Inc()
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsFailing() {
+   m.application.WithLabelValues(AppFailing).Dec()
+}
+
 func (m *SchedulerMetrics) IncTotalApplicationsFailed() {
m.application.WithLabelValues(AppFailed).Inc()
 }
 
-func (m *SchedulerMetrics) IncTotalApplicationsCompleted() {
-   m.application.WithLabelValues(AppCompleted).Inc()
+func (m *SchedulerMetrics) IncTotalApplicationsCompleting() {
+   m.application.WithLabelValues(AppCompleting).Inc()
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsCompleting() {
+   m.application.WithLabelValues(AppCompleting).Dec()
+}
+
+func (m *SchedulerMetrics) IncTotalApplicationsResuming() {
+   m.application.WithLabelValues(AppResuming).Inc()
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsResuming() {
+   m.application.WithLabelValues(AppResuming).Dec()
 }
 
-func (m *SchedulerMetrics) AddTotalApplicationsCompleted(value int) {
-   m.application.WithLabelValues(AppCompleted).Add(float64(value))
+func (m *SchedulerMetrics) GetTotalApplicationsResuming() (int, error) {
+   metricDto := &dto.Metric{}
+   err := m.application.WithLabelValues(AppResuming).Write(metricDto)
+   if err == nil {
+   return int(*metricDto.Gauge.Value), nil
+   }
+   return -1, err
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsFailed() {
+   m.application.WithLabelValues(AppFailed).Dec()
+}

Review Comment:
   Addressed in latest PR.



##
pkg/metrics/queue.go:
##
@@ -199,10 +267,31 @@ func (m *QueueMetrics) GetQueueApplicationsFailed() (int, 
error) {
return -1, err
 }
 
+func (m *QueueMetrics) IncQueueApplicationsCompleting() {
+   m.incQueueApplications(AppCompleting)
+}
+
+func (m *QueueMetrics) DecQueueApplicationsCompleting() {
+   m.decQueueApplications(AppCompleting)
+}
+
+func (m *QueueMetrics) GetQueueApplicationsCompleting() (int, error) {
+   metricDto := &dto.Metric{}
+   err := m.appMetricsLabel.WithLabelValues(AppCompleting).Write(metricDto)
+   if err == nil {
+   return int(*metricDto.Gauge.Value), nil
+   }
+   return -1, err
+}
+
 func (m *QueueMetrics) IncQueueApplicationsCompleted() {
m.incQueueApplications(AppCompleted)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsCompleted() {
+   m.decQueueApplications(AppCompleted)
+}

Review Comment:
   Addressed in latest PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-21 Thread via GitHub


zhuqi-lucas commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724579852


##
pkg/metrics/scheduler.go:
##
@@ -287,16 +305,49 @@ func (m *SchedulerMetrics) GetTotalApplicationsRunning() 
(int, error) {
return -1, err
 }
 
+func (m *SchedulerMetrics) IncTotalApplicationsFailing() {
+   m.application.WithLabelValues(AppFailing).Inc()
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsFailing() {
+   m.application.WithLabelValues(AppFailing).Dec()
+}
+
 func (m *SchedulerMetrics) IncTotalApplicationsFailed() {
m.application.WithLabelValues(AppFailed).Inc()
 }
 
-func (m *SchedulerMetrics) IncTotalApplicationsCompleted() {
-   m.application.WithLabelValues(AppCompleted).Inc()
+func (m *SchedulerMetrics) IncTotalApplicationsCompleting() {
+   m.application.WithLabelValues(AppCompleting).Inc()
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsCompleting() {
+   m.application.WithLabelValues(AppCompleting).Dec()
+}
+
+func (m *SchedulerMetrics) IncTotalApplicationsResuming() {
+   m.application.WithLabelValues(AppResuming).Inc()
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsResuming() {
+   m.application.WithLabelValues(AppResuming).Dec()
 }
 
-func (m *SchedulerMetrics) AddTotalApplicationsCompleted(value int) {
-   m.application.WithLabelValues(AppCompleted).Add(float64(value))
+func (m *SchedulerMetrics) GetTotalApplicationsResuming() (int, error) {
+   metricDto := &dto.Metric{}
+   err := m.application.WithLabelValues(AppResuming).Write(metricDto)
+   if err == nil {
+   return int(*metricDto.Gauge.Value), nil
+   }
+   return -1, err
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsFailed() {
+   m.application.WithLabelValues(AppFailed).Dec()
+}

Review Comment:
   I will remove DecTotalApplicationsFailed, it was not used.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-21 Thread via GitHub


zhuqi-lucas commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724579852


##
pkg/metrics/scheduler.go:
##
@@ -287,16 +305,49 @@ func (m *SchedulerMetrics) GetTotalApplicationsRunning() 
(int, error) {
return -1, err
 }
 
+func (m *SchedulerMetrics) IncTotalApplicationsFailing() {
+   m.application.WithLabelValues(AppFailing).Inc()
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsFailing() {
+   m.application.WithLabelValues(AppFailing).Dec()
+}
+
 func (m *SchedulerMetrics) IncTotalApplicationsFailed() {
m.application.WithLabelValues(AppFailed).Inc()
 }
 
-func (m *SchedulerMetrics) IncTotalApplicationsCompleted() {
-   m.application.WithLabelValues(AppCompleted).Inc()
+func (m *SchedulerMetrics) IncTotalApplicationsCompleting() {
+   m.application.WithLabelValues(AppCompleting).Inc()
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsCompleting() {
+   m.application.WithLabelValues(AppCompleting).Dec()
+}
+
+func (m *SchedulerMetrics) IncTotalApplicationsResuming() {
+   m.application.WithLabelValues(AppResuming).Inc()
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsResuming() {
+   m.application.WithLabelValues(AppResuming).Dec()
 }
 
-func (m *SchedulerMetrics) AddTotalApplicationsCompleted(value int) {
-   m.application.WithLabelValues(AppCompleted).Add(float64(value))
+func (m *SchedulerMetrics) GetTotalApplicationsResuming() (int, error) {
+   metricDto := &dto.Metric{}
+   err := m.application.WithLabelValues(AppResuming).Write(metricDto)
+   if err == nil {
+   return int(*metricDto.Gauge.Value), nil
+   }
+   return -1, err
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsFailed() {
+   m.application.WithLabelValues(AppFailed).Dec()
+}

Review Comment:
   Removed DecTotalApplicationsFailed, it was not used.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-21 Thread via GitHub


zhuqi-lucas commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r172453


##
pkg/metrics/queue.go:
##
@@ -186,10 +216,48 @@ func (m *QueueMetrics) GetQueueApplicationsRejected() 
(int, error) {
return -1, err
 }
 
+func (m *QueueMetrics) IncQueueApplicationsResuming() {
+   m.incQueueApplications(AppResuming)
+}
+
+func (m *QueueMetrics) DecQueueApplicationsResuming() {
+   m.decQueueApplications(AppResuming)
+}
+
+func (m *QueueMetrics) GetQueueApplicationsResuming() (int, error) {
+   metricDto := &dto.Metric{}
+   err := m.appMetricsLabel.WithLabelValues(AppResuming).Write(metricDto)
+   if err == nil {
+   return int(*metricDto.Gauge.Value), nil
+   }
+   return -1, err
+}
+
+func (m *QueueMetrics) IncQueueApplicationsFailing() {
+   m.incQueueApplications(AppFailing)
+}
+
+func (m *QueueMetrics) DecQueueApplicationsFailing() {
+   m.decQueueApplications(AppFailing)
+}
+
+func (m *QueueMetrics) GetQueueApplicationsFailing() (int, error) {
+   metricDto := &dto.Metric{}
+   err := m.appMetricsLabel.WithLabelValues(AppFailing).Write(metricDto)
+   if err == nil {
+   return int(*metricDto.Gauge.Value), nil
+   }
+   return -1, err
+}
+
 func (m *QueueMetrics) IncQueueApplicationsFailed() {
m.incQueueApplications(AppFailed)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsFailed() {
+   m.decQueueApplications(AppFailed)
+}

Review Comment:
   Same as above.



##
pkg/metrics/queue.go:
##
@@ -199,10 +267,31 @@ func (m *QueueMetrics) GetQueueApplicationsFailed() (int, 
error) {
return -1, err
 }
 
+func (m *QueueMetrics) IncQueueApplicationsCompleting() {
+   m.incQueueApplications(AppCompleting)
+}
+
+func (m *QueueMetrics) DecQueueApplicationsCompleting() {
+   m.decQueueApplications(AppCompleting)
+}
+
+func (m *QueueMetrics) GetQueueApplicationsCompleting() (int, error) {
+   metricDto := &dto.Metric{}
+   err := m.appMetricsLabel.WithLabelValues(AppCompleting).Write(metricDto)
+   if err == nil {
+   return int(*metricDto.Gauge.Value), nil
+   }
+   return -1, err
+}
+
 func (m *QueueMetrics) IncQueueApplicationsCompleted() {
m.incQueueApplications(AppCompleted)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsCompleted() {
+   m.decQueueApplications(AppCompleted)
+}

Review Comment:
   Same as above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-21 Thread via GitHub


zhuqi-lucas commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724565197


##
pkg/metrics/queue.go:
##
@@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() {
m.incQueueApplications(AppRejected)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsRejected() {
+   m.decQueueApplications(AppRejected)
+}

Review Comment:
   @chenyulin0719 @wilfred-s 
   If i make sense right, we don't need to add decrease for following state? 
Because they are final state besides the expired:
   1. Rejected
   2. Failed
   3. Completed
   
   And for expired, i don't think we need to tracking it?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-21 Thread via GitHub


zhuqi-lucas commented on PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#issuecomment-2301349163

   > The queue metrices description is not updated:
   > 
   > * 
https://github.com/zhuqi-lucas/yunikorn-core/blob/0d21b97470d3ca748e54ad075231d85d038f5dcc/pkg/metrics/queue.go#L73
   > * 
https://github.com/zhuqi-lucas/yunikorn-core/blob/0d21b97470d3ca748e54ad075231d85d038f5dcc/pkg/metrics/queue.go#L81
   > 
   > https://private-user-images.githubusercontent.com/26764036/359786193-1df5bcbf-87cf-40e3-8897-3a6dc7d82db3.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjQyMjU3NjEsIm5iZiI6MTcyNDIyNTQ2MSwicGF0aCI6Ii8yNjc2NDAzNi8zNTk3ODYxOTMtMWRmNWJjYmYtODdjZi00MGUzLTg4OTctM2E2ZGM3ZDgyZGIzLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA4MjElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwODIxVDA3MzEwMVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTcxZjY1MjgwMWE2OTFlM2Y3NWIxN2JlMzE1MzlkMGE2YjVjMGUxNzViMGYwZDY2NTVjZTMwNmMwOWJlN2VkMjAmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.82Le8Od5sIB-KXoRhBGGGMUDD_TLIk1iW3SJYkldTq4";>
   
   Good catch, thanks @chenyulin0719 for review!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-21 Thread via GitHub


zhuqi-lucas commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724565197


##
pkg/metrics/queue.go:
##
@@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() {
m.incQueueApplications(AppRejected)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsRejected() {
+   m.decQueueApplications(AppRejected)
+}

Review Comment:
   @chenyulin0719 @wilfred-s 
   If i make sense right, we don't need to add decrease for following state?
   1. Rejected
   2. Failed
   3. Completed
   
   And for expired, i don't think we need to tracking it?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-21 Thread via GitHub


zhuqi-lucas commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724561831


##
pkg/metrics/scheduler.go:
##
@@ -84,8 +84,8 @@ func InitSchedulerMetrics() *SchedulerMetrics {
Help:  "Total number of attempts to allocate 
containers. State of the attempt includes `allocated`, `rejected`, `error`, 
`released`",
}, []string{"state"})
 
-   s.applicationSubmission = prometheus.NewCounterVec(
-   prometheus.CounterOpts{
+   s.applicationSubmission = prometheus.NewGaugeVec(

Review Comment:
   Thanks @wilfred-s for clarify for this field,it makes sense to me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-21 Thread via GitHub


wilfred-s commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724541474


##
pkg/metrics/queue.go:
##
@@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() {
m.incQueueApplications(AppRejected)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsRejected() {
+   m.decQueueApplications(AppRejected)
+}

Review Comment:
   Rejected applications should only increase, never decrease: a simple counter.
   It is a final state that the application only gets removed from as part of 
the cleanup. We keep it around for a while to make it traceable for the 
submitter. i.e. submit an application that gets rejected: find it is the list 
as such. If we would drop it immediately the submitter has no idea what 
happened.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-21 Thread via GitHub


wilfred-s commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724534428


##
pkg/metrics/scheduler.go:
##
@@ -84,8 +84,8 @@ func InitSchedulerMetrics() *SchedulerMetrics {
Help:  "Total number of attempts to allocate 
containers. State of the attempt includes `allocated`, `rejected`, `error`, 
`released`",
}, []string{"state"})
 
-   s.applicationSubmission = prometheus.NewCounterVec(
-   prometheus.CounterOpts{
+   s.applicationSubmission = prometheus.NewGaugeVec(

Review Comment:
   A submission can not be changed. When an application is submitted it will 
always be submitted. It should be a counter that keeps increasing. It is not a 
state that we track it is a pure counter from start to shutdown.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-20 Thread via GitHub


chenyulin0719 commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724471529


##
pkg/metrics/queue.go:
##
@@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() {
m.incQueueApplications(AppRejected)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsRejected() {
+   m.decQueueApplications(AppRejected)
+}

Review Comment:
   DecQueueApplicationsRejected() is not called in "leave_Rejected" state? Is 
it intentional?



##
pkg/metrics/queue.go:
##
@@ -199,10 +267,31 @@ func (m *QueueMetrics) GetQueueApplicationsFailed() (int, 
error) {
return -1, err
 }
 
+func (m *QueueMetrics) IncQueueApplicationsCompleting() {
+   m.incQueueApplications(AppCompleting)
+}
+
+func (m *QueueMetrics) DecQueueApplicationsCompleting() {
+   m.decQueueApplications(AppCompleting)
+}
+
+func (m *QueueMetrics) GetQueueApplicationsCompleting() (int, error) {
+   metricDto := &dto.Metric{}
+   err := m.appMetricsLabel.WithLabelValues(AppCompleting).Write(metricDto)
+   if err == nil {
+   return int(*metricDto.Gauge.Value), nil
+   }
+   return -1, err
+}
+
 func (m *QueueMetrics) IncQueueApplicationsCompleted() {
m.incQueueApplications(AppCompleted)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsCompleted() {
+   m.decQueueApplications(AppCompleted)
+}

Review Comment:
   DecQueueApplicationsCompleted() is not called in "leave_completed" state too.



##
pkg/metrics/scheduler.go:
##
@@ -287,16 +305,49 @@ func (m *SchedulerMetrics) GetTotalApplicationsRunning() 
(int, error) {
return -1, err
 }
 
+func (m *SchedulerMetrics) IncTotalApplicationsFailing() {
+   m.application.WithLabelValues(AppFailing).Inc()
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsFailing() {
+   m.application.WithLabelValues(AppFailing).Dec()
+}
+
 func (m *SchedulerMetrics) IncTotalApplicationsFailed() {
m.application.WithLabelValues(AppFailed).Inc()
 }
 
-func (m *SchedulerMetrics) IncTotalApplicationsCompleted() {
-   m.application.WithLabelValues(AppCompleted).Inc()
+func (m *SchedulerMetrics) IncTotalApplicationsCompleting() {
+   m.application.WithLabelValues(AppCompleting).Inc()
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsCompleting() {
+   m.application.WithLabelValues(AppCompleting).Dec()
+}
+
+func (m *SchedulerMetrics) IncTotalApplicationsResuming() {
+   m.application.WithLabelValues(AppResuming).Inc()
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsResuming() {
+   m.application.WithLabelValues(AppResuming).Dec()
 }
 
-func (m *SchedulerMetrics) AddTotalApplicationsCompleted(value int) {
-   m.application.WithLabelValues(AppCompleted).Add(float64(value))
+func (m *SchedulerMetrics) GetTotalApplicationsResuming() (int, error) {
+   metricDto := &dto.Metric{}
+   err := m.application.WithLabelValues(AppResuming).Write(metricDto)
+   if err == nil {
+   return int(*metricDto.Gauge.Value), nil
+   }
+   return -1, err
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsFailed() {
+   m.application.WithLabelValues(AppFailed).Dec()
+}

Review Comment:
   nit: Could we move DecTotalApplicationsFailed() below 
IncTotalApplicationsFailed()?



##
pkg/metrics/queue.go:
##
@@ -186,10 +216,48 @@ func (m *QueueMetrics) GetQueueApplicationsRejected() 
(int, error) {
return -1, err
 }
 
+func (m *QueueMetrics) IncQueueApplicationsResuming() {
+   m.incQueueApplications(AppResuming)
+}
+
+func (m *QueueMetrics) DecQueueApplicationsResuming() {
+   m.decQueueApplications(AppResuming)
+}
+
+func (m *QueueMetrics) GetQueueApplicationsResuming() (int, error) {
+   metricDto := &dto.Metric{}
+   err := m.appMetricsLabel.WithLabelValues(AppResuming).Write(metricDto)
+   if err == nil {
+   return int(*metricDto.Gauge.Value), nil
+   }
+   return -1, err
+}
+
+func (m *QueueMetrics) IncQueueApplicationsFailing() {
+   m.incQueueApplications(AppFailing)
+}
+
+func (m *QueueMetrics) DecQueueApplicationsFailing() {
+   m.decQueueApplications(AppFailing)
+}
+
+func (m *QueueMetrics) GetQueueApplicationsFailing() (int, error) {
+   metricDto := &dto.Metric{}
+   err := m.appMetricsLabel.WithLabelValues(AppFailing).Write(metricDto)
+   if err == nil {
+   return int(*metricDto.Gauge.Value), nil
+   }
+   return -1, err
+}
+
 func (m *QueueMetrics) IncQueueApplicationsFailed() {
m.incQueueApplications(AppFailed)
 }
 
+func (m *QueueMetrics) DecQueueApplicationsFailed() {
+   m.decQueueApplications(AppFailed)
+}

Review Comment:
   DecQueueApplicationsFailed() is not called in "leave_Failed" state too.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubsc

Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-20 Thread via GitHub


codecov[bot] commented on PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#issuecomment-2300433659

   ## 
[Codecov](https://app.codecov.io/gh/apache/yunikorn-core/pull/951?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 Report
   Attention: Patch coverage is `33.77483%` with `100 lines` in your changes 
missing coverage. Please review.
   > Project coverage is 79.06%. Comparing base 
[(`9f46b81`)](https://app.codecov.io/gh/apache/yunikorn-core/commit/9f46b81698a43798e87f9edaac37631b9f592a1e?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 to head 
[(`e413ba6`)](https://app.codecov.io/gh/apache/yunikorn-core/commit/e413ba6a7ecc1055cb69b77eeaf663075855218f?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   > Report is 1 commits behind head on master.
   
   | 
[Files](https://app.codecov.io/gh/apache/yunikorn-core/pull/951?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 | Patch % | Lines |
   |---|---|---|
   | 
[pkg/metrics/queue.go](https://app.codecov.io/gh/apache/yunikorn-core/pull/951?src=pr&el=tree&filepath=pkg%2Fmetrics%2Fqueue.go&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGtnL21ldHJpY3MvcXVldWUuZ28=)
 | 0.00% | [56 Missing :warning: 
](https://app.codecov.io/gh/apache/yunikorn-core/pull/951?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 |
   | 
[pkg/metrics/scheduler.go](https://app.codecov.io/gh/apache/yunikorn-core/pull/951?src=pr&el=tree&filepath=pkg%2Fmetrics%2Fscheduler.go&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGtnL21ldHJpY3Mvc2NoZWR1bGVyLmdv)
 | 4.34% | [44 Missing :warning: 
](https://app.codecov.io/gh/apache/yunikorn-core/pull/951?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 |
   
   Additional details and impacted files
   
   
   ```diff
   @@Coverage Diff @@
   ##   master #951  +/-   ##
   ==
   - Coverage   79.56%   79.06%   -0.50% 
   ==
 Files  97   97  
 Lines   1227512393 +118 
   ==
   + Hits 9767 9799  +32 
   - Misses   2231 2316  +85 
   - Partials  277  278   +1 
   ```
   
   
   
   [:umbrella: View full report in Codecov by 
Sentry](https://app.codecov.io/gh/apache/yunikorn-core/pull/951?dropdown=coverage&src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   
   :loudspeaker: Have feedback on the report? [Share it 
here](https://about.codecov.io/codecov-pr-comment-feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-20 Thread via GitHub


zhuqi-lucas commented on PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#issuecomment-2300374500

   > Please run `make lint` before submitting PRs.
   
   Fix go lint now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-20 Thread via GitHub


craigcondit commented on PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#issuecomment-2299389949

   Please run `make lint` before submitting PRs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-20 Thread via GitHub


zhuqi-lucas commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1722813874


##
pkg/metrics/scheduler.go:
##
@@ -84,8 +84,8 @@ func InitSchedulerMetrics() *SchedulerMetrics {
Help:  "Total number of attempts to allocate 
containers. State of the attempt includes `allocated`, `rejected`, `error`, 
`released`",
}, []string{"state"})
 
-   s.applicationSubmission = prometheus.NewCounterVec(
-   prometheus.CounterOpts{
+   s.applicationSubmission = prometheus.NewGaugeVec(

Review Comment:
   Change to gauge because, we want to support dec also



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-20 Thread via GitHub


zhuqi-lucas commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1722812390


##
pkg/metrics/scheduler.go:
##
@@ -274,10 +296,6 @@ func (m *SchedulerMetrics) DecTotalApplicationsRunning() {
m.application.WithLabelValues(AppRunning).Dec()
 }
 
-func (m *SchedulerMetrics) SubTotalApplicationsRunning(value int) {
-   m.application.WithLabelValues(AppRunning).Sub(float64(value))

Review Comment:
   Same with above, remove this duplicated logic.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]

2024-08-20 Thread via GitHub


zhuqi-lucas commented on code in PR #951:
URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1722811773


##
pkg/metrics/scheduler.go:
##
@@ -241,27 +241,49 @@ func (m *SchedulerMetrics) GetSchedulingErrors() (int, 
error) {
return -1, err
 }
 
+func (m *SchedulerMetrics) IncTotalApplicationsNew() {
+   m.applicationSubmission.WithLabelValues(AppNew).Inc()
+}
+
+func (m *SchedulerMetrics) DecTotalApplicationsNew() {
+   m.applicationSubmission.WithLabelValues(AppNew).Dec()
+}
+
+func (m *SchedulerMetrics) GetTotalApplicationsNew() (int, error) {
+   metricDto := &dto.Metric{}
+   err := m.applicationSubmission.WithLabelValues(AppNew).Write(metricDto)
+   if err == nil {
+   return int(*metricDto.Gauge.Value), nil
+   }
+   return -1, err
+}
+
 func (m *SchedulerMetrics) IncTotalApplicationsAccepted() {
m.applicationSubmission.WithLabelValues(AppAccepted).Inc()
 }
 
-func (m *SchedulerMetrics) AddTotalApplicationsAccepted(value int) {
-   m.applicationSubmission.WithLabelValues(AppAccepted).Add(float64(value))
+func (m *SchedulerMetrics) DecTotalApplicationsAccepted() {
+   m.applicationSubmission.WithLabelValues(AppAccepted).Dec()
 }
 
-func (m *SchedulerMetrics) IncTotalApplicationsRejected() {
-   m.applicationSubmission.WithLabelValues(ContainerRejected).Inc()
+func (m *SchedulerMetrics) GetTotalApplicationsAccepted() (int, error) {
+   metricDto := &dto.Metric{}
+   err := 
m.applicationSubmission.WithLabelValues(AppAccepted).Write(metricDto)
+   if err == nil {
+   return int(*metricDto.Gauge.Value), nil
+   }
+   return -1, err
 }
 
-func (m *SchedulerMetrics) AddTotalApplicationsRejected(value int) {
-   
m.applicationSubmission.WithLabelValues(ContainerRejected).Add(float64(value))

Review Comment:
   The name was wrong, it should be applicationReject, and also remove this 
logic because it will be all handled in state machine changes, it will be 
duplicated.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org