Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
zhuqi-lucas commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724776861 ## pkg/metrics/queue.go: ## @@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() { m.incQueueApplications(AppRejected) } +func (m *QueueMetrics) DecQueueApplicationsRejected() { + m.decQueueApplications(AppRejected) +} Review Comment: Got it @chenyulin0719 , remove those unused code in latest PR! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
chenyulin0719 commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724762707 ## pkg/metrics/queue.go: ## @@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() { m.incQueueApplications(AppRejected) } +func (m *QueueMetrics) DecQueueApplicationsRejected() { + m.decQueueApplications(AppRejected) +} Review Comment: I mean removing the below functions: - DecQueueApplicationsRejected() - DecQueueApplicationsFailed() - DecQueueApplicationsCompleted() -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
chenyulin0719 commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724762707 ## pkg/metrics/queue.go: ## @@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() { m.incQueueApplications(AppRejected) } +func (m *QueueMetrics) DecQueueApplicationsRejected() { + m.decQueueApplications(AppRejected) +} Review Comment: I mean remove below functions: - DecQueueApplicationsRejected() - DecQueueApplicationsFailed() - DecQueueApplicationsCompleted() -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
chenyulin0719 commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724753735 ## pkg/metrics/queue.go: ## @@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() { m.incQueueApplications(AppRejected) } +func (m *QueueMetrics) DecQueueApplicationsRejected() { + m.decQueueApplications(AppRejected) +} Review Comment: @wilfred-s Understood, thanks for the explanation. @zhuqi-lucas Not tracking expired app metrics make sense to me. To avoid any confusion, I think we should remove all the decrement function for Rejected/Failed/Completed. WDYT? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
chenyulin0719 commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724753735 ## pkg/metrics/queue.go: ## @@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() { m.incQueueApplications(AppRejected) } +func (m *QueueMetrics) DecQueueApplicationsRejected() { + m.decQueueApplications(AppRejected) +} Review Comment: @wilfred-s Understood, thanks for the explanation. @zhuqi-lucas Not tracking expired metrics make sense to me. To avoid any confusion, I think we should remove all the decrement function for Rejected/Failed/Completed. WDYT? ## pkg/metrics/queue.go: ## @@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() { m.incQueueApplications(AppRejected) } +func (m *QueueMetrics) DecQueueApplicationsRejected() { + m.decQueueApplications(AppRejected) +} Review Comment: @wilfred-s Understood, thanks for the explanation. @zhuqi-lucas Not tracking expired metrics make sense to me. To avoid any confusion, I think we should remove all the decrement function for Rejected/Failed/Completed. WDYT? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
zhuqi-lucas commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724635665 ## pkg/metrics/queue.go: ## @@ -186,10 +216,48 @@ func (m *QueueMetrics) GetQueueApplicationsRejected() (int, error) { return -1, err } +func (m *QueueMetrics) IncQueueApplicationsResuming() { + m.incQueueApplications(AppResuming) +} + +func (m *QueueMetrics) DecQueueApplicationsResuming() { + m.decQueueApplications(AppResuming) +} + +func (m *QueueMetrics) GetQueueApplicationsResuming() (int, error) { + metricDto := &dto.Metric{} + err := m.appMetricsLabel.WithLabelValues(AppResuming).Write(metricDto) + if err == nil { + return int(*metricDto.Gauge.Value), nil + } + return -1, err +} + +func (m *QueueMetrics) IncQueueApplicationsFailing() { + m.incQueueApplications(AppFailing) +} + +func (m *QueueMetrics) DecQueueApplicationsFailing() { + m.decQueueApplications(AppFailing) +} + +func (m *QueueMetrics) GetQueueApplicationsFailing() (int, error) { + metricDto := &dto.Metric{} + err := m.appMetricsLabel.WithLabelValues(AppFailing).Write(metricDto) + if err == nil { + return int(*metricDto.Gauge.Value), nil + } + return -1, err +} + func (m *QueueMetrics) IncQueueApplicationsFailed() { m.incQueueApplications(AppFailed) } +func (m *QueueMetrics) DecQueueApplicationsFailed() { + m.decQueueApplications(AppFailed) +} Review Comment: Addressed in latest PR. ## pkg/metrics/queue.go: ## @@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() { m.incQueueApplications(AppRejected) } +func (m *QueueMetrics) DecQueueApplicationsRejected() { + m.decQueueApplications(AppRejected) +} Review Comment: Addressed in latest PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
zhuqi-lucas commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724635324 ## pkg/metrics/scheduler.go: ## @@ -287,16 +305,49 @@ func (m *SchedulerMetrics) GetTotalApplicationsRunning() (int, error) { return -1, err } +func (m *SchedulerMetrics) IncTotalApplicationsFailing() { + m.application.WithLabelValues(AppFailing).Inc() +} + +func (m *SchedulerMetrics) DecTotalApplicationsFailing() { + m.application.WithLabelValues(AppFailing).Dec() +} + func (m *SchedulerMetrics) IncTotalApplicationsFailed() { m.application.WithLabelValues(AppFailed).Inc() } -func (m *SchedulerMetrics) IncTotalApplicationsCompleted() { - m.application.WithLabelValues(AppCompleted).Inc() +func (m *SchedulerMetrics) IncTotalApplicationsCompleting() { + m.application.WithLabelValues(AppCompleting).Inc() +} + +func (m *SchedulerMetrics) DecTotalApplicationsCompleting() { + m.application.WithLabelValues(AppCompleting).Dec() +} + +func (m *SchedulerMetrics) IncTotalApplicationsResuming() { + m.application.WithLabelValues(AppResuming).Inc() +} + +func (m *SchedulerMetrics) DecTotalApplicationsResuming() { + m.application.WithLabelValues(AppResuming).Dec() } -func (m *SchedulerMetrics) AddTotalApplicationsCompleted(value int) { - m.application.WithLabelValues(AppCompleted).Add(float64(value)) +func (m *SchedulerMetrics) GetTotalApplicationsResuming() (int, error) { + metricDto := &dto.Metric{} + err := m.application.WithLabelValues(AppResuming).Write(metricDto) + if err == nil { + return int(*metricDto.Gauge.Value), nil + } + return -1, err +} + +func (m *SchedulerMetrics) DecTotalApplicationsFailed() { + m.application.WithLabelValues(AppFailed).Dec() +} Review Comment: Addressed in latest PR. ## pkg/metrics/queue.go: ## @@ -199,10 +267,31 @@ func (m *QueueMetrics) GetQueueApplicationsFailed() (int, error) { return -1, err } +func (m *QueueMetrics) IncQueueApplicationsCompleting() { + m.incQueueApplications(AppCompleting) +} + +func (m *QueueMetrics) DecQueueApplicationsCompleting() { + m.decQueueApplications(AppCompleting) +} + +func (m *QueueMetrics) GetQueueApplicationsCompleting() (int, error) { + metricDto := &dto.Metric{} + err := m.appMetricsLabel.WithLabelValues(AppCompleting).Write(metricDto) + if err == nil { + return int(*metricDto.Gauge.Value), nil + } + return -1, err +} + func (m *QueueMetrics) IncQueueApplicationsCompleted() { m.incQueueApplications(AppCompleted) } +func (m *QueueMetrics) DecQueueApplicationsCompleted() { + m.decQueueApplications(AppCompleted) +} Review Comment: Addressed in latest PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
zhuqi-lucas commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724579852 ## pkg/metrics/scheduler.go: ## @@ -287,16 +305,49 @@ func (m *SchedulerMetrics) GetTotalApplicationsRunning() (int, error) { return -1, err } +func (m *SchedulerMetrics) IncTotalApplicationsFailing() { + m.application.WithLabelValues(AppFailing).Inc() +} + +func (m *SchedulerMetrics) DecTotalApplicationsFailing() { + m.application.WithLabelValues(AppFailing).Dec() +} + func (m *SchedulerMetrics) IncTotalApplicationsFailed() { m.application.WithLabelValues(AppFailed).Inc() } -func (m *SchedulerMetrics) IncTotalApplicationsCompleted() { - m.application.WithLabelValues(AppCompleted).Inc() +func (m *SchedulerMetrics) IncTotalApplicationsCompleting() { + m.application.WithLabelValues(AppCompleting).Inc() +} + +func (m *SchedulerMetrics) DecTotalApplicationsCompleting() { + m.application.WithLabelValues(AppCompleting).Dec() +} + +func (m *SchedulerMetrics) IncTotalApplicationsResuming() { + m.application.WithLabelValues(AppResuming).Inc() +} + +func (m *SchedulerMetrics) DecTotalApplicationsResuming() { + m.application.WithLabelValues(AppResuming).Dec() } -func (m *SchedulerMetrics) AddTotalApplicationsCompleted(value int) { - m.application.WithLabelValues(AppCompleted).Add(float64(value)) +func (m *SchedulerMetrics) GetTotalApplicationsResuming() (int, error) { + metricDto := &dto.Metric{} + err := m.application.WithLabelValues(AppResuming).Write(metricDto) + if err == nil { + return int(*metricDto.Gauge.Value), nil + } + return -1, err +} + +func (m *SchedulerMetrics) DecTotalApplicationsFailed() { + m.application.WithLabelValues(AppFailed).Dec() +} Review Comment: I will remove DecTotalApplicationsFailed, it was not used. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
zhuqi-lucas commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724579852 ## pkg/metrics/scheduler.go: ## @@ -287,16 +305,49 @@ func (m *SchedulerMetrics) GetTotalApplicationsRunning() (int, error) { return -1, err } +func (m *SchedulerMetrics) IncTotalApplicationsFailing() { + m.application.WithLabelValues(AppFailing).Inc() +} + +func (m *SchedulerMetrics) DecTotalApplicationsFailing() { + m.application.WithLabelValues(AppFailing).Dec() +} + func (m *SchedulerMetrics) IncTotalApplicationsFailed() { m.application.WithLabelValues(AppFailed).Inc() } -func (m *SchedulerMetrics) IncTotalApplicationsCompleted() { - m.application.WithLabelValues(AppCompleted).Inc() +func (m *SchedulerMetrics) IncTotalApplicationsCompleting() { + m.application.WithLabelValues(AppCompleting).Inc() +} + +func (m *SchedulerMetrics) DecTotalApplicationsCompleting() { + m.application.WithLabelValues(AppCompleting).Dec() +} + +func (m *SchedulerMetrics) IncTotalApplicationsResuming() { + m.application.WithLabelValues(AppResuming).Inc() +} + +func (m *SchedulerMetrics) DecTotalApplicationsResuming() { + m.application.WithLabelValues(AppResuming).Dec() } -func (m *SchedulerMetrics) AddTotalApplicationsCompleted(value int) { - m.application.WithLabelValues(AppCompleted).Add(float64(value)) +func (m *SchedulerMetrics) GetTotalApplicationsResuming() (int, error) { + metricDto := &dto.Metric{} + err := m.application.WithLabelValues(AppResuming).Write(metricDto) + if err == nil { + return int(*metricDto.Gauge.Value), nil + } + return -1, err +} + +func (m *SchedulerMetrics) DecTotalApplicationsFailed() { + m.application.WithLabelValues(AppFailed).Dec() +} Review Comment: Removed DecTotalApplicationsFailed, it was not used. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
zhuqi-lucas commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r172453 ## pkg/metrics/queue.go: ## @@ -186,10 +216,48 @@ func (m *QueueMetrics) GetQueueApplicationsRejected() (int, error) { return -1, err } +func (m *QueueMetrics) IncQueueApplicationsResuming() { + m.incQueueApplications(AppResuming) +} + +func (m *QueueMetrics) DecQueueApplicationsResuming() { + m.decQueueApplications(AppResuming) +} + +func (m *QueueMetrics) GetQueueApplicationsResuming() (int, error) { + metricDto := &dto.Metric{} + err := m.appMetricsLabel.WithLabelValues(AppResuming).Write(metricDto) + if err == nil { + return int(*metricDto.Gauge.Value), nil + } + return -1, err +} + +func (m *QueueMetrics) IncQueueApplicationsFailing() { + m.incQueueApplications(AppFailing) +} + +func (m *QueueMetrics) DecQueueApplicationsFailing() { + m.decQueueApplications(AppFailing) +} + +func (m *QueueMetrics) GetQueueApplicationsFailing() (int, error) { + metricDto := &dto.Metric{} + err := m.appMetricsLabel.WithLabelValues(AppFailing).Write(metricDto) + if err == nil { + return int(*metricDto.Gauge.Value), nil + } + return -1, err +} + func (m *QueueMetrics) IncQueueApplicationsFailed() { m.incQueueApplications(AppFailed) } +func (m *QueueMetrics) DecQueueApplicationsFailed() { + m.decQueueApplications(AppFailed) +} Review Comment: Same as above. ## pkg/metrics/queue.go: ## @@ -199,10 +267,31 @@ func (m *QueueMetrics) GetQueueApplicationsFailed() (int, error) { return -1, err } +func (m *QueueMetrics) IncQueueApplicationsCompleting() { + m.incQueueApplications(AppCompleting) +} + +func (m *QueueMetrics) DecQueueApplicationsCompleting() { + m.decQueueApplications(AppCompleting) +} + +func (m *QueueMetrics) GetQueueApplicationsCompleting() (int, error) { + metricDto := &dto.Metric{} + err := m.appMetricsLabel.WithLabelValues(AppCompleting).Write(metricDto) + if err == nil { + return int(*metricDto.Gauge.Value), nil + } + return -1, err +} + func (m *QueueMetrics) IncQueueApplicationsCompleted() { m.incQueueApplications(AppCompleted) } +func (m *QueueMetrics) DecQueueApplicationsCompleted() { + m.decQueueApplications(AppCompleted) +} Review Comment: Same as above. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
zhuqi-lucas commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724565197 ## pkg/metrics/queue.go: ## @@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() { m.incQueueApplications(AppRejected) } +func (m *QueueMetrics) DecQueueApplicationsRejected() { + m.decQueueApplications(AppRejected) +} Review Comment: @chenyulin0719 @wilfred-s If i make sense right, we don't need to add decrease for following state? Because they are final state besides the expired: 1. Rejected 2. Failed 3. Completed And for expired, i don't think we need to tracking it? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
zhuqi-lucas commented on PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#issuecomment-2301349163 > The queue metrices description is not updated: > > * https://github.com/zhuqi-lucas/yunikorn-core/blob/0d21b97470d3ca748e54ad075231d85d038f5dcc/pkg/metrics/queue.go#L73 > * https://github.com/zhuqi-lucas/yunikorn-core/blob/0d21b97470d3ca748e54ad075231d85d038f5dcc/pkg/metrics/queue.go#L81 > > https://private-user-images.githubusercontent.com/26764036/359786193-1df5bcbf-87cf-40e3-8897-3a6dc7d82db3.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjQyMjU3NjEsIm5iZiI6MTcyNDIyNTQ2MSwicGF0aCI6Ii8yNjc2NDAzNi8zNTk3ODYxOTMtMWRmNWJjYmYtODdjZi00MGUzLTg4OTctM2E2ZGM3ZDgyZGIzLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA4MjElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwODIxVDA3MzEwMVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTcxZjY1MjgwMWE2OTFlM2Y3NWIxN2JlMzE1MzlkMGE2YjVjMGUxNzViMGYwZDY2NTVjZTMwNmMwOWJlN2VkMjAmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.82Le8Od5sIB-KXoRhBGGGMUDD_TLIk1iW3SJYkldTq4";> Good catch, thanks @chenyulin0719 for review! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
zhuqi-lucas commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724565197 ## pkg/metrics/queue.go: ## @@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() { m.incQueueApplications(AppRejected) } +func (m *QueueMetrics) DecQueueApplicationsRejected() { + m.decQueueApplications(AppRejected) +} Review Comment: @chenyulin0719 @wilfred-s If i make sense right, we don't need to add decrease for following state? 1. Rejected 2. Failed 3. Completed And for expired, i don't think we need to tracking it? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
zhuqi-lucas commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724561831 ## pkg/metrics/scheduler.go: ## @@ -84,8 +84,8 @@ func InitSchedulerMetrics() *SchedulerMetrics { Help: "Total number of attempts to allocate containers. State of the attempt includes `allocated`, `rejected`, `error`, `released`", }, []string{"state"}) - s.applicationSubmission = prometheus.NewCounterVec( - prometheus.CounterOpts{ + s.applicationSubmission = prometheus.NewGaugeVec( Review Comment: Thanks @wilfred-s for clarify for this field,it makes sense to me. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
wilfred-s commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724541474 ## pkg/metrics/queue.go: ## @@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() { m.incQueueApplications(AppRejected) } +func (m *QueueMetrics) DecQueueApplicationsRejected() { + m.decQueueApplications(AppRejected) +} Review Comment: Rejected applications should only increase, never decrease: a simple counter. It is a final state that the application only gets removed from as part of the cleanup. We keep it around for a while to make it traceable for the submitter. i.e. submit an application that gets rejected: find it is the list as such. If we would drop it immediately the submitter has no idea what happened. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
wilfred-s commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724534428 ## pkg/metrics/scheduler.go: ## @@ -84,8 +84,8 @@ func InitSchedulerMetrics() *SchedulerMetrics { Help: "Total number of attempts to allocate containers. State of the attempt includes `allocated`, `rejected`, `error`, `released`", }, []string{"state"}) - s.applicationSubmission = prometheus.NewCounterVec( - prometheus.CounterOpts{ + s.applicationSubmission = prometheus.NewGaugeVec( Review Comment: A submission can not be changed. When an application is submitted it will always be submitted. It should be a counter that keeps increasing. It is not a state that we track it is a pure counter from start to shutdown. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
chenyulin0719 commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1724471529 ## pkg/metrics/queue.go: ## @@ -177,6 +203,10 @@ func (m *QueueMetrics) IncQueueApplicationsRejected() { m.incQueueApplications(AppRejected) } +func (m *QueueMetrics) DecQueueApplicationsRejected() { + m.decQueueApplications(AppRejected) +} Review Comment: DecQueueApplicationsRejected() is not called in "leave_Rejected" state? Is it intentional? ## pkg/metrics/queue.go: ## @@ -199,10 +267,31 @@ func (m *QueueMetrics) GetQueueApplicationsFailed() (int, error) { return -1, err } +func (m *QueueMetrics) IncQueueApplicationsCompleting() { + m.incQueueApplications(AppCompleting) +} + +func (m *QueueMetrics) DecQueueApplicationsCompleting() { + m.decQueueApplications(AppCompleting) +} + +func (m *QueueMetrics) GetQueueApplicationsCompleting() (int, error) { + metricDto := &dto.Metric{} + err := m.appMetricsLabel.WithLabelValues(AppCompleting).Write(metricDto) + if err == nil { + return int(*metricDto.Gauge.Value), nil + } + return -1, err +} + func (m *QueueMetrics) IncQueueApplicationsCompleted() { m.incQueueApplications(AppCompleted) } +func (m *QueueMetrics) DecQueueApplicationsCompleted() { + m.decQueueApplications(AppCompleted) +} Review Comment: DecQueueApplicationsCompleted() is not called in "leave_completed" state too. ## pkg/metrics/scheduler.go: ## @@ -287,16 +305,49 @@ func (m *SchedulerMetrics) GetTotalApplicationsRunning() (int, error) { return -1, err } +func (m *SchedulerMetrics) IncTotalApplicationsFailing() { + m.application.WithLabelValues(AppFailing).Inc() +} + +func (m *SchedulerMetrics) DecTotalApplicationsFailing() { + m.application.WithLabelValues(AppFailing).Dec() +} + func (m *SchedulerMetrics) IncTotalApplicationsFailed() { m.application.WithLabelValues(AppFailed).Inc() } -func (m *SchedulerMetrics) IncTotalApplicationsCompleted() { - m.application.WithLabelValues(AppCompleted).Inc() +func (m *SchedulerMetrics) IncTotalApplicationsCompleting() { + m.application.WithLabelValues(AppCompleting).Inc() +} + +func (m *SchedulerMetrics) DecTotalApplicationsCompleting() { + m.application.WithLabelValues(AppCompleting).Dec() +} + +func (m *SchedulerMetrics) IncTotalApplicationsResuming() { + m.application.WithLabelValues(AppResuming).Inc() +} + +func (m *SchedulerMetrics) DecTotalApplicationsResuming() { + m.application.WithLabelValues(AppResuming).Dec() } -func (m *SchedulerMetrics) AddTotalApplicationsCompleted(value int) { - m.application.WithLabelValues(AppCompleted).Add(float64(value)) +func (m *SchedulerMetrics) GetTotalApplicationsResuming() (int, error) { + metricDto := &dto.Metric{} + err := m.application.WithLabelValues(AppResuming).Write(metricDto) + if err == nil { + return int(*metricDto.Gauge.Value), nil + } + return -1, err +} + +func (m *SchedulerMetrics) DecTotalApplicationsFailed() { + m.application.WithLabelValues(AppFailed).Dec() +} Review Comment: nit: Could we move DecTotalApplicationsFailed() below IncTotalApplicationsFailed()? ## pkg/metrics/queue.go: ## @@ -186,10 +216,48 @@ func (m *QueueMetrics) GetQueueApplicationsRejected() (int, error) { return -1, err } +func (m *QueueMetrics) IncQueueApplicationsResuming() { + m.incQueueApplications(AppResuming) +} + +func (m *QueueMetrics) DecQueueApplicationsResuming() { + m.decQueueApplications(AppResuming) +} + +func (m *QueueMetrics) GetQueueApplicationsResuming() (int, error) { + metricDto := &dto.Metric{} + err := m.appMetricsLabel.WithLabelValues(AppResuming).Write(metricDto) + if err == nil { + return int(*metricDto.Gauge.Value), nil + } + return -1, err +} + +func (m *QueueMetrics) IncQueueApplicationsFailing() { + m.incQueueApplications(AppFailing) +} + +func (m *QueueMetrics) DecQueueApplicationsFailing() { + m.decQueueApplications(AppFailing) +} + +func (m *QueueMetrics) GetQueueApplicationsFailing() (int, error) { + metricDto := &dto.Metric{} + err := m.appMetricsLabel.WithLabelValues(AppFailing).Write(metricDto) + if err == nil { + return int(*metricDto.Gauge.Value), nil + } + return -1, err +} + func (m *QueueMetrics) IncQueueApplicationsFailed() { m.incQueueApplications(AppFailed) } +func (m *QueueMetrics) DecQueueApplicationsFailed() { + m.decQueueApplications(AppFailed) +} Review Comment: DecQueueApplicationsFailed() is not called in "leave_Failed" state too. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubsc
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
codecov[bot] commented on PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#issuecomment-2300433659 ## [Codecov](https://app.codecov.io/gh/apache/yunikorn-core/pull/951?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report Attention: Patch coverage is `33.77483%` with `100 lines` in your changes missing coverage. Please review. > Project coverage is 79.06%. Comparing base [(`9f46b81`)](https://app.codecov.io/gh/apache/yunikorn-core/commit/9f46b81698a43798e87f9edaac37631b9f592a1e?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) to head [(`e413ba6`)](https://app.codecov.io/gh/apache/yunikorn-core/commit/e413ba6a7ecc1055cb69b77eeaf663075855218f?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache). > Report is 1 commits behind head on master. | [Files](https://app.codecov.io/gh/apache/yunikorn-core/pull/951?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Patch % | Lines | |---|---|---| | [pkg/metrics/queue.go](https://app.codecov.io/gh/apache/yunikorn-core/pull/951?src=pr&el=tree&filepath=pkg%2Fmetrics%2Fqueue.go&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGtnL21ldHJpY3MvcXVldWUuZ28=) | 0.00% | [56 Missing :warning: ](https://app.codecov.io/gh/apache/yunikorn-core/pull/951?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | | [pkg/metrics/scheduler.go](https://app.codecov.io/gh/apache/yunikorn-core/pull/951?src=pr&el=tree&filepath=pkg%2Fmetrics%2Fscheduler.go&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGtnL21ldHJpY3Mvc2NoZWR1bGVyLmdv) | 4.34% | [44 Missing :warning: ](https://app.codecov.io/gh/apache/yunikorn-core/pull/951?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Additional details and impacted files ```diff @@Coverage Diff @@ ## master #951 +/- ## == - Coverage 79.56% 79.06% -0.50% == Files 97 97 Lines 1227512393 +118 == + Hits 9767 9799 +32 - Misses 2231 2316 +85 - Partials 277 278 +1 ``` [:umbrella: View full report in Codecov by Sentry](https://app.codecov.io/gh/apache/yunikorn-core/pull/951?dropdown=coverage&src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache). :loudspeaker: Have feedback on the report? [Share it here](https://about.codecov.io/codecov-pr-comment-feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
zhuqi-lucas commented on PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#issuecomment-2300374500 > Please run `make lint` before submitting PRs. Fix go lint now. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
craigcondit commented on PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#issuecomment-2299389949 Please run `make lint` before submitting PRs. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
zhuqi-lucas commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1722813874 ## pkg/metrics/scheduler.go: ## @@ -84,8 +84,8 @@ func InitSchedulerMetrics() *SchedulerMetrics { Help: "Total number of attempts to allocate containers. State of the attempt includes `allocated`, `rejected`, `error`, `released`", }, []string{"state"}) - s.applicationSubmission = prometheus.NewCounterVec( - prometheus.CounterOpts{ + s.applicationSubmission = prometheus.NewGaugeVec( Review Comment: Change to gauge because, we want to support dec also -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
zhuqi-lucas commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1722812390 ## pkg/metrics/scheduler.go: ## @@ -274,10 +296,6 @@ func (m *SchedulerMetrics) DecTotalApplicationsRunning() { m.application.WithLabelValues(AppRunning).Dec() } -func (m *SchedulerMetrics) SubTotalApplicationsRunning(value int) { - m.application.WithLabelValues(AppRunning).Sub(float64(value)) Review Comment: Same with above, remove this duplicated logic. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [YUNIKORN-2818] State of appMetrics of Queue Metrics is incomplete an… [yunikorn-core]
zhuqi-lucas commented on code in PR #951: URL: https://github.com/apache/yunikorn-core/pull/951#discussion_r1722811773 ## pkg/metrics/scheduler.go: ## @@ -241,27 +241,49 @@ func (m *SchedulerMetrics) GetSchedulingErrors() (int, error) { return -1, err } +func (m *SchedulerMetrics) IncTotalApplicationsNew() { + m.applicationSubmission.WithLabelValues(AppNew).Inc() +} + +func (m *SchedulerMetrics) DecTotalApplicationsNew() { + m.applicationSubmission.WithLabelValues(AppNew).Dec() +} + +func (m *SchedulerMetrics) GetTotalApplicationsNew() (int, error) { + metricDto := &dto.Metric{} + err := m.applicationSubmission.WithLabelValues(AppNew).Write(metricDto) + if err == nil { + return int(*metricDto.Gauge.Value), nil + } + return -1, err +} + func (m *SchedulerMetrics) IncTotalApplicationsAccepted() { m.applicationSubmission.WithLabelValues(AppAccepted).Inc() } -func (m *SchedulerMetrics) AddTotalApplicationsAccepted(value int) { - m.applicationSubmission.WithLabelValues(AppAccepted).Add(float64(value)) +func (m *SchedulerMetrics) DecTotalApplicationsAccepted() { + m.applicationSubmission.WithLabelValues(AppAccepted).Dec() } -func (m *SchedulerMetrics) IncTotalApplicationsRejected() { - m.applicationSubmission.WithLabelValues(ContainerRejected).Inc() +func (m *SchedulerMetrics) GetTotalApplicationsAccepted() (int, error) { + metricDto := &dto.Metric{} + err := m.applicationSubmission.WithLabelValues(AppAccepted).Write(metricDto) + if err == nil { + return int(*metricDto.Gauge.Value), nil + } + return -1, err } -func (m *SchedulerMetrics) AddTotalApplicationsRejected(value int) { - m.applicationSubmission.WithLabelValues(ContainerRejected).Add(float64(value)) Review Comment: The name was wrong, it should be applicationReject, and also remove this logic because it will be all handled in state machine changes, it will be duplicated. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org