[jira] [Commented] (YUNIKORN-829) Produce metrics on queue-level resource utilization

Weiwei Yang (Jira) Thu, 26 Aug 2021 21:47:09 -0700


    [ 
https://issues.apache.org/jira/browse/YUNIKORN-829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405581#comment-17405581
 ]


Weiwei Yang commented on YUNIKORN-829:
--------------------------------------

Hi [~yuchaoran2011], for the metrics collection, if we build that into the 
k8shim, there will be no concern about this being a k8s-only solution. Ideally, 
we can define a metrics collector interface, and one implementation can be 
metrics-server-based. Please let us know if you have any further ideas about 
this. Thanks.

> Produce metrics on queue-level resource utilization
> ---------------------------------------------------
>
>                 Key: YUNIKORN-829
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-829
>             Project: Apache YuniKorn
>          Issue Type: New Feature
>          Components: core - scheduler, shim - kubernetes
>            Reporter: Chaoran Yu
>            Priority: Major
>
> YuniKorn already has metrics on the resources requested/allocated for each 
> queue. But we have no visibility into how much of the allocated resources are 
> actually being used. Take Spark as an example: an under-optimized job may 
> request 1 TB of total executor memory but the actual processing logic only 
> uses 100 GB. This has the consequence that other jobs might not be able to 
> fit in the queue. Having a metric that shows the real utilization will help 
> members of a queue better understand their job characteristics and optimize 
> the jobs.
> K8s metrics server has metrics on real utilization. YK may be able to perform 
> some aggregations to arrive at the stats at the queue level. This is a 
> k8s-specific solution though.



