[jira] [Resolved] (YUNIKORN-333) Reduce the number events published to K8s event system when predicate fails

Wilfred Spiegelenburg (Jira) Tue, 16 Mar 2021 01:23:04 -0700


     [ 
https://issues.apache.org/jira/browse/YUNIKORN-333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Wilfred Spiegelenburg resolved YUNIKORN-333.
--------------------------------------------
    Fix Version/s: 0.10
       Resolution: Fixed

A change was added to rate-limit these events to 1 per second.

Committed to branch-0.10
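
For readers who want a picture of what "1 per second" means here, the following is a minimal sketch of a per-key limiter, not the code that was committed. All names (eventLimiter, fakeClock, the event key string) are hypothetical, and it assumes duplicates inside the interval can simply be dropped.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// eventLimiter allows at most one event per key in each interval.
// Hypothetical type for illustration; not the YuniKorn implementation.
type eventLimiter struct {
	mu       sync.Mutex
	interval time.Duration
	now      func() time.Time     // injectable clock so the logic can be tested
	last     map[string]time.Time // time of the last event allowed for each key
}

func newEventLimiter(interval time.Duration, now func() time.Time) *eventLimiter {
	return &eventLimiter{interval: interval, now: now, last: make(map[string]time.Time)}
}

// allow reports whether an event for key may be published now.
func (l *eventLimiter) allow(key string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	t := l.now()
	if prev, ok := l.last[key]; ok && t.Sub(prev) < l.interval {
		return false // duplicate inside the interval: drop it
	}
	l.last[key] = t
	return true
}

// fakeClock is a manually advanced clock used only for the demonstration.
type fakeClock struct{ t time.Time }

func (c *fakeClock) now() time.Time { return c.t }

func (c *fakeClock) advanceMillis(ms int) {
	c.t = c.t.Add(time.Duration(ms) * time.Millisecond)
}

// newSecondLimiter builds a limiter with the 1 second interval from the comment above.
func newSecondLimiter(c *fakeClock) *eventLimiter {
	return newEventLimiter(time.Second, c.now)
}

func main() {
	clock := &fakeClock{t: time.Unix(0, 0)}
	limiter := newSecondLimiter(clock)
	published := 0
	// Simulate 3 seconds of failed predicate checks, one per millisecond.
	for i := 0; i < 3000; i++ {
		if limiter.allow("default/pod-1:PredicateFailed") {
			published++
		}
		clock.advanceMillis(1)
	}
	fmt.Println("published events:", published) // prints "published events: 3"
}
```

Note that this sketch never evicts old keys, so a long-running scheduler would also need some cleanup of the map.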

> Reduce the number events published to K8s event system when predicate fails
> ---------------------------------------------------------------------------
>
>                 Key: YUNIKORN-333
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-333
>             Project: Apache YuniKorn
>          Issue Type: Sub-task
>          Components: core - scheduler
>            Reporter: Adam Antal
>            Assignee: Ting Yao,Huang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.10
>
>
> The problem today is that we are publishing too many events to K8s.
> If you look at the code: 
> https://github.com/apache/incubator-yunikorn-k8shim/blob/86cc199c00d44c1dde71c9f2faf5bc17ff28bbb7/pkg/plugin/predicates/predictor.go#L303-L304,
>  this is called in the core scheduling logic upon each allocation, which 
> could happen thousands of times per second. For example, if a pod cannot be 
> allocated onto any of the nodes because of node taints, after a while we 
> will see a huge number of duplicate events in "kubectl describe pod". So 
> this gives us:
>  - good: we do not lose any events; all of them are pushed to K8s
>  - bad: overhead on the K8s event system (although, thankfully, it 
> aggregates the duplicate events)
> I think there are a few options we can evaluate:
> - Shall we cache such events via the event cache system and then push them 
> at a 1s interval, just like what we have done for the headRoom check?
> - Add a rate-limit mechanism to reduce the number of duplicate events
> Could you please take a look and let me know your thoughts? Thanks!
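
The first option in the description (cache the events, then push them on an interval) could be sketched roughly as below. This is an illustration only: eventBuffer, bufferedEvent, and the message strings are hypothetical and are not taken from the YuniKorn event cache. In a real scheduler, flush would be driven by something like a time.Ticker firing every second, which is left out here to keep the example deterministic.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// bufferedEvent is one distinct message and how often it occurred.
type bufferedEvent struct {
	key   string
	count int
}

// eventBuffer collects events between flushes and collapses duplicates.
// Hypothetical type for illustration only.
type eventBuffer struct {
	mu     sync.Mutex
	counts map[string]int // message -> occurrences since the last flush
}

func newEventBuffer() *eventBuffer {
	return &eventBuffer{counts: make(map[string]int)}
}

// add records one occurrence; it is cheap enough for the scheduling hot path.
func (b *eventBuffer) add(key string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.counts[key]++
}

// flush returns one entry per distinct message, sorted by key,
// and resets the buffer for the next interval.
func (b *eventBuffer) flush() []bufferedEvent {
	b.mu.Lock()
	defer b.mu.Unlock()
	out := make([]bufferedEvent, 0, len(b.counts))
	for k, n := range b.counts {
		out = append(out, bufferedEvent{key: k, count: n})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].key < out[j].key })
	b.counts = make(map[string]int)
	return out
}

func main() {
	buf := newEventBuffer()
	// Thousands of failed predicate checks for the same pod within one interval.
	for i := 0; i < 5000; i++ {
		buf.add("default/pod-1: node taint not tolerated")
	}
	buf.add("default/pod-2: insufficient memory")
	// A periodic flush would publish one event per distinct message.
	for _, e := range buf.flush() {
		fmt.Printf("%s (x%d)\n", e.key, e.count)
	}
}
```

Compared with dropping duplicates outright, this keeps a count of how many times each failure occurred, at the cost of delaying every event by up to one interval.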



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
