[jira] [Comment Edited] (YUNIKORN-2837) Log & Send Events, Improve logging

Craig Condit (Jira) Wed, 28 Aug 2024 10:34:26 -0700


    [ 
https://issues.apache.org/jira/browse/YUNIKORN-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17877479#comment-17877479
 ]


Craig Condit edited comment on YUNIKORN-2837 at 8/28/24 5:33 PM:
-----------------------------------------------------------------

The {{markPreemption}} flag is purposely only used to mark pods that have been 
asked to preempt. In other words, the termination signal has been sent to the 
shim (and therefore Kubernetes), so the pod *will* terminate at some point and 
should not be considered again for preemption.
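
To make the flag's meaning concrete, here is a minimal sketch in Go. The type and function names ({{allocation}}, {{markPreempted}}, {{victimCandidates}}) are hypothetical and are not YuniKorn's actual API. The only part taken from the comment above is the idea that a pod which has already been asked to terminate is excluded from further preemption.

```go
package main

import "fmt"

// allocation is a hypothetical stand-in for a running allocation (pod).
type allocation struct {
	name      string
	preempted bool // true once the termination signal has been sent
}

// markPreempted records that termination has already been requested.
func (a *allocation) markPreempted() { a.preempted = true }

// victimCandidates returns only the allocations that have not yet been
// asked to terminate, so an already-marked pod is never picked twice.
func victimCandidates(allocs []*allocation) []*allocation {
	var out []*allocation
	for _, a := range allocs {
		if !a.preempted {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	allocs := []*allocation{{name: "pod-a"}, {name: "pod-b"}, {name: "pod-c"}}
	allocs[1].markPreempted() // pod-b will terminate at some point
	for _, a := range victimCandidates(allocs) {
		fmt.Println(a.name)
	}
}
```

In this sketch the flag says nothing about how many attempts were made; it only records that a termination request is already in flight.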
{quote}How many times was preemption attempted for the given request, only to 
return before even reaching the end point? Can we log these attempts in allocationLog?
{quote}
I'm not sure how useful this is from the perspective of the request. The number 
of preemption attempts is simply the number of times we failed to schedule due 
to lack of resources; almost always, at least {{TryPreemption()}} will be called, 
even if it short-circuits out. {{AllocationLog}} is for tracking why a pod was 
unschedulable; lack of resources is the reason. Preemption is an additional 
step that is IMO *extraordinary means* and not specific to any particular 
request. 
{quote}Should we send events when the checkPreemptionQueueGuarantees check fails?
{quote}
Absolutely not. This is going to happen *often* – i.e. whenever a queue is 
encountered that is above or below quota. Sending an event for each of these 
will absolutely DoS the event system.

 
{quote}Improve logging, especially debug statements in general
{quote}
Preemption checks are on the hot path. We should not log (even at DEBUG) anything 
that is repeated on a per-pod basis. Keep in mind that preemption runs for 
*every unschedulable request against every currently-running allocation.* This 
is a *lot* of calls, and logging each one would make even the DEBUG log nearly 
useless.
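
As a rough illustration of that volume, the sketch below multiplies the two counts, assuming one check (and so potentially one log line) per request/allocation pair. The function name and the numbers are hypothetical, not measurements from a YuniKorn cluster.

```go
package main

import "fmt"

// checksPerCycle returns the worst-case number of per-allocation preemption
// checks in one scheduling cycle, assuming every unschedulable request is
// evaluated against every currently-running allocation.
func checksPerCycle(unschedulable, running int) int {
	return unschedulable * running
}

func main() {
	// Hypothetical cluster: 200 pending requests, 5000 running allocations.
	fmt.Println(checksPerCycle(200, 5000))
}
```

With these assumed numbers a single cycle performs one million checks, so a log statement per check would emit on the order of a million lines per cycle.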

 

In short, I think this Jira is ill-conceived, and we should not implement it. 

 



> Log & Send Events, Improve logging
> ----------------------------------
>
>                 Key: YUNIKORN-2837
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2837
>             Project: Apache YuniKorn
>          Issue Type: Sub-task
>            Reporter: Manikandan R
>            Assignee: Manikandan R
>            Priority: Major
>              Labels: pull-request-available
>
> As of now, the {{markPreemption}} flag is set only at the end of the 
> preemption process. There are many other situations in which it would be 
> useful to know the current state of preemption for the given request. For 
> example,
>  # How many times was preemption attempted for the given request, only to 
> return before even reaching the end point? Can we log these attempts in 
> allocationLog?
>  # Should we send events when the checkPreemptionQueueGuarantees check fails?
>  # Improve logging, especially debug statements in general


