Victor Zhou created YUNIKORN-3232:
-------------------------------------

             Summary: Preemption latency regression
                 Key: YUNIKORN-3232
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-3232
             Project: Apache YuniKorn
          Issue Type: Bug
          Components: core - scheduler
            Reporter: Victor Zhou
             Fix For: 1.9.0


*Summary*

Preemption latency regression in 1.8: oversized queue snapshots are cloned 
twice in calculateVictimsByNode.

 

*Description*
After upgrading from 1.5 to 1.8, we observed a major preemption latency 
increase in production-like workloads.


In our environment:
* total queues: 400+
* queues that are actually valid victim candidates per cycle: typically 20–50

Observed preemption latency:
* ~0.5s on 1.5
* 10+ seconds on 1.8

The latency increase becomes more pronounced as the queue count grows.

 

*Root Cause*

The preemption queue snapshot map becomes too large because it includes queues 
that are already within their guaranteed resource limits.
Those queues should not be targeted as victims, so they should not be part of 
the working snapshot set used for victim selection.
calculateVictimsByNode then clones this oversized map twice (once for the 
first pass and once for the second), which amplifies the overhead and causes 
a significant latency regression at scale.

 

*Fix / Improvement*
Prune preemption snapshots so that only the relevant paths are kept:
* the ask queue path and its ancestors
* victim-contributing leaf queues and their ancestors

In addition, remove leaf snapshots early when the queue is within its 
guaranteed limits.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
