Mit Desai created YUNIKORN-3122:
-----------------------------------

             Summary: Optimize Node Evaluation by Pre-filtering Tainted Nodes 
Based on Pod Tolerations
                 Key: YUNIKORN-3122
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-3122
             Project: Apache YuniKorn
          Issue Type: New Feature
            Reporter: Mit Desai
            Assignee: Mit Desai


h3. Summary

Pre-filter nodes in YuniKorn so that nodes whose taints are not tolerated by 
the pod are skipped before predicate evaluation in each scheduling cycle. This 
optimization should reduce scheduling overhead in large clusters that mix 
tainted and untainted nodes.

h3. Background

In large Kubernetes clusters with node taints for workload isolation (e.g., 
different nodepools for security, compliance, or resource requirements), 
YuniKorn currently evaluates all nodes during each scheduling cycle regardless 
of whether the pod being scheduled has the necessary tolerations. This leads to 
significant performance overhead as the scheduler:
 # {*}Evaluates Incompatible Nodes{*}: Processes nodes that will ultimately be 
rejected due to taint mismatches
 # {*}Wastes CPU Cycles{*}: Performs unnecessary predicate checks on unsuitable 
nodes
 # {*}Increases Scheduling Latency{*}: Adds overhead proportional to the number 
of tainted nodes in the cluster

h4. Real-World Impact
 * {*}Cluster Size{*}: 700+ nodes with ~50% having nodepool taints
 * {*}Current Behavior{*}: Scheduler evaluates all 700+ nodes per cycle
 * {*}Optimal Behavior{*}: Should only evaluate the ~350 compatible nodes per 
cycle for a pod that tolerates none of the nodepool taints
 * {*}Performance Gain{*}: Potential ~50% reduction in node evaluation overhead 
for such pods

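YuniKorn's current node evaluation logic, in simplified form (the function and 
field names are illustrative, not actual YuniKorn identifiers):
{code:go}
// Current approach: every node goes through the full predicate chain,
// even when its taints can never be tolerated by the pod.
for _, node := range cluster.Nodes {
    // The predicates include the taint/toleration check.
    if evaluateNodePredicates(node, pod) {
        attemptAllocation(node, pod)
    }
}
{code}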
This approach evaluates nodes that are guaranteed to fail taint/toleration 
checks, wasting computational resources.
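
A minimal sketch of how such a pre-filter could look is given below. It is not YuniKorn code: the {{Taint}}, {{Toleration}} and {{Node}} types are simplified stand-ins for the Kubernetes core/v1 types, and {{tolerates}}, {{isCompatible}} and {{PreFilterNodes}} are hypothetical names chosen for illustration. The matching rules follow the documented Kubernetes toleration semantics, and only the hard effects ({{NoSchedule}}, {{NoExecute}}) exclude a node.

```go
package main

// Taint, Toleration and Node are simplified stand-ins for the Kubernetes
// core/v1 types (assumption: only the fields needed for matching are kept).
type Taint struct {
	Key, Value, Effect string
}

type Toleration struct {
	Key, Operator, Value, Effect string
}

type Node struct {
	Name   string
	Taints []Taint
}

// tolerates reports whether a single toleration matches a single taint.
// An empty Effect matches any effect, an empty Key matches any key, and
// an empty Operator is treated as "Equal".
func tolerates(tol Toleration, taint Taint) bool {
	if tol.Effect != "" && tol.Effect != taint.Effect {
		return false
	}
	if tol.Key != "" && tol.Key != taint.Key {
		return false
	}
	switch tol.Operator {
	case "Exists":
		return true
	case "", "Equal":
		return tol.Value == taint.Value
	default:
		return false
	}
}

// isCompatible reports whether every hard taint on the node is tolerated.
// PreferNoSchedule is a soft preference, so it never excludes a node here.
func isCompatible(node Node, tols []Toleration) bool {
	for _, taint := range node.Taints {
		if taint.Effect != "NoSchedule" && taint.Effect != "NoExecute" {
			continue
		}
		tolerated := false
		for _, tol := range tols {
			if tolerates(tol, taint) {
				tolerated = true
				break
			}
		}
		if !tolerated {
			return false
		}
	}
	return true
}

// PreFilterNodes returns only the nodes whose hard taints are all tolerated,
// preserving the input order. The full predicate chain would then run on
// this smaller slice instead of on every node in the cluster.
func PreFilterNodes(nodes []Node, tols []Toleration) []Node {
	compatible := make([]Node, 0, len(nodes))
	for _, n := range nodes {
		if isCompatible(n, tols) {
			compatible = append(compatible, n)
		}
	}
	return compatible
}
```

With this shape, a pod without tolerations would skip every node carrying a {{NoSchedule}} nodepool taint before any predicate runs. This sketch still scans each node's taints once per pod; whether to cache or index the result is a design choice it leaves open.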


