mitdesai commented on code in PR #1043:
URL: https://github.com/apache/yunikorn-core/pull/1043#discussion_r2482351690


##########
pkg/scheduler/objects/application.go:
##########
@@ -1460,6 +1469,171 @@ func (sa *Application) tryNodesNoReserve(ask *Allocation, iterator NodeIterator,
 
 // Try all the nodes for a request. The resultType is an allocation or reservation of a node.
 // New allocations can only be reserved after a delay.
+func (sa *Application) tryNodesInParallel(ask *Allocation, iterator NodeIterator, tryNodesThreadCount int) *AllocationResult { //nolint:funlen
+       var nodeToReserve *Node
+       scoreReserved := math.Inf(1)
+       allocKey := ask.GetAllocationKey()
+       reserved := sa.reservations[allocKey]
+       var allocResult *AllocationResult
+       var predicateErrors map[string]int
+
+       var mu sync.Mutex
+
+       // Channel to signal completion
+       done := make(chan struct{})
+       defer close(done)
+
+       // Function to process each batch
+       processBatch := func(batch []*Node) {
+               var wg sync.WaitGroup
+               semaphore := make(chan struct{}, tryNodesThreadCount)
+               candidateNodes := make([]*Node, len(batch))
+               errors := make([]error, len(batch))
+
+               for idx, node := range batch {
+                       wg.Add(1)
+                       semaphore <- struct{}{}
+                       go func(idx int, node *Node) {
+                               defer wg.Done()
+                               defer func() { <-semaphore }()
+                               dryRunResult, err := sa.tryNodeDryRun(node, ask)
+
+                               mu.Lock()
+                               defer mu.Unlock()
+                               if err != nil {
+                                       errors[idx] = err
+                               } else if dryRunResult != nil {
+                                       candidateNodes[idx] = node
+                               }
+                       }(idx, node)
+               }

Review Comment:
   Hi @pbacsko, we do not create a goroutine for each node. The parallelism is driven by the configuration option `tryNodesThreadCount`.
   
   We evaluate the nodes in batches sized by the thread count, and we keep the same node ordering YuniKorn already uses, so the least-used nodes are evaluated first.
   
   In our existing setup we use 100 threads for a 700-node cluster. This shortens node evaluation at the cost of extra CPU.
   
   Regarding an unschedulable pod: even without parallelism, every node is tried for it in each scheduling cycle before bailing out. This change only reduces the time spent on node evaluation in that case, provided the thread count is set appropriately. Too many threads means more overhead for managing them; too few means more time spent evaluating nodes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to