pbacsko commented on code in PR #1043:
URL: https://github.com/apache/yunikorn-core/pull/1043#discussion_r2480506675
##########
pkg/scheduler/objects/application.go:
##########
@@ -1460,6 +1469,171 @@ func (sa *Application) tryNodesNoReserve(ask *Allocation, iterator NodeIterator,
// Try all the nodes for a request. The resultType is an allocation or reservation of a node.
// New allocations can only be reserved after a delay.
+func (sa *Application) tryNodesInParallel(ask *Allocation, iterator NodeIterator, tryNodesThreadCount int) *AllocationResult { //nolint:funlen
+ var nodeToReserve *Node
+ scoreReserved := math.Inf(1)
+ allocKey := ask.GetAllocationKey()
+ reserved := sa.reservations[allocKey]
+ var allocResult *AllocationResult
+ var predicateErrors map[string]int
+
+ var mu sync.Mutex
+
+ // Channel to signal completion
+ done := make(chan struct{})
+ defer close(done)
+
+ // Function to process each batch
+ processBatch := func(batch []*Node) {
+ var wg sync.WaitGroup
+ semaphore := make(chan struct{}, tryNodesThreadCount)
+ candidateNodes := make([]*Node, len(batch))
+ errors := make([]error, len(batch))
+
+ for idx, node := range batch {
+ wg.Add(1)
+ semaphore <- struct{}{}
+ go func(idx int, node *Node) {
+ defer wg.Done()
+ defer func() { <-semaphore }()
+ dryRunResult, err := sa.tryNodeDryRun(node, ask)
+
+ mu.Lock()
+ defer mu.Unlock()
+ if err != nil {
+ errors[idx] = err
+ } else if dryRunResult != nil {
+ candidateNodes[idx] = node
+ }
+ }(idx, node)
+ }
Review Comment:
Interesting, but I do have a concern with this approach. What if, in a large cluster (e.g. 5000 nodes), we have unschedulable pods? In that case we'd create 5000 goroutines for a single request in every scheduling cycle. With 10 unschedulable pods, that's 50,000 goroutines per cycle, and with cycles running roughly 10 times per second, around 500k goroutines per second overall.
Goroutines are cheap, but not free.
This might be an extreme case, but we have to think about extremes, even if they're less common.
I'd definitely think about some sort of pooling solution: essentially worker goroutines which are always running and waiting for asks to evaluate. It shouldn't be hard to implement; a rough sketch follows.
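Something along these lines (a minimal sketch only, assuming it lives next to `Node` and `Allocation` in the `objects` package, which already imports `sync`; the `nodeEvalPool` and `evalRequest` names are made up for illustration, not existing code):

```go
// evalRequest asks the pool to evaluate one node for one ask (illustrative type).
type evalRequest struct {
	node   *Node
	ask    *Allocation
	result chan<- *Node // receives the node if it is a viable candidate, nil otherwise
}

// nodeEvalPool owns a fixed set of worker goroutines started once at
// scheduler startup and reused across cycles, instead of spawning one
// goroutine per node per ask.
type nodeEvalPool struct {
	requests chan evalRequest
	wg       sync.WaitGroup
}

func newNodeEvalPool(workers int, tryNode func(*Node, *Allocation) bool) *nodeEvalPool {
	p := &nodeEvalPool{requests: make(chan evalRequest)}
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			// Workers block here between cycles; no goroutines are created
			// or destroyed on the scheduling hot path.
			for req := range p.requests {
				if tryNode(req.node, req.ask) {
					req.result <- req.node
				} else {
					req.result <- nil
				}
			}
		}()
	}
	return p
}

// stop closes the request channel and waits for the workers to drain.
func (p *nodeEvalPool) stop() {
	close(p.requests)
	p.wg.Wait()
}
```

The caller would submit one `evalRequest` per node with a result channel buffered to the batch size, so workers never block on send; the goroutine count then stays constant no matter how many unschedulable asks are retried per cycle.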
Anyway, I do have a simple test case which checks performance in the shim under `pkg/shim/scheduling_perf_test.go`. It's called `BenchmarkSchedulingThroughPut()`. This could be modified to submit unschedulable pods (e.g. ones with a node selector that never matches) to see how it affects performance; a sketch of such a pod is below.
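For the never-matching pods, something like this would do (a sketch only, assuming the test builds pods via `k8s.io/api/core/v1` and `k8s.io/apimachinery/pkg/apis/meta/v1`; the `unschedulablePod` helper and the label key are hypothetical):

```go
// unschedulablePod builds a pod whose nodeSelector no node can satisfy,
// so the scheduler re-evaluates it against every node in every cycle.
func unschedulablePod(name string) *v1.Pod {
	return &v1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: name},
		Spec: v1.PodSpec{
			SchedulerName: "yunikorn", // route the pod to the YuniKorn scheduler
			NodeSelector: map[string]string{
				"no-such-label": "never-matches", // no node carries this label
			},
			Containers: []v1.Container{
				{Name: "sleep", Image: "alpine:latest"},
			},
		},
	}
}
```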