Re: [PR] [YUNIKORN-2790] Only check requested resource type of allocations [yunikorn-core]

via GitHub Wed, 07 Aug 2024 19:49:18 -0700


wilfred-s commented on code in PR #933:
URL: https://github.com/apache/yunikorn-core/pull/933#discussion_r1708475103



##########
pkg/scheduler/objects/queue.go:
##########
@@ -1017,17 +1016,18 @@ func (sq *Queue) IncAllocatedResource(alloc 
*resources.Resource, nodeReported bo
        sq.Lock()
        defer sq.Unlock()
        // all OK update this queue
-       sq.allocatedResource = newAllocated
+       sq.allocatedResource = resources.Add(sq.allocatedResource, alloc)
        sq.updateAllocatedResourceMetrics()
        return nil
 }
 
+// allocatedResFits adds the passed in resource to the allocatedResource of 
the queue and checks if it still fits in the
+// queues' maximum. If the resource fits it returns true otherwise false.
 // small helper method to access sq.maxResource+sq.allocatedResource and avoid 
Clone() call
-func (sq *Queue) allocatedResFits(res *resources.Resource) (bool, 
*resources.Resource) {
+func (sq *Queue) allocatedResFits(res *resources.Resource) bool {
        sq.RLock()
        defer sq.RUnlock()
-       newAllocated := resources.Add(sq.allocatedResource, res)

Review Comment:
   > That means allocatedResource could be larger than maxResource, and in that 
case all allocations can't be fit into that queue even though they don't use 
the resource type.
   
   The change allows a new allocation to be added ONLY if all the resource 
types requested in `allocatedResource` fit in the maximum set. It is not a 
generic option to exceed the maximum. If the queue is already over its maximum 
for a resource type that is NOT requested the allocation will proceed. 
Currently that will be blocked.
   
   > If above description is correct, is the situation (allocatedResource is 
larger than maxResource) acceptable in YK? or it is a temporary state?
   
   No a single allocation can never be larger than the max set on a queue. Even 
the case that a requested resource type would go over is still blocked.
   Examples 1:
   - queue max is 10000 vcore, 5 GPU
   - queue usage is 8000 vcore, 6 GPU (I lowered the configured max for GPU)
   - request is for 1000 vcore
   
   Currently: allocation is blocked (queue is always considered over quota)
   New behaviour: allocation is allowed
   
   Examples 2:
   - queue max is 10000 vcore, 5 GPU
   - queue usage is 8000 vcore, 5 GPU
   - request is for 3000 vcore
   
   Current and new behaviour: allocation is blocked queue would be over quota 
for vcore
   
   Examples 3:
   - root queue max is 10000 vcore, 0 GPU (no nodes with GPUs are registered yet
   - root queue usage is 8000 vcore, 1 GPU (old GPU job from before the kubelet 
restart
   - request is for 1000 vcore
   
   Currently: all allocations in the system are blocked as the root queue is 
over quota
   New behaviour: allocation is allowed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] [YUNIKORN-2790] Only check requested resource type of allocations [yunikorn-core]

Reply via email to