[ 
https://issues.apache.org/jira/browse/YUNIKORN-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815456#comment-17815456
 ] 

Yongjun Zhang commented on YUNIKORN-2030:
-----------------------------------------

I added a bunch of diagnostic logs and finally reproduced the issue. The logs 
helped identify a missing headRoom check, for which I'm providing a PR.


With the two stack traces attached below, I did the following analysis to 
identify the bug and the fix:
 # Based on the code, stackTraceA and stackTraceB are from different 
scheduling cycles, for different asks, with a time gap of 0.001 second.
 # The headRoom passed to stackTraceA is 27511488512, and the allocation 
request in stackTraceA is 23394779136; after this allocation the resulting 
headRoom is 4116709376, which is exactly the headRoom passed to and seen by 
stackTraceB. This confirms the two stack traces are from neighboring 
scheduling cycles.
 # With headRoom 27511488512, the allocation request of 23394779136 in 
stackTraceA can be satisfied; however, with the resulting headRoom of 
4116709376 from stackTraceA, stackTraceB failed with the reproduction symptom.
 # I suspected a headRoom checking bug, because the resource request of 
stackTraceB is 23394779136, which is larger than the resulting headRoom from 
stackTraceA (*4116709376*, as reported by stackTraceB): given that 
*23394779136* is larger than *4116709376*, how could the earlier headRoom 
check pass and this illegal state be entered? But the headRoom checking code 
seems straightforward.
 # This made me suspect that the headRoom check was not done at all for this 
allocation, and I found the place. The PR shows where the headRoom check is 
missing.
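The headRoom arithmetic in the steps above can be sketched as follows. This is a minimal, self-contained illustration only: {{fitIn}} is a hypothetical helper standing in for YuniKorn's actual resource-fit check (the real code lives in the resources package and its API may differ), and the maps carry just the resource types relevant to the traces:

```go
package main

import "fmt"

// fitIn reports whether the ask fits within the headroom for every
// resource type present in the ask. A resource type missing from
// headRoom is treated as 0 (Go's zero value for a map lookup).
// Hypothetical helper for illustration only.
func fitIn(headRoom, ask map[string]int64) bool {
	for name, quantity := range ask {
		if quantity > headRoom[name] {
			return false
		}
	}
	return true
}

func main() {
	// Values taken from the two stack traces below.
	ask := map[string]int64{"memory": 23394779136, "pods": 1, "vcore": 2000}
	headRoomA := map[string]int64{"memory": 27511488512, "pods": 48682, "vcore": 109976}

	// stackTraceA: the ask fits the headroom of its cycle.
	fmt.Println(fitIn(headRoomA, ask)) // true

	// After A's allocation, memory headroom drops to
	// 27511488512 - 23394779136 = 4116709376.
	headRoomB := map[string]int64{
		"memory": headRoomA["memory"] - ask["memory"],
		"pods":   48681,
		"vcore":  107976,
	}

	// stackTraceB: a same-sized ask no longer fits, so a correct
	// headroom check must reject it before the queue is updated.
	fmt.Println(fitIn(headRoomB, ask)) // false
}
```

If the check is skipped on one code path, the allocation in stackTraceB goes through and the queue ends up over its maximum, which is exactly the reproduced symptom.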

 

stackTraceA:
{code:java}
3845 2024-02-03T04:48:58.833Z        WARN    objects/queue.go:1070   Potential 
allocation causing inconsistent state A       {"allockey": 
"d622ab3f-d9ed-44ec-892d-329e8a6f6963", "application": 
"a28b08df-galaxypinj-runner-pin-manual--20-1", "headRoom": 
{"attachable-volumes-aws-ebs":23425,"ephemeral-storage":55532238435958,"hugepages-1Gi":0,"hugepages-2Mi":0,"hugepages-32Mi":0,"hugepages-64Ki":0,"memory":27511488512,"pods":48682,"vcore":109976},
 "leafQueue": "root.org-core.homefeed-infra-workflow.tier2", 
"allocationRequest": "map[memory:23394779136 pods:1 vcore:2000]", "queueUsage": 
"map[memory:3301213995008 pods:142 vcore:284000]", "maxResource": 
"map[memory:3328725483520 vcore:393976]", "memHeadroom": 4116709376, 
"vcoreHeadroom": 107976, "rootQueueUsage": "map[memory:304276817575936 
pods:9015 vcore:21488000]", "rootMaxResource": 
"map[attachable-volumes-aws-ebs:23425 ephemeral-storage:55532238435958 
hugepages-1Gi:0 hugepages-2Mi:0 hugepages-32Mi:0 hugepages-64Ki:0 
memory:315878420541440 pods:57696 vcore:38325770]"}
 3846 goroutine 96 [running]:
 3847 runtime/debug.Stack()
 3848         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/.go/src/runtime/debug/stack.go:24
 +0x64
 3849 runtime/debug.PrintStack()
 3850         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/.go/src/runtime/debug/stack.go:16
 +0x1c
 3851 
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Queue).IncAllocatedResourceDebug(0x402ce5be00,
 0x40d960ab40, 0x40a1f3cee0, 0x40d960ab50, 0x0)
 3852         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/objects/queue.go:1084
 +0xec0
 3853 
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryNodeDebug(0x40265a2f00,
 0x40d8756ba0?, 0x4043258f20, 0x40a1f3cee0)
 3854         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/objects/application.go:1677
 +0x124
 3855 
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryReservedAllocate(0x40265a2f00,
 0x40d960ab40, 0x4033155d78)
 3856         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/objects/application.go:1295
 +0x1f4
 3857 
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Queue).TryReservedAllocate(0x402ce5be00,
 0x4016db2fc0?)
 3858         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/objects/queue.go:1471
 +0x304
 3859 
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Queue).TryReservedAllocate(0x402ce5bc80,
 0x4016db2fc0?)
 3860         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/objects/queue.go:1490
 +0xd0
 3861 
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Queue).TryReservedAllocate(0x402ce5a300,
 0x24820?)
 3862         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/objects/queue.go:1490
 +0xd0
 3863 
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Queue).TryReservedAllocate(0x402a903980,
 0x0?)
 3864         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/objects/queue.go:1490
 +0xd0
 3865 
github.com/apache/yunikorn-core/pkg/scheduler.(*PartitionContext).tryReservedAllocate(0x402cc0a9c0)
 3866         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/partition.go:814
 +0x90
 3867 
github.com/apache/yunikorn-core/pkg/scheduler.(*ClusterContext).schedule(0x4033155f80?)
 3868         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/context.go:131
 +0xf4
 3869 
github.com/apache/yunikorn-core/pkg/scheduler.(*Scheduler).internalSchedule(0x40003b5020)
 3870         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/scheduler.go:88
 +0xd4
 3871 created by 
github.com/apache/yunikorn-core/pkg/scheduler.(*Scheduler).StartService
 3872         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/scheduler.go:67
 +0x21c
 3873 2024-02-03T04:48:58.834Z        INFO    objects/application.go:865      
reservation removed successfully        {"node": 
"ip-10-181-118-197.ec2.internal", "app": "test-app", "ask": 
"d622ab3f-d9ed-44ec-892d-329e8a6f6963"}
 3874 2024-02-03T04:48:58.834Z        INFO    scheduler/partition.go:945      
allocation ask is unreserved    {"appID": "test-app", "queue": "test-queue", 
"allocationKey": "d622ab3f-d9ed-44ec-892d-329e8a6f6963", "node": 
"ip-10-181-118-197.ec2.internal", "reservationsRemoved": 1}
 3875 2024-02-03T04:48:58.834Z        INFO    scheduler/partition.go:891      
scheduler allocation processed  {"appID": "test-app", "allocationKey": 
"d622ab3f-d9ed-44ec-892d-329e8a6f6963", "uuid": 
"597101e7-dfd7-44f7-86ed-5e694c6244b7", "allocatedResource": 
"map[memory:23394779136 pods:1 vcore:2000]", "placeholder": false, 
"targetNode": "ip-10-181-118-197.ec2.internal"}
  {code}
Notice the *scheduler allocation processed* message at the end.

stackTraceB:
{code:java}
3879 2024-02-03T04:48:58.835Z        WARN    objects/application.go:1679     
queue update failed unexpectedly        {"application": "test-app", "ask-app": 
"a28b08df-galaxypinj-runner-pin-manual--20-1", "ask-id": 
"b604da5a-7d54-418b-8fff-b0682f9fb209", "error": "allocation 
(map[memory:23394779136 pods:1 vcore:2000]) puts queue 'test-queue' over 
maximum allocation (map[memory:3328725483520 vcore:393976]), current 
usage (map[memory:3324608774144 pods:143 vcore:286000]) passed headRoom 
(map[attachable-volumes-aws-ebs:23425 ephemeral-storage:55532238435958 
hugepages-1Gi:0 hugepages-2Mi:0 hugepages-32Mi:0 hugepages-64Ki:0   
memory:4116709376 pods:48681 vcore:107976])"}
 3880 goroutine 96 [running]:
 3881 runtime/debug.Stack()
 3882         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/.go/src/runtime/debug/stack.go:24
 +0x64
 3883 runtime/debug.PrintStack()
 3884         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/.go/src/runtime/debug/stack.go:16
 +0x1c
 3885 
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryNodeDebug(0x40265a2f00,
 0x14fd9e0?, 0x404374d080, 0x40c57c4000)
 3886         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/objects/application.go:1684
 +0x3e4
 3887 
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryNodesNoReserveDebug(0x4033155060?,
 0x40925ff050?, 0x40c57c4000, {0x19c8370, 0x40842a6c40}, {0x4003ca7f20, 0x1e})
 3888         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/objects/application.go:1421
 +0xe0
 3889 
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryReservedAllocate(0x40265a2f00,
 0x40d960adf0, 0x4033155d78)
 3890         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/objects/application.go:1312
 +0x3a4
 3891 
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Queue).TryReservedAllocate(0x402ce5be00,
 0x4016db2fc0?)
 3892         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/objects/queue.go:1471
 +0x304
 3893 
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Queue).TryReservedAllocate(0x402ce5bc80,
 0x4016db2fc0?)
 3894         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/objects/queue.go:1490
 +0xd0
 3895 
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Queue).TryReservedAllocate(0x402ce5a300,
 0x24820?)
 3896         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/objects/queue.go:1490
 +0xd0
 3897 
github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Queue).TryReservedAllocate(0x402a903980,
 0x0?)
 3898         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/objects/queue.go:1490
 +0xd0
 3899 
github.com/apache/yunikorn-core/pkg/scheduler.(*PartitionContext).tryReservedAllocate(0x402cc0a9c0)
 3900         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/partition.go:814
 +0x90
 3901 
github.com/apache/yunikorn-core/pkg/scheduler.(*ClusterContext).schedule(0x4033155f80?)
 3902         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/context.go:131
 +0xf4
 3903 
github.com/apache/yunikorn-core/pkg/scheduler.(*Scheduler).internalSchedule(0x40003b5020)
 3904         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/scheduler.go:88
 +0xd4
 3905 created by 
github.com/apache/yunikorn-core/pkg/scheduler.(*Scheduler).StartService
 3906         
/mnt/jenkins-data/workspace/build-yunikorn-docker-image-worker/yunikorn-core/pkg/scheduler/scheduler.go:67
 +0x21c
  {code}

> Need to check headroom when trying other nodes for reserved allocations
> -----------------------------------------------------------------------
>
>                 Key: YUNIKORN-2030
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2030
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>            Priority: Major
>
> As reported in YUNIKORN-1996, we are seeing many messages like below from 
> time to time:
> {code:java}
>  WARN    objects/application.go:1504     queue update failed unexpectedly     
>    {"error": "allocation (map[memory:37580963840 pods:1 vcore:2000]) puts 
> queue 'root.test-queue' over maximum allocation (map[memory:3300011278336 
> vcore:390584]), current usage (map[memory:3291983380480 pods:91 
> vcore:186000])"}{code}
> Restarting Yunikorn helps stop it. Creating this Jira to investigate 
> why it happened, because it's not supposed to happen: we check that there is 
> enough resource headroom before calling 
>  
> {code:java}
> func (sa *Application) tryNode(node *Node, ask *AllocationAsk) *Allocation 
> {code}
> which printed the above message, and we only call it when there is enough 
> headroom.
> There may be a bug in headroom checking.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
