zhuqi-lucas opened a new pull request, #582: URL: https://github.com/apache/yunikorn-k8shim/pull/582
### What is this PR for?

Under some circumstances, placeholder allocations appear to be removed multiple times:

```
2023-04-25T06:25:46.279Z INFO scheduler/partition.go:1233 replacing placeholder allocation {"appID": "spark-000000031tn2lgv2gar", "allocationId": "20a4cf77-7095-4635-b9e9-43a7564385c4"}
...
2023-04-25T06:25:46.299Z INFO scheduler/partition.go:1233 replacing placeholder allocation {"appID": "spark-000000031tn2lgv2gar", "allocationId": "20a4cf77-7095-4635-b9e9-43a7564385c4"}
```

This message appears only once in the codebase, in PartitionContext.removeAllocation(), and it is guarded by a test for release.TerminationType == si.TerminationType_PLACEHOLDER_REPLACED. Seeing it twice for the same allocation ID therefore suggests that removeAllocation() is somehow being called twice. I believe this would cause the used resources on the node to be subtracted twice for the same allocation, which quickly results in health checks failing:

```
2023-04-25T06:26:10.632Z WARN scheduler/health_checker.go:176 Scheduler is not healthy {"health check values": [..., {"Name":"Consistency of data","Succeeded":false,"Description":"Check if node total resource = allocated resource + occupied resource + available resource","DiagnosisMessage":"Nodes with inconsistent data: [\"ip-10-0-112-148.eu-central-1.compute.internal\"]"}, ...]}
```

### What type of PR is it?
* [ ] - Bug Fix
* [ ] - Improvement
* [ ] - Feature
* [ ] - Documentation
* [ ] - Hot Fix
* [ ] - Refactoring

### Todos
* [ ] - Task

### What is the Jira issue?
* Open an issue on Jira https://issues.apache.org/jira/browse/YUNIKORN/
* Put link here, and add [YUNIKORN-*Jira number*] in PR title, e.g. `[YUNIKORN-2] Gang scheduling interface parameters`

### How should this be tested?

### Screenshots (if appropriate)

### Questions:
* [ ] - The license files need an update.
* [ ] - There are breaking changes for older versions.
* [ ] - It needs documentation.

--
This is an automated message from the Apache Git Service.
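The double-subtraction concern raised in the PR description can be illustrated with a small, self-contained Go sketch. It is not YuniKorn code: `Node`, `AddAllocation`, `RemoveUnguarded`, and `RemoveGuarded` are hypothetical names, and resources are reduced to a single integer. The sketch only shows why releasing the same allocation twice drives the used total out of sync, and how checking that the allocation is still tracked makes a repeated release harmless. Whether the real node accounting behaves like the unguarded variant is what the PR suspects, not something this example confirms.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for a scheduler node. It tracks one scalar
// "used" value instead of a full resource map.
type Node struct {
	used        int
	allocations map[string]int // allocation ID -> size
}

// NewNode returns an empty node with no allocations.
func NewNode() *Node {
	return &Node{allocations: make(map[string]int)}
}

// AddAllocation records an allocation and adds its size to the used total.
func (n *Node) AddAllocation(id string, size int) {
	n.allocations[id] = size
	n.used += size
}

// RemoveUnguarded subtracts size on every call, even when the allocation
// has already been released. A second call corrupts the used total.
func (n *Node) RemoveUnguarded(id string, size int) {
	delete(n.allocations, id)
	n.used -= size
}

// RemoveGuarded subtracts only if the allocation is still tracked, so a
// repeated release is a no-op. It reports whether anything was removed.
func (n *Node) RemoveGuarded(id string) bool {
	size, ok := n.allocations[id]
	if !ok {
		return false
	}
	delete(n.allocations, id)
	n.used -= size
	return true
}

func main() {
	// Allocation ID taken from the log lines in the PR description.
	const id = "20a4cf77-7095-4635-b9e9-43a7564385c4"

	a := NewNode()
	a.AddAllocation(id, 4)
	a.RemoveUnguarded(id, 4)
	a.RemoveUnguarded(id, 4) // second release of the same allocation
	fmt.Println("unguarded used:", a.used) // prints "unguarded used: -4"

	b := NewNode()
	b.AddAllocation(id, 4)
	b.RemoveGuarded(id)
	b.RemoveGuarded(id) // second release is ignored
	fmt.Println("guarded used:", b.used) // prints "guarded used: 0"
}
```

In the unguarded case the node ends up with a negative used total, which is the kind of inconsistency a check such as "total = allocated + occupied + available" would flag.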