Jie Ke created YUNIKORN-1615:
--------------------------------
Summary: Node used resource is negative
Key: YUNIKORN-1615
URL: https://issues.apache.org/jira/browse/YUNIKORN-1615
Project: Apache YuniKorn
Issue Type: Bug
Components: core - scheduler
Affects Versions: 1.2.0, 1.1.0
Environment: Kubernetes 1.20.8
Reporter: Jie Ke
After some tasks complete, the YuniKorn scheduler reports negative used
resources for nodes, which throws scheduling into chaos. I tried restarting
the scheduler, but it eventually reports negative resources again once some
more tasks complete. I found the following in the YuniKorn scheduler log:
{code:java}
2023-03-01T18:10:40.038Z INFO cache/nodes.go:140 report occupied
resources updates {"node": "172.18.45.234", "request":
{"nodes":[{"nodeID":"172.18.45.234","action":2,"attributes":{"ready":"true"},"schedulableResource":{"resources":{"ephemeral-storage":{"value":520021631754},"hugepages-1Gi":{},"hugepages-2Mi":{},"memory":{"value":201131376640},"pods":{"value":110},"vcore":{"value":40000}}},"occupiedResource":{"resources":{"memory":{"value":-10126244160},"vcore":{"value":-9700}}}}],"rmID":"k8s_dios"}}
2023-03-01T18:10:44.635Z INFO cache/nodes.go:140 report occupied
resources updates {"node": "172.18.45.228", "request":
{"nodes":[{"nodeID":"172.18.45.228","action":2,"attributes":{"ready":"true"},"schedulableResource":{"resources":{"ephemeral-storage":{"value":249175645796},"hugepages-1Gi":{},"hugepages-2Mi":{},"memory":{"value":269682475008},"pods":{"value":110},"vcore":{"value":40000}}},"occupiedResource":{"resources":{"memory":{"value":-10314987840},"vcore":{"value":-9400}}}}],"rmID":"k8s_dios"}}
2023-03-01T18:10:44.870Z INFO cache/nodes.go:140 report occupied
resources updates {"node": "172.18.45.230", "request":
{"nodes":[{"nodeID":"172.18.45.230","action":2,"attributes":{"ready":"true"},"schedulableResource":{"resources":{"ephemeral-storage":{"value":249175645796},"hugepages-1Gi":{},"hugepages-2Mi":{},"memory":{"value":269682475008},"pods":{"value":110},"vcore":{"value":40000}}},"occupiedResource":{"resources":{"memory":{"value":-8829204224},"vcore":{"value":-8500}}}}],"rmID":"k8s_dios"}}
2023-03-01T18:10:49.279Z INFO cache/nodes.go:140 report occupied
resources updates {"node": "172.18.45.235", "request":
{"nodes":[{"nodeID":"172.18.45.235","action":2,"attributes":{"ready":"true"},"schedulableResource":{"resources":{"ephemeral-storage":{"value":520021631754},"hugepages-1Gi":{},"hugepages-2Mi":{},"memory":{"value":201131372544},"pods":{"value":110},"vcore":{"value":40000}}},"occupiedResource":{"resources":{"memory":{"value":-8504048512},"vcore":{"value":-7800}}}}],"rmID":"k8s_dios"}}
2023-03-01T18:15:42.686Z INFO cache/nodes.go:140 report occupied
resources updates {"node": "172.18.45.230", "request":
{"nodes":[{"nodeID":"172.18.45.230","action":2,"attributes":{"ready":"true"},"schedulableResource":{"resources":{"ephemeral-storage":{"value":249175645796},"hugepages-1Gi":{},"hugepages-2Mi":{},"memory":{"value":269682475008},"pods":{"value":110},"vcore":{"value":40000}}},"occupiedResource":{"resources":{"memory":{"value":-9902946048},"vcore":{"value":-9500}}}}],"rmID":"k8s_dios"}}
2023-03-01T18:15:43.857Z INFO cache/nodes.go:140 report occupied
resources updates {"node": "172.18.45.234", "request":
{"nodes":[{"nodeID":"172.18.45.234","action":2,"attributes":{"ready":"true"},"schedulableResource":{"resources":{"ephemeral-storage":{"value":520021631754},"hugepages-1Gi":{},"hugepages-2Mi":{},"memory":{"value":201131376640},"pods":{"value":110},"vcore":{"value":40000}}},"occupiedResource":{"resources":{"memory":{"value":-11199985984},"vcore":{"value":-10700}}}}],"rmID":"k8s_dios"}}
2023-03-01T18:15:49.229Z INFO cache/nodes.go:140 report occupied
resources updates {"node": "172.18.45.235", "request":
{"nodes":[{"nodeID":"172.18.45.235","action":2,"attributes":{"ready":"true"},"schedulableResource":{"resources":{"ephemeral-storage":{"value":520021631754},"hugepages-1Gi":{},"hugepages-2Mi":{},"memory":{"value":201131372544},"pods":{"value":110},"vcore":{"value":40000}}},"occupiedResource":{"resources":{"memory":{"value":-9577790336},"vcore":{"value":-8800}}}}],"rmID":"k8s_dios"}}
2023-03-01T18:15:54.457Z INFO cache/nodes.go:140 report occupied
resources updates {"node": "172.18.45.228", "request":
{"nodes":[{"nodeID":"172.18.45.228","action":2,"attributes":{"ready":"true"},"schedulableResource":{"resources":{"ephemeral-storage":{"value":249175645796},"hugepages-1Gi":{},"hugepages-2Mi":{},"memory":{"value":269682475008},"pods":{"value":110},"vcore":{"value":40000}}},"occupiedResource":{"resources":{"memory":{"value":-11388729664},"vcore":{"value":-10400}}}}],"rmID":"k8s_dios"}}{code}
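To spot affected nodes quickly, the "request" payload from these log lines can be scanned for negative occupied values. The following is only a minimal sketch: the struct and function names (nodeRequest, nodeInfo, negativeOccupied) are hypothetical and are not YuniKorn's own types; only the JSON field names are taken from the log output above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Hypothetical types mirroring only the fields of the logged payload
// that matter here; they are NOT the scheduler's real data structures.
type quantity struct {
	Value int64 `json:"value"`
}

type resource struct {
	Resources map[string]quantity `json:"resources"`
}

type nodeInfo struct {
	NodeID           string   `json:"nodeID"`
	OccupiedResource resource `json:"occupiedResource"`
}

type nodeRequest struct {
	Nodes []nodeInfo `json:"nodes"`
}

// negativeOccupied parses one "request" JSON payload and returns a sorted
// list of "nodeID/resource=value" entries for every occupied resource
// whose value is below zero.
func negativeOccupied(payload string) ([]string, error) {
	var req nodeRequest
	if err := json.Unmarshal([]byte(payload), &req); err != nil {
		return nil, err
	}
	var found []string
	for _, n := range req.Nodes {
		for name, q := range n.OccupiedResource.Resources {
			if q.Value < 0 {
				found = append(found, fmt.Sprintf("%s/%s=%d", n.NodeID, name, q.Value))
			}
		}
	}
	// Map iteration order is random in Go, so sort for stable output.
	sort.Strings(found)
	return found, nil
}

func main() {
	// Shortened copy of the first payload from the log above.
	payload := `{"nodes":[{"nodeID":"172.18.45.234","action":2,` +
		`"occupiedResource":{"resources":{"memory":{"value":-10126244160},` +
		`"vcore":{"value":-9700}}}}],"rmID":"k8s_dios"}`

	entries, err := negativeOccupied(payload)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, e := range entries {
		fmt.Println(e)
	}
}
```

Fields that the sketch does not declare (for example "schedulableResource" and "rmID") are ignored by encoding/json, so the full payloads from the log can be passed in unchanged.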
Kubernetes version
Server Version: version.Info{Major:"1", Minor:"20",
GitVersion:"v1.20.8", GitCommit:"5575935422cc1cf5169dfc8847cb587aa47bac5a",
GitTreeState:"clean", BuildDate:"2021-06-16T12:53:07Z", GoVersion:"go1.15.13",
Compiler:"gc", Platform:"linux/amd64"}