Weiwei Yang created YUNIKORN-1070:
-------------------------------------
Summary: Potential scheduler memory leak
Key: YUNIKORN-1070
URL: https://issues.apache.org/jira/browse/YUNIKORN-1070
Project: Apache YuniKorn
Issue Type: Bug
Reporter: Weiwei Yang
Ben mentioned this on Slack: he runs 0.12.2 on EKS and sees the scheduler
periodically get OOM-killed after a few days. The scheduler is currently
configured with 10GB of memory and always eventually runs out. In his
environment, many nodes join and leave the cluster due to autoscaling, and he
wonders whether that could be the cause or whether anyone has other ideas. He
asked what kind of troubleshooting information would be useful; so far the
only symptom is continuous growth in memory consumption that ends with
OOMKilled.