[
https://issues.apache.org/jira/browse/YUNIKORN-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17799194#comment-17799194
]
Weiwei Yang commented on YUNIKORN-2280:
---------------------------------------
Hi [~ccondit] I think another angle on this problem is that we need to review which
API calls are behind this.
E.g. when we send events to k8s, we do rate limiting in some places, otherwise it
may be too overwhelming. I am unsure whether this is related to that; maybe
somewhere we send too many events?
> Possible memory leak in scheduler
> ---------------------------------
>
> Key: YUNIKORN-2280
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2280
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: core - scheduler
> Affects Versions: 1.3.0, 1.4.0
> Environment: EKS 1.24; we observed the same behavior with YK 1.3.0 & 1.4.0
> Reporter: Timothy Potter
> Priority: Major
> Attachments: goroutine-dump.out, heap-dump-1001.out,
> heap-dump-1255.out, yunikor-scheduler-process-memory.png,
> yunikorn-process-memory-last9hours.png, yunikorn-scheduler-goroutines.png
>
>
> Memory for our scheduler pod slowly increases until it gets killed by kubelet
> for surpassing its memory limit.
> I've included two heap dump files collected about 3 hours apart, see process
> memory chart for the same period. Not really sure what to make of these heap
> dumps so hoping someone else who knows the code better might have some
> insights?
> from heap-dump-1001.out:
> {code}
>      flat  flat%   sum%        cum   cum%
>    1.46GB 24.68% 24.68%     1.46GB 24.68%  reflect.unsafe_NewArray
>    1.30GB 21.94% 46.63%     1.32GB 22.35%  sigs.k8s.io/json/internal/golang/encoding/json.(*decodeState).literalStore
>    1.06GB 17.96% 64.58%     1.06GB 17.96%  k8s.io/apimachinery/pkg/apis/meta/v1.(*FieldsV1).UnmarshalJSON
>    0.88GB 14.87% 79.45%     0.88GB 14.87%  reflect.mapassign_faststr0
> {code}
> from heap-dump-1255.out:
> {code}
>       flat  flat%   sum%        cum   cum%
>  1756.18MB 23.53% 23.53%  1756.18MB 23.53%  reflect.unsafe_NewArray
>  1612.36MB 21.60% 45.13%  1645.86MB 22.05%  sigs.k8s.io/json/internal/golang/encoding/json.(*decodeState).literalStore
>  1359.86MB 18.22% 63.35%  1359.86MB 18.22%  k8s.io/apimachinery/pkg/apis/meta/v1.(*FieldsV1).UnmarshalJSON
>  1136.40MB 15.22% 78.57%  1136.40MB 15.22%  reflect.mapassign_faststr0
> {code}
> We also see odd spikes in the number of goroutines, but those don't seem
> correlated with the increase in memory (mainly mentioning this in case
> it's unexpected).
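The tables in the quoted description look like `go tool pprof -top` output; as a sketch, the heap dumps attached to the issue could be inspected and diffed like this (assuming the attachment file names, which are not part of this message):

```shell
# Top allocators by in-use space in each attached heap profile
go tool pprof -top -inuse_space heap-dump-1001.out
go tool pprof -top -inuse_space heap-dump-1255.out

# Diff the two profiles to see which allocation sites grew in the ~3h between dumps
go tool pprof -top -base heap-dump-1001.out heap-dump-1255.out
```

The `-base` diff is usually the quickest way to separate steady-state allocations from a genuine leak.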
--
This message was sent by Atlassian Jira
(v8.20.10#820010)