[ https://issues.apache.org/jira/browse/YUNIKORN-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17799238#comment-17799238 ]
Craig Condit commented on YUNIKORN-2280:
----------------------------------------

The number of events emitted is directly proportional to the number of pods we process. “Rate-limiting” them would only result in events not being sent, or not being sent in a timely manner. If a cluster is busy enough to process a given number of pods for scheduling, then it must also be configured to support the events that traffic generates, full stop.

> Possible memory leak in scheduler
> ---------------------------------
>
>                 Key: YUNIKORN-2280
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2280
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler
>    Affects Versions: 1.3.0, 1.4.0
>         Environment: EKS 1.24; we observed the same behavior with YuniKorn 1.3.0 and 1.4.0
>            Reporter: Timothy Potter
>            Priority: Major
>         Attachments: goroutine-dump.out, heap-dump-1001.out, heap-dump-1255.out, yunikor-scheduler-process-memory.png, yunikorn-process-memory-last9hours.png, yunikorn-scheduler-goroutines.png
>
>
> Memory for our scheduler pod slowly increases until it is killed by the kubelet for exceeding its memory limit.
> I've included two heap dump files collected about 3 hours apart; see the process memory chart for the same period. I'm not really sure what to make of these heap dumps, so I'm hoping someone who knows the code better might have some insights.
> From heap-dump-1001.out:
> {code}
>       flat  flat%   sum%        cum   cum%
>     1.46GB 24.68% 24.68%     1.46GB 24.68%  reflect.unsafe_NewArray
>     1.30GB 21.94% 46.63%     1.32GB 22.35%  sigs.k8s.io/json/internal/golang/encoding/json.(*decodeState).literalStore
>     1.06GB 17.96% 64.58%     1.06GB 17.96%  k8s.io/apimachinery/pkg/apis/meta/v1.(*FieldsV1).UnmarshalJSON
>     0.88GB 14.87% 79.45%     0.88GB 14.87%  reflect.mapassign_faststr0
> {code}
> From heap-dump-1255.out:
> {code}
>       flat  flat%   sum%        cum   cum%
>  1756.18MB 23.53% 23.53%  1756.18MB 23.53%  reflect.unsafe_NewArray
>  1612.36MB 21.60% 45.13%  1645.86MB 22.05%  sigs.k8s.io/json/internal/golang/encoding/json.(*decodeState).literalStore
>  1359.86MB 18.22% 63.35%  1359.86MB 18.22%  k8s.io/apimachinery/pkg/apis/meta/v1.(*FieldsV1).UnmarshalJSON
>  1136.40MB 15.22% 78.57%  1136.40MB 15.22%  reflect.mapassign_faststr0
> {code}
> We also see odd spikes in the number of goroutines, but that doesn't seem correlated with the increase in memory (mainly mentioning this in case it's unexpected).
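
For anyone who wants to reproduce this analysis, here is a minimal sketch of how heap snapshots like the attached ones can be captured and compared. It assumes the process exposes (or can be patched to expose) Go's standard net/http/pprof endpoints; the port and file names are illustrative choices, not YuniKorn's actual configuration.

{code}
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on the default mux
)

func main() {
	// Hypothetical standalone profiling listener; port 6060 is an
	// illustrative choice, not a YuniKorn default.
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()
	select {} // stand-in for the scheduler's real main loop
}
{code}

With two snapshots taken a few hours apart (e.g. {{curl -o heap-dump-1001.out localhost:6060/debug/pprof/heap}}), running {{go tool pprof -diff_base heap-dump-1001.out heap-dump-1255.out}} reports only the growth between the two profiles, which makes a steady leak easier to separate from steady-state allocations; {{go tool pprof localhost:6060/debug/pprof/goroutine}} gives the corresponding goroutine view.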