[ 
https://issues.apache.org/jira/browse/FLINK-35192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17841129#comment-17841129
 ] 

Biao Geng commented on FLINK-35192:
-----------------------------------

 !screenshot-3.png! 
According to the Flink Kubernetes operator's code, deleteOnExit() is called when 
creating config files or pod template files. This could lead to a memory leak if 
the operator pod runs for a long time, because every registered path is held in a 
JVM-wide set until shutdown. Since the operator's FlinkConfigManager 
implementation already cleans up these temp files/dirs itself, maybe we can 
safely remove the deleteOnExit() usage? cc [~gyfora]
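To illustrate the point (a minimal sketch, not the operator's actual code; the class and method names here are hypothetical): File.deleteOnExit() records the path in the JVM's internal java.io.DeleteOnExitHook, and there is no public API to unregister it, so a long-lived process that keeps creating temp files accumulates entries even after the files themselves are deleted. Explicit deletion avoids that:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class TempFileCleanup {
    // Safer pattern for a long-lived process: create the temp file,
    // use it, and delete it explicitly when it is no longer needed.
    static boolean writeAndCleanUp() throws IOException {
        File conf = File.createTempFile("flink-conf", ".yaml");
        try {
            // ... write config, hand the path to the Flink client ...
        } finally {
            Files.deleteIfExists(conf.toPath());
        }
        return Files.exists(conf.toPath()); // false: nothing lingers
    }

    public static void main(String[] args) throws IOException {
        // Leaky pattern: deleteOnExit() stores the path in a static,
        // JVM-wide set that is only drained at shutdown. Even deleting
        // the file now does not remove the hook entry, so a pod that
        // loops over reconciliations keeps growing that set.
        File leaky = File.createTempFile("flink-conf", ".yaml");
        leaky.deleteOnExit();
        leaky.delete(); // file gone, but the hook entry remains

        System.out.println("file still exists: " + writeAndCleanUp());
    }
}
```

The leaked entries are small strings, but over weeks of uptime they are one plausible contributor to slow native/heap growth of the kind shown in the attached RSS graph.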

Also, from the attached yaml, it looks like a custom Flink Kubernetes operator 
image (gdc-flink-kubernetes-operator:1.6.1-GDC1.0.2) is used. [~stupid_pig], 
would you mind checking whether your code calls methods like deleteOnExit() in 
any customized changes you made to the operator?

> operator oom
> ------------
>
>                 Key: FLINK-35192
>                 URL: https://issues.apache.org/jira/browse/FLINK-35192
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.6.1
>         Environment: jdk: openjdk11
> operator version: 1.6.1
>            Reporter: chenyuzhi
>            Priority: Major
>         Attachments: image-2024-04-22-15-47-49-455.png, 
> image-2024-04-22-15-52-51-600.png, image-2024-04-22-15-58-23-269.png, 
> image-2024-04-22-15-58-42-850.png, screenshot-1.png, screenshot-2.png, 
> screenshot-3.png
>
>
> The kubernetes operator docker process was killed by the kernel due to out of 
> memory (at 2024-04-03 18:16):
>  !image-2024-04-22-15-47-49-455.png! 
> Metrics:
> the pod memory (RSS) is increasing slowly in the past 7 days:
>  !screenshot-1.png! 
> However, the JVM memory metrics of the operator do not show an obvious anomaly:
>  !image-2024-04-22-15-58-23-269.png! 
>  !image-2024-04-22-15-58-42-850.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
