[jira] [Commented] (FLINK-22594) Report oom killed pod in k8s resource manager

Aitozi (Jira) Fri, 07 May 2021 05:10:14 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-22594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340766#comment-17340766
 ]


Aitozi commented on FLINK-22594:
--------------------------------

If a container is OOM-killed because it ran out of memory, its container status 
exposes the OOMKilled reason [1], which can be caught in the k8s watcher. This 
can guide users in correcting the worker's memory configuration. We can make 
the improvement by:

1) adding a handler that catches the related message and exposes it as a metric

2) supporting automatic expansion of the memory of the next requested pod by 
some ratio, so that the case can handle itself
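
As a rough illustration of the two steps above, here is a minimal sketch in 
plain Java. It assumes the watcher has already extracted the terminated reason 
string from the container status; the class OomKilledHandler, its methods, the 
ratio and the memory cap are all hypothetical names and parameters, not 
existing Flink or Kubernetes client APIs. Only the "OOMKilled" reason value 
comes from Kubernetes itself.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: counts OOM kills and suggests memory for the next pod. */
class OomKilledHandler {

    /** Reason Kubernetes reports in the terminated state of an OOM-killed container. */
    static final String OOM_KILLED_REASON = "OOMKilled";

    // Stand-in for a real metric counter (assumption: a metric would be registered elsewhere).
    private final AtomicLong oomKilledCount = new AtomicLong();
    private final double expandRatio;
    private final long maxMemoryMb;

    OomKilledHandler(double expandRatio, long maxMemoryMb) {
        if (expandRatio < 1.0) {
            throw new IllegalArgumentException("expandRatio must be >= 1.0");
        }
        this.expandRatio = expandRatio;
        this.maxMemoryMb = maxMemoryMb;
    }

    /**
     * Called with the terminated reason of a container and the memory it was given.
     * Returns the memory (in MB) to request for the next pod.
     */
    long onContainerTerminated(String terminatedReason, long currentMemoryMb) {
        if (!OOM_KILLED_REASON.equals(terminatedReason)) {
            // Not an OOM kill: keep the current memory size.
            return currentMemoryMb;
        }
        // Step 1: record the OOM kill so it can be exposed as a metric.
        oomKilledCount.incrementAndGet();
        // Step 2: expand the memory by the ratio, capped at the configured maximum.
        long expanded = (long) Math.ceil(currentMemoryMb * expandRatio);
        return Math.min(expanded, maxMemoryMb);
    }

    long getOomKilledCount() {
        return oomKilledCount.get();
    }

    public static void main(String[] args) {
        OomKilledHandler handler = new OomKilledHandler(1.5, 4096);
        long next = handler.onContainerTerminated(OOM_KILLED_REASON, 1024);
        System.out.println("next pod memory (MB): " + next);
        System.out.println("OOM kills seen: " + handler.getOomKilledCount());
    }
}
```

The cap is a design choice of this sketch, so that repeated OOM kills cannot 
grow the request without bound; whether a cap is wanted is open for discussion.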

 

[1]: 
https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/#exceed-a-container-s-memory-limit

> Report oom killed pod in k8s resource manager
> ---------------------------------------------
>
>                 Key: FLINK-22594
>                 URL: https://issues.apache.org/jira/browse/FLINK-22594
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / Kubernetes
>            Reporter: Aitozi
>            Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
