[jira] [Commented] (FLINK-30036) Force delete pod when k8s node is not ready

Peng Yuan (Jira) Thu, 01 Dec 2022 23:31:06 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-30036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17642330#comment-17642330
 ]


Peng Yuan commented on FLINK-30036:
-----------------------------------

The TM pod is stuck in the Terminating status:
!https://intranetproxy.alipay.com/skylark/lark/0/2022/png/44456401/1669954379685-387ff2de-2e29-4b43-9ddf-90ca1b185671.png|width=1429,id=uec269415!
The conditions of the node that the pod is running on are:
!https://intranetproxy.alipay.com/skylark/lark/0/2022/png/44456401/1669953309755-6d6d95b9-bdf3-4c0c-bb55-d13a459c8dbc.png?x-oss-process=image%2Fresize%2Cw_1500%2Climit_0!
!https://intranetproxy.alipay.com/skylark/lark/0/2022/png/44456401/1669953318718-81b4db9a-1bc1-46b7-8f94-792819c4d277.png?x-oss-process=image%2Fresize%2Cw_1500%2Climit_0!
 We can see that when the kubelet cannot post node status, the node status 
becomes Unknown.
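
To make the observation above concrete, here is a minimal sketch of how the Ready condition of a node could be classified. It is not Flink code: the function names and the sample condition list are assumptions for illustration, and the sample only mimics what a node typically reports once the kubelet stops posting status.

```python
# Illustrative sketch only; the function names are hypothetical.

def node_ready_state(conditions):
    """Return the status of the node's Ready condition.

    `conditions` is a list of dicts shaped like the entries of
    node.status.conditions. A missing Ready condition is treated
    as "Unknown" (an assumption made for this sketch).
    """
    for cond in conditions:
        if cond.get("type") == "Ready":
            return cond.get("status", "Unknown")
    return "Unknown"


def node_is_not_ready(conditions):
    """Treat both "False" and "Unknown" as not ready."""
    return node_ready_state(conditions) != "True"


# Sample data (assumed): roughly what is reported when the kubelet
# can no longer post node status.
sample_conditions = [
    {
        "type": "Ready",
        "status": "Unknown",
        "reason": "NodeStatusUnknown",
        "message": "Kubelet stopped posting node status.",
    },
]

if __name__ == "__main__":
    print(node_ready_state(sample_conditions))
    print(node_is_not_ready(sample_conditions))
```

Treating Unknown the same as False matters here, because a node whose kubelet is unreachable never reports an explicit False.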
 

> Force delete pod when k8s node is not ready
> -------------------------------------------
>
>                 Key: FLINK-30036
>                 URL: https://issues.apache.org/jira/browse/FLINK-30036
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / Kubernetes
>            Reporter: Peng Yuan
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2022-11-17-10-25-59-945.png
>
>
> When a K8s node is in the NotReady state, the TaskManager pods scheduled on 
> it remain stuck in the Terminating state. If the Flink cluster has a strict 
> resource quota, these terminating pods keep holding their resources. As a 
> result, new TaskManager pods cannot obtain resources and cannot be started.
>  
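
The decision described in the issue could be sketched roughly as follows. This is a hedged illustration, not the change proposed in the pull request: the function names, the `extra_wait` parameter, and its five-minute default are all assumptions. The sketch only decides whether a force delete is warranted; the deletion itself would be a delete request with a grace period of 0 (for example `kubectl delete pod <name> --grace-period=0 --force`).

```python
from datetime import datetime, timedelta, timezone

# Illustrative sketch only; all names and defaults are hypothetical.


def node_not_ready(conditions):
    """True if the node's Ready condition is anything other than "True"."""
    for cond in conditions:
        if cond.get("type") == "Ready":
            return cond.get("status") != "True"
    # No Ready condition reported: treated as not ready (assumption).
    return True


def should_force_delete(deletion_deadline, node_conditions, now,
                        extra_wait=timedelta(minutes=5)):
    """Decide whether a terminating pod should be force deleted.

    deletion_deadline: datetime taken from the pod's deletionTimestamp,
        or None if the pod is not being deleted.
    node_conditions: list of dicts shaped like node.status.conditions.
    extra_wait: how long past the deadline to wait before forcing.
    """
    if deletion_deadline is None:
        return False  # pod is not terminating
    if not node_not_ready(node_conditions):
        return False  # kubelet is reachable and can finish the deletion
    return now - deletion_deadline >= extra_wait


if __name__ == "__main__":
    deadline = datetime(2022, 12, 1, 7, 0, tzinfo=timezone.utc)
    unknown_node = [{"type": "Ready", "status": "Unknown"}]
    print(should_force_delete(deadline, unknown_node,
                              deadline + timedelta(minutes=10)))
```

Waiting for `extra_wait` before forcing avoids deleting pods on a node that is only briefly unreachable; a stricter quota may justify a shorter wait.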



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
