[ 
https://issues.apache.org/jira/browse/FLINK-30036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17635064#comment-17635064
 ] 

Yang Wang commented on FLINK-30036:
-----------------------------------

After more investigation, it seems that terminating pods are counted toward 
the used quota, so I think this ticket is a valid issue. We may need a config 
option to enable force-delete when a pod might be stuck in the Terminating 
state (e.g. when the node is not ready).

I have one more concern: a node being not ready does not always mean the pod 
will be stuck in the Terminating state. A force delete will send a SIGKILL to 
the pod, and the TM will not get a chance to clean up.
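
To make the concern concrete, here is a rough sketch of how such an opt-in 
check could look. Everything in it is hypothetical: the class name, the 
parameters, and in particular the timeout are assumptions for illustration, 
not existing Flink options or code. The idea is to force-delete only when the 
option is enabled, the node is not ready, and the pod has already been 
terminating for longer than a grace timeout, so that a TM that can still shut 
down gracefully keeps its chance to clean up.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper for illustration; none of these names exist in Flink. */
final class ForceDeletePolicy {
    private ForceDeletePolicy() {}

    /**
     * Decides whether a pod should be force-deleted.
     *
     * @param forceDeleteEnabled value of the (assumed) opt-in config option
     * @param nodeReady whether the pod's node currently reports Ready
     * @param deletionTimestamp when graceful deletion was requested,
     *                          or null if the pod is not terminating
     * @param now current time
     * @param stuckTimeout (assumed) time to wait for graceful termination
     */
    static boolean shouldForceDelete(
            boolean forceDeleteEnabled,
            boolean nodeReady,
            Instant deletionTimestamp,
            Instant now,
            Duration stuckTimeout) {
        // Never force-delete unless the user opted in and the pod is terminating.
        if (!forceDeleteEnabled || deletionTimestamp == null) {
            return false;
        }
        // On a ready node the kubelet can still finish the graceful shutdown.
        if (nodeReady) {
            return false;
        }
        // Node not ready: only treat the pod as stuck after the timeout elapsed.
        Duration terminatingFor = Duration.between(deletionTimestamp, now);
        return terminatingFor.compareTo(stuckTimeout) > 0;
    }
}
```

With a check like this, a pod on a not-ready node is left alone until the 
timeout has passed, which limits how often the TM is killed without clean-up.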

> Force delete pod when  k8s node is not ready
> --------------------------------------------
>
>                 Key: FLINK-30036
>                 URL: https://issues.apache.org/jira/browse/FLINK-30036
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / Kubernetes
>            Reporter: Peng Yuan
>            Priority: Major
>              Labels: pull-request-available
>
> When the K8s node is in the NotReady state, the taskmanager pod scheduled on 
> it stays in the Terminating state indefinitely. When the Flink cluster has a 
> strict quota, the terminating pod keeps holding its resources. As a result, 
> the new taskmanager pod cannot obtain resources and cannot be started.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
