[ https://issues.apache.org/jira/browse/FLINK-27576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17540608#comment-17540608 ]
Aitozi commented on FLINK-27576:
--------------------------------
Hi [~zhisheng], I have opened a PR for this. Please let me know if you have any
suggestions, thanks.
https://github.com/apache/flink/pull/19786
> Flink requests a new pod when the JM pod is deleted, but removes it when the
> TaskExecutor exceeds the idle timeout
> --------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-27576
> URL: https://issues.apache.org/jira/browse/FLINK-27576
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes
> Affects Versions: 1.12.0
> Reporter: zhisheng
> Priority: Major
> Attachments: image-2022-05-11-20-06-58-955.png,
> image-2022-05-11-20-08-01-739.png, jobmanager_log.txt
>
>
> With Flink 1.12.0, HA (ZooKeeper) and checkpointing are enabled. When I delete the
> JM pod with kubectl delete, the job requests a new JM pod and fails over from the
> last checkpoint, which is expected. However, it also requests a new TM pod that is
> never actually used; the new TM pod is closed once the TaskExecutor exceeds the
> idle timeout. Since the job actually uses the old TM, why does it need to request
> a new TM pod? Will the job fail if the cluster has no resources for the new TM?
> Can we optimize this and reuse the old TM directly?
>
> [^jobmanager_log.txt]
> ^!image-2022-05-11-20-06-58-955.png!^
> ^!image-2022-05-11-20-08-01-739.png|width=857,height=324!^