[
https://issues.apache.org/jira/browse/FLINK-36451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matthias Pohl resolved FLINK-36451.
-----------------------------------
Resolution: Fixed
> Kubernetes Application JobManager Potential Deadlock and TaskManager Pod
> Residuals
> ----------------------------------------------------------------------------------
>
> Key: FLINK-36451
> URL: https://issues.apache.org/jira/browse/FLINK-36451
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.19.1
> Environment: * Flink version: 1.19.1
> * Deployment mode: Flink Kubernetes Application Mode
> * JVM version: OpenJDK 17
>
> Reporter: xiechenling
> Assignee: Matthias Pohl
> Priority: Major
> Labels: pull-request-available
> Attachments: 1.png, 2.png, jobmanager.log, jstack.txt
>
>
> In Kubernetes Application Mode, significant etcd latency or instability can
> cause the Flink JobManager to deadlock. In addition, TaskManager pods are not
> cleaned up properly, leaving stale resources that prevent the Flink job from
> recovering correctly. The issue occurs during frequent service restarts or
> network instability.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)