[
https://issues.apache.org/jira/browse/FLINK-24315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yangze Guo updated FLINK-24315:
-------------------------------
Description:
In the native k8s integration, Flink tries to rebuild the watcher thread if
the API server is temporarily unavailable. However, if the jitter lasts longer
than the web socket timeout, rebuilding the watcher times out and Flink can no
longer handle pod events correctly.
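
To illustrate the failure mode, here is a minimal sketch of a rebuild loop that
keeps retrying when a single attempt times out, instead of giving up after the
first one. It is not the actual Flink fix and does not use the fabric8 client
API. The names rebuild_with_retry, FlakyApiServer and watch_pods, and the
backoff values, are all hypothetical and chosen only for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def rebuild_with_retry(
    create_watch: Callable[[], T],
    max_attempts: int = 5,
    initial_backoff_s: float = 0.5,
    max_backoff_s: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call create_watch until it succeeds or max_attempts is exhausted.

    Hypothetical helper: a timeout or connection error in one attempt
    triggers another attempt after an exponentially growing delay.
    """
    backoff = initial_backoff_s
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return create_watch()
        except (TimeoutError, ConnectionError) as err:
            last_error = err
            if attempt == max_attempts:
                break
            sleep(backoff)
            # Double the delay for the next attempt, up to the cap.
            backoff = min(backoff * 2, max_backoff_s)
    raise RuntimeError(
        f"could not rebuild watcher after {max_attempts} attempts"
    ) from last_error


class FlakyApiServer:
    """Stand-in for an API server that is unavailable for a while (assumption)."""

    def __init__(self, failures: int) -> None:
        self.remaining_failures = failures

    def watch_pods(self) -> str:
        # Simulate the web socket timing out while the server is unavailable.
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise TimeoutError("web socket handshake timed out")
        return "watch-handle"
```

Calling rebuild_with_retry(FlakyApiServer(failures=2).watch_pods) returns the
watch handle on the third attempt. If the outage outlasts every attempt, the
helper raises an error, so the caller can react instead of silently running
without a watcher.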
was:
The JobManager uses the fabric8 Kubernetes client to watch the API server.
When the k8s API server or the network has problems, the watcher thread is
closed. You can use "jstack 1 | grep -i 'websocket'" to check whether the
watcher thread still exists.
The watcher thread of k8s client will be closed while
> Cannot rebuild watcher thread while the K8S API server is unavailable
> ---------------------------------------------------------------------
>
> Key: FLINK-24315
> URL: https://issues.apache.org/jira/browse/FLINK-24315
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes
> Affects Versions: 1.14.0, 1.13.2
> Reporter: ouyangwulin
> Priority: Major
> Fix For: 1.13.3, 1.14.1
>
>
> In the native k8s integration, Flink tries to rebuild the watcher thread if
> the API server is temporarily unavailable. However, if the jitter lasts
> longer than the web socket timeout, rebuilding the watcher times out and
> Flink can no longer handle pod events correctly.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)