[jira] [Updated] (FLINK-24315) Cannot rebuild watcher thread while the K8S API server is unavailable

2021-09-28 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song updated FLINK-24315:
-
Fix Version/s: (was: 1.14.0)
   1.14.1

> Cannot rebuild watcher thread while the K8S API server is unavailable
> -
>
> Key: FLINK-24315
> URL: https://issues.apache.org/jira/browse/FLINK-24315
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes
> Affects Versions: 1.14.0, 1.13.2
> Reporter: ouyangwulin
> Assignee: Yangze Guo
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.13.3, 1.15.0, 1.14.1
>
>
> In the native Kubernetes integration, Flink tries to rebuild the watcher thread if
> the API server is temporarily unavailable. However, if the API server stays
> unavailable longer than the web socket timeout, rebuilding the watcher times out
> and Flink can no longer handle pod events correctly.
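
The failure mode in the description above comes down to a single rebuild attempt that times out and is never repeated. As an illustration only, the following Java sketch retries a connection attempt with exponential backoff until it succeeds or a maximum number of attempts is reached. It is not the change made for this issue and does not use Flink or fabric8 classes; `WatcherRebuildSketch`, `rebuildWithRetry`, and all of its parameters are hypothetical names, and the `connect` callable stands in for whatever call actually opens the watch.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of Flink or the fabric8 client. */
final class WatcherRebuildSketch {

    private WatcherRebuildSketch() {}

    /**
     * Calls {@code connect} until it succeeds or {@code maxAttempts} is exhausted,
     * sleeping between failed attempts with a backoff that doubles up to
     * {@code maxBackoffMs}. Rethrows the last failure if every attempt fails.
     */
    static <T> T rebuildWithRetry(
            Callable<T> connect, int maxAttempts, long initialBackoffMs, long maxBackoffMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long backoffMs = initialBackoffMs;
        Exception lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // Assumption: connect opens the watch and throws on timeout.
                return connect.call();
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop retrying.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                lastFailure = e;
            }
            if (attempt < maxAttempts) {
                Thread.sleep(backoffMs);
                backoffMs = Math.min(backoffMs * 2, maxBackoffMs);
            }
        }
        throw lastFailure;
    }
}
```

With a bounded number of attempts the caller still has to decide what to do when the API server never comes back, for example failing fatally so that the process is restarted; that policy is outside the scope of this sketch.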



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-24315) Cannot rebuild watcher thread while the K8S API server is unavailable

2021-09-23 Thread Yang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Wang updated FLINK-24315:
--
Fix Version/s: (was: 1.14.1)
   1.15.0
   1.14.0

> Cannot rebuild watcher thread while the K8S API server is unavailable
> -
>
> Key: FLINK-24315
> URL: https://issues.apache.org/jira/browse/FLINK-24315
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes
> Affects Versions: 1.14.0, 1.13.2
> Reporter: ouyangwulin
> Assignee: Yangze Guo
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.14.0, 1.13.3, 1.15.0
>
>
> In the native Kubernetes integration, Flink tries to rebuild the watcher thread if
> the API server is temporarily unavailable. However, if the API server stays
> unavailable longer than the web socket timeout, rebuilding the watcher times out
> and Flink can no longer handle pod events correctly.





[jira] [Updated] (FLINK-24315) Cannot rebuild watcher thread while the K8S API server is unavailable

2021-09-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-24315:
---
Labels: pull-request-available  (was: )

> Cannot rebuild watcher thread while the K8S API server is unavailable
> -
>
> Key: FLINK-24315
> URL: https://issues.apache.org/jira/browse/FLINK-24315
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes
> Affects Versions: 1.14.0, 1.13.2
> Reporter: ouyangwulin
> Assignee: Yangze Guo
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.13.3, 1.14.1
>
>
> In the native Kubernetes integration, Flink tries to rebuild the watcher thread if
> the API server is temporarily unavailable. However, if the API server stays
> unavailable longer than the web socket timeout, rebuilding the watcher times out
> and Flink can no longer handle pod events correctly.





[jira] [Updated] (FLINK-24315) Cannot rebuild watcher thread while the K8S API server is unavailable

2021-09-17 Thread Yangze Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yangze Guo updated FLINK-24315:
---
Description: 
In the native Kubernetes integration, Flink tries to rebuild the watcher thread if
the API server is temporarily unavailable. However, if the API server stays
unavailable longer than the web socket timeout, rebuilding the watcher times out
and Flink can no longer handle pod events correctly.


  was:
The JobManager uses the fabric8 Kubernetes client to watch the API server. When the
Kubernetes API server or the network has problems, the watcher thread is closed. Run
"jstack 1 | grep -i 'websocket'" to check whether the watcher thread still exists.

The watcher thread of k8s client will be closed while 


> Cannot rebuild watcher thread while the K8S API server is unavailable
> -
>
> Key: FLINK-24315
> URL: https://issues.apache.org/jira/browse/FLINK-24315
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes
> Affects Versions: 1.14.0, 1.13.2
> Reporter: ouyangwulin
> Priority: Major
> Fix For: 1.13.3, 1.14.1
>
>
> In the native Kubernetes integration, Flink tries to rebuild the watcher thread if
> the API server is temporarily unavailable. However, if the API server stays
> unavailable longer than the web socket timeout, rebuilding the watcher times out
> and Flink can no longer handle pod events correctly.





[jira] [Updated] (FLINK-24315) Cannot rebuild watcher thread while the K8S API server is unavailable

2021-09-17 Thread Yangze Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yangze Guo updated FLINK-24315:
---
Description: 
The JobManager uses the fabric8 Kubernetes client to watch the API server. When the
Kubernetes API server or the network has problems, the watcher thread is closed. Run
"jstack 1 | grep -i 'websocket'" to check whether the watcher thread still exists.

The watcher thread of k8s client will be closed while 

  was: The JobManager uses the fabric8 Kubernetes client to watch the API server. When the
Kubernetes API server or the network has problems, the watcher thread is closed. Run
"jstack 1 | grep -i 'websocket'" to check whether the watcher thread still exists.


> Cannot rebuild watcher thread while the K8S API server is unavailable
> -
>
> Key: FLINK-24315
> URL: https://issues.apache.org/jira/browse/FLINK-24315
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes
> Affects Versions: 1.14.0, 1.13.2
> Reporter: ouyangwulin
> Priority: Major
> Fix For: 1.13.3, 1.14.1
>
>
> The JobManager uses the fabric8 Kubernetes client to watch the API server. When the
> Kubernetes API server or the network has problems, the watcher thread is closed. Run
> "jstack 1 | grep -i 'websocket'" to check whether the watcher thread still exists.
> The watcher thread of k8s client will be closed while 





[jira] [Updated] (FLINK-24315) Cannot rebuild watcher thread while the K8S API server is unavailable

2021-09-17 Thread Yangze Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yangze Guo updated FLINK-24315:
---
Affects Version/s: (was: 1.14.1)

> Cannot rebuild watcher thread while the K8S API server is unavailable
> -
>
> Key: FLINK-24315
> URL: https://issues.apache.org/jira/browse/FLINK-24315
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes
> Affects Versions: 1.14.0, 1.13.2
> Reporter: ouyangwulin
> Priority: Major
> Fix For: 1.13.3, 1.14.1
>
>
> The JobManager uses the fabric8 Kubernetes client to watch the API server. When the
> Kubernetes API server or the network has problems, the watcher thread is closed. Run
> "jstack 1 | grep -i 'websocket'" to check whether the watcher thread still exists.





[jira] [Updated] (FLINK-24315) Cannot rebuild watcher thread while the K8S API server is unavailable

2021-09-17 Thread Yangze Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yangze Guo updated FLINK-24315:
---
Fix Version/s: (was: 1.13.2)
   (was: 1.14.0)
   1.13.3

> Cannot rebuild watcher thread while the K8S API server is unavailable
> -
>
> Key: FLINK-24315
> URL: https://issues.apache.org/jira/browse/FLINK-24315
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes
> Affects Versions: 1.14.0, 1.13.2, 1.14.1
> Reporter: ouyangwulin
> Priority: Major
> Fix For: 1.13.3, 1.14.1
>
>
> The JobManager uses the fabric8 Kubernetes client to watch the API server. When the
> Kubernetes API server or the network has problems, the watcher thread is closed. Run
> "jstack 1 | grep -i 'websocket'" to check whether the watcher thread still exists.





[jira] [Updated] (FLINK-24315) Cannot rebuild watcher thread while the K8S API server is unavailable

2021-09-17 Thread Yangze Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yangze Guo updated FLINK-24315:
---
Summary: Cannot rebuild watcher thread while the K8S API server is 
unavailable  (was: Flink native on k8s watcher thread will go down when the k8s API 
server does not work or the network times out)

> Cannot rebuild watcher thread while the K8S API server is unavailable
> -
>
> Key: FLINK-24315
> URL: https://issues.apache.org/jira/browse/FLINK-24315
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes
> Affects Versions: 1.14.0, 1.13.2, 1.14.1
> Reporter: ouyangwulin
> Priority: Major
> Fix For: 1.14.0, 1.13.2, 1.14.1
>
>
> The JobManager uses the fabric8 Kubernetes client to watch the API server. When the
> Kubernetes API server or the network has problems, the watcher thread is closed. Run
> "jstack 1 | grep -i 'websocket'" to check whether the watcher thread still exists.


