[jira] [Commented] (FLINK-10775) Quarantined address [akka.tcp://flink@flink-jobmanager:6123] is still unreachable or has not been restarted. Keeping it quarantined.

2019-06-04 Thread ChuanHaiTan (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855422#comment-16855422
 ] 

ChuanHaiTan commented on FLINK-10775:
-

Many thanks, Till Rohrmann! Sorry for the late reply.

We solved it by writing the entry “XXX.XXX.XXX.XX(service ip)  jobmanager”
into the "/etc/hosts" file of the taskmanager; after that it worked.

But now, when we apply the fix above, the taskmanager works at first. About
4 minutes later, the entry “XXX.XXX.XXX.XX(service ip)  jobmanager” has
disappeared from the taskmanager's "/etc/hosts", and the taskmanager is then
lost, as expected. Why? We are still on Flink 1.4.2.
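
To watch for the moment the entry disappears, a small check like the one below could be run periodically inside the taskmanager container. This is only a minimal sketch and is not part of Flink or Kubernetes: the function name `lookup_host` and the sample address 192.0.2.10 are hypothetical, and it assumes the standard hosts-file layout (an IP address followed by one or more names, with `#` starting a comment).

```python
from typing import Optional


def lookup_host(hosts_text: str, hostname: str) -> Optional[str]:
    """Return the first IP mapped to `hostname` in hosts-file text, or None."""
    for raw_line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if hostname in names:
            return ip
    return None


# Hypothetical sample content; 192.0.2.10 stands in for the service IP.
sample = "127.0.0.1 localhost\n192.0.2.10 jobmanager\n"
found = lookup_host(sample, "jobmanager")
missing = lookup_host("127.0.0.1 localhost\n", "jobmanager")
```

To check the real file, pass in the contents of "/etc/hosts", for example `lookup_host(open("/etc/hosts").read(), "jobmanager")`; a `None` result means the entry is gone.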

 

 

> Quarantined address [akka.tcp://flink@flink-jobmanager:6123] is still 
> unreachable or has not been restarted. Keeping it quarantined.
> 
>
> Key: FLINK-10775
> URL: https://issues.apache.org/jira/browse/FLINK-10775
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.4.2
> Environment: k8s+docker 
> standalone (1jobmanager + 5taskmanager)
> taskmanager.slotnum=4
> Reporter: ChuanHaiTan
> Priority: Blocker
> Labels: k8s+docker, usability
> Attachments: 
> logs-from-flink-jobmanager-in-flink-jobmanager-65c8d85f4f-5fm2d.txt, 
> logs-from-flink-taskmanager-in-flink-taskmanager-758575577d-7lw82.txt, 
> logs-from-flink-taskmanager-in-flink-taskmanager-758575577d-qbj9g.txt, 
> 微信图片_20181031171312.png, 微信图片_20181031171316.png
>
>
> In the k8s+docker environment, 1 jobmanager container and 5 taskmanager 
> containers run as a standalone cluster.
> For some reason, the jobmanager was rebooted and two of the taskmanagers 
> were also rebooted. Two of the remaining three taskmanagers do not connect 
> to the jobmanager, which results in errors reporting insufficient slot 
> resources.
> The attachments are the jobmanager log, the logs of the two disconnected 
> taskmanagers, and screenshots from Flink at the time showing all available 
> and unavailable taskmanagers.
> It is strange that the two rebooted taskmanagers can connect to the 
> jobmanager, and that one of the three taskmanagers that were not rebooted 
> can connect as well.
> Why? Can the cause of the restart be analyzed from the logs? Thank you.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10775) Quarantined address [akka.tcp://flink@flink-jobmanager:6123] is still unreachable or has not been restarted. Keeping it quarantined.

2019-02-21 Thread Till Rohrmann (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774177#comment-16774177
 ] 

Till Rohrmann commented on FLINK-10775:
---

Yes, I will close this issue since we no longer support Flink 1.4.



[jira] [Commented] (FLINK-10775) Quarantined address [akka.tcp://flink@flink-jobmanager:6123] is still unreachable or has not been restarted. Keeping it quarantined.

2019-02-21 Thread Stefan Richter (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773853#comment-16773853
 ] 

Stefan Richter commented on FLINK-10775:


[~till.rohrmann] can this issue be closed then?



[jira] [Commented] (FLINK-10775) Quarantined address [akka.tcp://flink@flink-jobmanager:6123] is still unreachable or has not been restarted. Keeping it quarantined.

2018-12-02 Thread miki haiat (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706252#comment-16706252
 ] 

miki haiat commented on FLINK-10775:


I had this issue as well on 1.4.x.

I can confirm that this issue no longer exists on 1.5.5 and 1.6.x.



[jira] [Commented] (FLINK-10775) Quarantined address [akka.tcp://flink@flink-jobmanager:6123] is still unreachable or has not been restarted. Keeping it quarantined.

2018-11-08 Thread Till Rohrmann (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679529#comment-16679529
 ] 

Till Rohrmann commented on FLINK-10775:
---

Have you tried using Flink's latest version {{1.5.5}} or {{1.6.2}}? The 
community no longer supports {{1.4.x}}.
