[jira] [Commented] (FLINK-10775) Quarantined address [akka.tcp://flink@flink-jobmanager:6123] is still unreachable or has not been restarted. Keeping it quarantined.
[ https://issues.apache.org/jira/browse/FLINK-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855422#comment-16855422 ]

ChuanHaiTan commented on FLINK-10775:
-------------------------------------

Many thanks, Till Rohrmann! Sorry for taking so long to reply.

We had solved it by writing "XXX.XXX.XXX.XX(service ip) jobmanager" into the "/etc/hosts" file of the taskmanager, and then it worked.
But now, when we do the same thing, the taskmanager works at first. About 4 minutes later, the entry "XXX.XXX.XXX.XX(service ip) jobmanager" has disappeared from "/etc/hosts" of the taskmanager, and the taskmanager is then lost, as expected. Why?
We are still on Flink 1.4.2.

> Quarantined address [akka.tcp://flink@flink-jobmanager:6123] is still unreachable or has not been restarted. Keeping it quarantined.
> ---------------
>
>                 Key: FLINK-10775
>                 URL: https://issues.apache.org/jira/browse/FLINK-10775
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.4.2
>         Environment: k8s+docker
> standalone (1jobmanager + 5taskmanager)
> taskmanager.slotnum=4
>            Reporter: ChuanHaiTan
>            Priority: Blocker
>              Labels: k8s+docker, usability
>         Attachments: logs-from-flink-jobmanager-in-flink-jobmanager-65c8d85f4f-5fm2d.txt, logs-from-flink-taskmanager-in-flink-taskmanager-758575577d-7lw82.txt, logs-from-flink-taskmanager-in-flink-taskmanager-758575577d-qbj9g.txt, 微信图片_20181031171312.png, 微信图片_20181031171316.png
>
>
> In the k8s+docker environment, 1 jobmanager container and 5 taskmanager containers run as a standalone cluster.
> But for some reason, the jobmanager was rebooted, two of the taskmanagers were also rebooted, and two of the remaining three taskmanagers don't connect to the jobmanager, resulting in errors reporting insufficient slot resources.
> The attachments are the jobmanager log, the logs of the two disconnected taskmanagers, and screenshots from Flink of all available and unavailable taskmanagers at the time.
> It is strange that the two rebooted taskmanagers can connect to the jobmanager, while only one of the three taskmanagers that were not rebooted can connect.
> Why? Can the cause of the restart be determined from the logs? Thank you.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
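
The disappearing "/etc/hosts" entry described in the comment above could be tracked with a small check run inside the taskmanager container. What follows is only a sketch under stated assumptions: the hostname `jobmanager` and the path `/etc/hosts` are taken from the comment, while `find_host_entry`, `resolves` and `check` are hypothetical helper names and not part of Flink or Kubernetes. It does not explain why the entry vanishes; it may only help to log when it does.

```python
import socket
from typing import Optional


def find_host_entry(hosts_text: str, hostname: str) -> Optional[str]:
    """Return the IP mapped to `hostname` in hosts-file text, or None."""
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if hostname in names:
            return ip
    return None


def resolves(hostname: str) -> bool:
    """Report whether the hostname currently resolves on this machine."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def check(hostname: str = "jobmanager", hosts_path: str = "/etc/hosts") -> str:
    """Summarize the hosts-file entry and the resolution state (assumed names)."""
    with open(hosts_path, encoding="utf-8") as f:
        entry = find_host_entry(f.read(), hostname)
    return f"hosts entry={entry} resolves={resolves(hostname)}"
```

Calling `check()` periodically, for example once a minute, and writing the result to a log would show roughly when the entry is removed and whether name resolution fails at the same moment.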
[jira] [Commented] (FLINK-10775) Quarantined address [akka.tcp://flink@flink-jobmanager:6123] is still unreachable or has not been restarted. Keeping it quarantined.
[ https://issues.apache.org/jira/browse/FLINK-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774177#comment-16774177 ]

Till Rohrmann commented on FLINK-10775:
---------------------------------------

Yes, I will close this issue since we no longer support Flink 1.4.
[jira] [Commented] (FLINK-10775) Quarantined address [akka.tcp://flink@flink-jobmanager:6123] is still unreachable or has not been restarted. Keeping it quarantined.
[ https://issues.apache.org/jira/browse/FLINK-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773853#comment-16773853 ]

Stefan Richter commented on FLINK-10775:
----------------------------------------

[~till.rohrmann] can this issue be closed then?
[jira] [Commented] (FLINK-10775) Quarantined address [akka.tcp://flink@flink-jobmanager:6123] is still unreachable or has not been restarted. Keeping it quarantined.
[ https://issues.apache.org/jira/browse/FLINK-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706252#comment-16706252 ]

miki haiat commented on FLINK-10775:
------------------------------------

I had this issue as well on 1.4.x.
I can confirm that on 1.5.5 and 1.6.x this issue no longer exists.
[jira] [Commented] (FLINK-10775) Quarantined address [akka.tcp://flink@flink-jobmanager:6123] is still unreachable or has not been restarted. Keeping it quarantined.
[ https://issues.apache.org/jira/browse/FLINK-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679529#comment-16679529 ]

Till Rohrmann commented on FLINK-10775:
---------------------------------------

Have you tried using Flink's latest version {{1.5.5}} or {{1.6.2}}? The community no longer supports {{1.4.x}}.