Ufuk Celebi created FLINK-2970:
----------------------------------
Summary: Yarn client cannot connect to new job manager
Key: FLINK-2970
URL: https://issues.apache.org/jira/browse/FLINK-2970
Project: Flink
Issue Type: Bug
Components: Distributed Runtime, YARN Client
Affects Versions: 0.10
Reporter: Ufuk Celebi
I'm running a YARN session with 2 physical nodes and 5 containers
(ApplicationMaster and 4 TaskManagers). There is no Flink program submitted to
the cluster.
After a sequence of induced failures (killing the ApplicationMaster and
TaskManager containers), I sometimes see the following warning repeated
indefinitely:
{code}
15:45:29,719 WARN Remoting
- Tried to associate with unreachable remote address
[akka.tcp://flink@10.240.0.3:58926]. Address is now gated for 5000 ms, all
messages to this address will be delivered to dead letters. Reason: Connection
refused: /10.240.0.3:58926
{code}
The new ApplicationMaster container has been started, though.
I would not block the RC on this; it can be addressed in 0.10.1 or 1.0.
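To tell whether the address in the warning is stale or whether the new
ApplicationMaster is simply not listening yet, a plain TCP probe from the
client machine may help. The following is only a minimal sketch, assuming
Python is available on the client host; `is_reachable` is a hypothetical
helper and not part of Flink or Akka.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical diagnostic helper; it only checks that something accepts
    connections on the port, not that it is a Flink JobManager.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False
```

For example, `is_reachable("10.240.0.3", 58926)` probes the address from the
log above. If it keeps returning False while the new ApplicationMaster is
running, the client is probably still retrying the old address.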