Thanks for the response, and sorry for the delay. The ports logged by the JobManager and TaskManager are open!
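
A check like this can also be scripted instead of using telnet. The following is only a minimal Python sketch: the helper name `port_is_open` is hypothetical, and the host and port in the example loop are placeholders to be replaced with the addresses and ports that appear in your own JobManager and TaskManager logs.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder targets (assumption): substitute the host/port pairs from your logs.
    targets = [("127.0.0.1", 6123)]
    for host, port in targets:
        state = "open" if port_is_open(host, port) else "unreachable"
        print(f"{host}:{port} is {state}")
```

Note that a successful TCP connection only shows the port is reachable from the machine running the script; it should be run from the TaskManager host towards the JobManager, and the other way round.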
Is this log OK?

2018-01-15 13:40:03,455 INFO org.apache.flink.runtime.webmonitor.JobManagerRetriever - New leader reachable under akka.tcp://[email protected]:6123/user/jobmanager:null.

When I kill the TaskManager, the JobManager logs:

2018-01-15 13:32:41,419 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink@stage_dbq_1:45532] has failed, address is now gated for [5000] ms. Reason: [Disassociated]

But it does not decrement the number of available TaskManagers! And when I start my single TaskManager again, it logs:

2018-01-15 13:32:52,753 INFO org.apache.flink.runtime.instance.InstanceManager - Registered TaskManager at ??? (akka://flink/deadLetters) as 626846ae27a833cb094eeeb047a6a72c. Current number of registered hosts is 2. Current number of alive task slots is 40.

On Wed, Jan 10, 2018 at 11:36 AM, Piotr Nowojski <[email protected]> wrote:

> Hi,
>
> Search both job manager and task manager logs for IP address(es) and
> port(s) that have timed out. First of all make sure that nodes are visible
> to each other using some simple ping. Afterwards please check that those
> timed-out ports are opened and not blocked by some firewall (telnet).
>
> You can search the documentation for the configuration parameters with
> “port” in name:
> https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/config.html
> But note that many of them are random by default.
>
> Piotrek
>
> On 9 Jan 2018, at 17:56, Reza Samee <[email protected]> wrote:
>
> I'm running a flink-cluster (a mini one with just one node); but the
> problem is that my TaskManager can't reach my JobManager!
>
> Here are logs from TaskManager
> ...
> Trying to register at JobManager akka.tcp://flink@MY_PRIV_IP/user/jobmanager (attempt 20, timeout: 30 seconds)
> Trying to register at JobManager akka.tcp://flink@MY_PRIV_IP/user/jobmanager (attempt 21, timeout: 30 seconds)
> Trying to register at JobManager akka.tcp://flink@MY_PRIV_IP/user/jobmanager (attempt 22, timeout: 30 seconds)
> Trying to register at JobManager akka.tcp://flink@MY_PRIV_IP/user/jobmanager (attempt 23, timeout: 30 seconds)
> Trying to register at JobManager akka.tcp://flink@MY_PRIV_IP/user/jobmanager (attempt 24, timeout: 30 seconds)
> ...
>
> My "JobManager UI" shows my TaskManager with this Path & ID:
> "akka://flink/deadLetters" (in TaskManagers tab)
> And I found these lines in my JobManager stdout:
>
> Resource Manager associating with leading JobManager Actor[akka://flink/user/jobmanager#-275619168] - leader session null
> TaskManager ResourceID{resourceId='1132cbdaf2d8204e5e42e321e8592754'} has started.
> Registered TaskManager at MY_PRIV_IP (akka://flink/deadLetters) as 7d9568445b4557a74d05a0771a08ad9c. Current number of registered hosts is 1. Current number of alive task slots is 20.
>
> What's the meaning of these lines? Where should I look for the solution?
>
> --
> رضا سامعی / http://samee.blog.ir

--
رضا سامعی / http://samee.blog.ir
