I've tried setting the yarn.application-master.port property in flink-conf.yaml to a range, as suggested in https://ci.apache.org/projects/flink/flink-docs-master/setup/yarn_setup.html#running-flink-on-yarn-behind-firewalls
The JobManager does not seem to be picking the property up. Am I setting this in the wrong place? Or is there another way to enforce this property?

Cheers,

Pieter

2016-02-07 20:04 GMT+01:00 Pieter Hameete <phame...@gmail.com>:

> I found the relevant information on the website. I'll consult with the
> cluster admin tomorrow, thanks for the help :-)
>
> - Pieter
>
> 2016-02-07 19:31 GMT+01:00 Robert Metzger <rmetz...@apache.org>:
>
>> Hi,
>>
>> we had other users with a similar issue as well. There is a configuration
>> value which allows you to specify a single port or a range of ports for the
>> JobManager to allocate when running on YARN.
>> Note that when using this with a single port, the JMs may collide.
>>
>> On Sun, Feb 7, 2016 at 7:25 PM, Pieter Hameete <phame...@gmail.com>
>> wrote:
>>
>>> Hi Stephan,
>>>
>>> surely it seems this way! I must not be the first with this issue
>>> though? I'll have to contact the cluster admins to find a solution
>>> together. What would be a way to make the JobManagers accessible from
>>> outside the network, given that the IP and port number change every time?
>>>
>>> Alternatively, I can ask for ssh access to a node within the network.
>>> That will surely work, but it's not my preferred solution.
>>>
>>> - Pieter
>>>
>>> 2016-02-06 16:22 GMT+01:00 Stephan Ewen <se...@apache.org>:
>>>
>>>> Yeah, sounds a lot like the client cannot connect to the JobManager
>>>> port.
>>>>
>>>> The ports to communicate with HDFS and the YARN resource manager may be
>>>> whitelisted or forwarded, so you can submit the YARN session, but then not
>>>> connect to the JobManager afterwards.
>>>>
>>>> On Sat, Feb 6, 2016 at 2:11 PM, Pieter Hameete <phame...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Max!
>>>>>
>>>>> I'm using Flink 0.10.1 and indeed the cluster seems to be created
>>>>> fine, everything in the JobManager Web UI looks good.
>>>>>
>>>>> It seems like the JobManager initiates the connection with my VM and
>>>>> cannot reach it.
>>>>> It could be that this is similar to the problem here:
>>>>>
>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/spark-with-docker-errors-with-akka-NAT-td7702.html
>>>>>
>>>>> I probably have to make some changes to the networking configuration
>>>>> of my VM so it can be reached by the JobManager despite using a different
>>>>> port each time.
>>>>>
>>>>> - Pieter
>>>>>
>>>>> 2016-02-06 14:05 GMT+01:00 Maximilian Michels <m...@apache.org>:
>>>>>
>>>>>> Hi Pieter,
>>>>>>
>>>>>> Which version of Flink are you using? It appears you've created a
>>>>>> Flink YARN cluster but you can't reach the JobManager afterwards.
>>>>>>
>>>>>> Cheers,
>>>>>> Max
>>>>>>
>>>>>> On Sat, Feb 6, 2016 at 1:42 PM, Pieter Hameete <phame...@gmail.com>
>>>>>> wrote:
>>>>>> > Hi Robert,
>>>>>> >
>>>>>> > unfortunately there are no signs of what is going wrong in the logs. The
>>>>>> > last log messages are about successful registration of the TaskManagers.
>>>>>> >
>>>>>> > I'm also fairly sure it must be something in my VM that is causing this,
>>>>>> > because when I start the yarn-session from a login node that is on the same
>>>>>> > network as the hadoop cluster there are no problems registering with the
>>>>>> > JobManager. I did also notice the following message in the local console:
>>>>>> >
>>>>>> > 12:30:27,173 WARN Remoting
>>>>>> > - Tried to associate with unreachable remote address
>>>>>> > [akka.tcp://flink@145.100.41.13:41539]. Address is now gated for 5000 ms,
>>>>>> > all messages to this address will be delivered to dead letters. Reason:
>>>>>> > connection timed out: /145.100.41.13:41539
>>>>>> >
>>>>>> > I can ping the JobManager fine from within the VM. Could there be some invalid or
>>>>>> > missing configuration on my side?
>>>>>> >
>>>>>> > Cheers,
>>>>>> >
>>>>>> > Pieter
>>>>>> >
>>>>>> > 2016-02-06 12:54 GMT+01:00 Robert Metzger <rmetz...@apache.org>:
>>>>>> >>
>>>>>> >> Hi,
>>>>>> >>
>>>>>> >> did you check the logs of the JobManager itself? Maybe it'll tell us
>>>>>> >> already what's going on.
>>>>>> >>
>>>>>> >> On Sat, Feb 6, 2016 at 12:14 PM, Pieter Hameete <phame...@gmail.com>
>>>>>> >> wrote:
>>>>>> >>>
>>>>>> >>> Hi Guys!
>>>>>> >>>
>>>>>> >>> I'm attempting to run Flink on YARN, but I run into an issue. I'm starting
>>>>>> >>> the Flink YARN session from an Ubuntu 14.04 VM. All goes well until after
>>>>>> >>> the JobManager web UI is started:
>>>>>> >>>
>>>>>> >>> JobManager web interface address
>>>>>> >>> http://head05.hathi.surfsara.nl:8088/proxy/application_1452780322684_10532/
>>>>>> >>> Waiting until all TaskManagers have connected
>>>>>> >>> 11:09:51,557 INFO org.apache.flink.yarn.ApplicationClient
>>>>>> >>> - Notification about new leader address
>>>>>> >>> akka.tcp://flink@145.100.41.148:35666/user/jobmanager with session ID null.
>>>>>> >>> No status updates from the YARN cluster received so far. Waiting ...
>>>>>> >>> 11:09:51,578 INFO org.apache.flink.yarn.ApplicationClient
>>>>>> >>> - Received address of new leader
>>>>>> >>> akka.tcp://flink@145.100.41.148:35666/user/jobmanager with session ID null.
>>>>>> >>> 11:09:51,583 INFO org.apache.flink.yarn.ApplicationClient
>>>>>> >>> - Disconnect from JobManager null.
>>>>>> >>> 11:09:51,595 INFO org.apache.flink.yarn.ApplicationClient
>>>>>> >>> - Trying to register at JobManager
>>>>>> >>> akka.tcp://flink@145.100.41.148:35666/user/jobmanager.
>>>>>> >>> No status updates from the YARN cluster received so far. Waiting ...
>>>>>> >>> No status updates from the YARN cluster received so far. Waiting ...
>>>>>> >>>
>>>>>> >>> It then hangs on these last steps (trying to register, no status
>>>>>> >>> updates...).
>>>>>> >>>
>>>>>> >>> I'm sure there must be a problem on my side that is causing me not to be
>>>>>> >>> able to register at the JobManager. What could cause such connection
>>>>>> >>> problems?
>>>>>> >>>
>>>>>> >>> Any tips are very welcome :-)
>>>>>> >>>
>>>>>> >>> Cheers and have a good weekend!
>>>>>> >>>
>>>>>> >>> - Pieter
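
A note on the symptom reported in the thread: a successful ping only shows that ICMP gets through, while the Akka warning ("connection timed out") concerns a TCP connection to the JobManager's RPC port. Below is a minimal sketch for checking that port directly from the client VM, assuming Python 3 is available there. The helper name can_connect is hypothetical, and the address in the usage comment is the one from the log output quoted above, which changes with every YARN session.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it is not part of Flink or YARN.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example usage (address copied from the ApplicationClient log in the thread;
# replace it with the address printed by the current yarn-session):
#   can_connect("145.100.41.148", 35666)
```

If this returns False from the VM but True from a login node inside the cluster network, a firewall or NAT between the VM and the cluster is a likely cause, which would be consistent with Stephan's reading of the problem.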