I found the relevant information on the website. I'll consult with the cluster admin tomorrow, thanks for the help :-)
- Pieter

2016-02-07 19:31 GMT+01:00 Robert Metzger <rmetz...@apache.org>:

> Hi,
>
> we had other users with a similar issue as well. There is a configuration
> value which allows you to specify a single port or a range of ports for the
> JobManager to allocate when running on YARN.
> Note that when using this with a single port, the JMs may collide.
>
> On Sun, Feb 7, 2016 at 7:25 PM, Pieter Hameete <phame...@gmail.com> wrote:
>
>> Hi Stephan,
>>
>> surely it seems this way! I must not be the first with this issue though?
>> I'll have to contact the cluster admins to find a solution together. What
>> would be a way to make the JobManagers accessible from outside the network,
>> given that the IP and port number change every time?
>>
>> Alternatively, I can ask for ssh access to a node within the network.
>> That will surely work, but it's not my preferred solution.
>>
>> - Pieter
>>
>> 2016-02-06 16:22 GMT+01:00 Stephan Ewen <se...@apache.org>:
>>
>>> Yeah, sounds a lot like the client cannot connect to the JobManager port.
>>>
>>> The ports to communicate with HDFS and the YARN resource manager may be
>>> whitelisted or forwarded, so you can submit the YARN session, but then not
>>> connect to the JobManager afterwards.
>>>
>>> On Sat, Feb 6, 2016 at 2:11 PM, Pieter Hameete <phame...@gmail.com> wrote:
>>>
>>>> Hi Max!
>>>>
>>>> I'm using Flink 0.10.1 and indeed the cluster seems to be created fine,
>>>> everything in the JobManager Web UI looks good.
>>>>
>>>> It seems like the JobManager initiates the connection with my VM and
>>>> cannot reach it. It could be that this is similar to the problem here:
>>>>
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/spark-with-docker-errors-with-akka-NAT-td7702.html
>>>>
>>>> I probably have to make some changes to the networking configuration of
>>>> my VM so it can be reached by the JobManager despite using a different port
>>>> each time.
>>>>
>>>> - Pieter
>>>>
>>>> 2016-02-06 14:05 GMT+01:00 Maximilian Michels <m...@apache.org>:
>>>>
>>>>> Hi Pieter,
>>>>>
>>>>> Which version of Flink are you using? It appears you've created a
>>>>> Flink YARN cluster but you can't reach the JobManager afterwards.
>>>>>
>>>>> Cheers,
>>>>> Max
>>>>>
>>>>> On Sat, Feb 6, 2016 at 1:42 PM, Pieter Hameete <phame...@gmail.com> wrote:
>>>>> > Hi Robert,
>>>>> >
>>>>> > unfortunately there are no signs of what is going wrong in the logs. The
>>>>> > last log messages are about successful registration of the TaskManagers.
>>>>> >
>>>>> > I'm also fairly sure it must be something in my VM that is causing this,
>>>>> > because when I start the yarn-session from a login node that is on the same
>>>>> > network as the hadoop cluster there are no problems registering with the
>>>>> > JobManager. I did also notice the following message in the local console:
>>>>> >
>>>>> > 12:30:27,173 WARN Remoting - Tried to associate with unreachable remote address [akka.tcp://flink@145.100.41.13:41539]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: connection timed out: /145.100.41.13:41539
>>>>> >
>>>>> > I can ping the JobManager fine from within the VM. Could there be some
>>>>> > invalid or missing configuration on my side?
>>>>> >
>>>>> > Cheers,
>>>>> >
>>>>> > Pieter
>>>>> >
>>>>> > 2016-02-06 12:54 GMT+01:00 Robert Metzger <rmetz...@apache.org>:
>>>>> >>
>>>>> >> Hi,
>>>>> >>
>>>>> >> did you check the logs of the JobManager itself? Maybe it'll tell us
>>>>> >> already what's going on.
>>>>> >>
>>>>> >> On Sat, Feb 6, 2016 at 12:14 PM, Pieter Hameete <phame...@gmail.com> wrote:
>>>>> >>>
>>>>> >>> Hi Guys!
>>>>> >>>
>>>>> >>> I'm attempting to run Flink on YARN, but I run into an issue. I'm starting
>>>>> >>> the Flink YARN session from an Ubuntu 14.04 VM. All goes well until after
>>>>> >>> the JobManager web UI is started:
>>>>> >>>
>>>>> >>> JobManager web interface address
>>>>> >>> http://head05.hathi.surfsara.nl:8088/proxy/application_1452780322684_10532/
>>>>> >>> Waiting until all TaskManagers have connected
>>>>> >>> 11:09:51,557 INFO org.apache.flink.yarn.ApplicationClient - Notification about new leader address akka.tcp://flink@145.100.41.148:35666/user/jobmanager with session ID null.
>>>>> >>> No status updates from the YARN cluster received so far. Waiting ...
>>>>> >>> 11:09:51,578 INFO org.apache.flink.yarn.ApplicationClient - Received address of new leader akka.tcp://flink@145.100.41.148:35666/user/jobmanager with session ID null.
>>>>> >>> 11:09:51,583 INFO org.apache.flink.yarn.ApplicationClient - Disconnect from JobManager null.
>>>>> >>> 11:09:51,595 INFO org.apache.flink.yarn.ApplicationClient - Trying to register at JobManager akka.tcp://flink@145.100.41.148:35666/user/jobmanager.
>>>>> >>> No status updates from the YARN cluster received so far. Waiting ...
>>>>> >>> No status updates from the YARN cluster received so far. Waiting ...
>>>>> >>>
>>>>> >>> It then hangs on these last steps (trying to register, no status
>>>>> >>> updates..)
>>>>> >>>
>>>>> >>> I'm sure there must be a problem on my side that is causing me not to be
>>>>> >>> able to register at the JobManager. What could cause such connection
>>>>> >>> problems?
>>>>> >>>
>>>>> >>> Any tips are very welcome :-)
>>>>> >>>
>>>>> >>> Cheers and have a good weekend!
>>>>> >>>
>>>>> >>> - Pieter
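
The symptom in the quoted messages (ping to the JobManager host succeeds, while the Akka connection to its port times out) is consistent with a firewall that filters TCP ports rather than a host that is down. A minimal sketch for checking this from the client machine, assuming Python 3 is available there. The function name tcp_port_reachable is hypothetical and not part of Flink; the host and port in the usage note are the ones from the quoted log and will differ on every session start.

```python
import socket


def tcp_port_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    A successful ping only shows that the host answers ICMP; it says nothing
    about whether a specific TCP port is open to the client.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False
```

For example, tcp_port_reachable("145.100.41.148", 35666) could be run while the YARN session is waiting. A False result from the VM together with a True result from a login node inside the cluster network would point at port filtering between the two networks.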