Hi, thank you for your reply. As you mentioned, the issue was Storm binding to the wrong interface in VirtualBox while I was testing the automated cluster setup. After setting up on the bare-metal testing cluster, everything works correctly.
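For anyone who hits the same symptom, here is a minimal sketch of how one might check whether a node's hostname resolves to a loopback address (such as the 127.0.1.1 seen in the master log) rather than a routable one. The class name BindCheck and the helper isLoopback are made up for illustration; only the standard java.net.InetAddress calls are real, and this is not code from the Storm or Mesos projects.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration; not part of Storm or Mesos.
public class BindCheck {

    // Returns true if the given literal IP lies in the loopback range 127.0.0.0/8,
    // which covers both 127.0.0.1 and 127.0.1.1.
    static boolean isLoopback(String ip) {
        try {
            return InetAddress.getByName(ip).isLoopbackAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a resolvable address: " + ip, e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Resolve the local hostname the same way most JVM code would.
        InetAddress local = InetAddress.getLocalHost();
        System.out.println(local.getHostName() + " -> " + local.getHostAddress());

        if (local.isLoopbackAddress()) {
            System.out.println("Hostname resolves to loopback; a remote master cannot reach this address.");
        } else {
            System.out.println("Hostname resolves to a non-loopback address.");
        }
    }
}
```

If the check reports loopback, the usual suspects would be the hostname entry in /etc/hosts or the address that libprocess advertises (it can be set through the LIBPROCESS_IP environment variable). Treat both as things to verify on your own setup, not as a confirmed fix.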
Thanks

2015-02-06 18:41 GMT+01:00 Tomas Barton <[email protected]>:

> Hi,
>
> sorry for late reply. I found the message accidentally in spam.
>
> It seems like Storm is binding to localhost 127.0.1.1:52310
> <http://[email protected]:52310/>
> instead
> of using public interface.
>
> Regards,
> Tomas
>
>
> On 19 January 2015 at 14:04, Ondrej Smola <[email protected]> wrote:
>
>> Hi,
>>
>> we have Mesos cluster installation - 3 masters (0.21.0), ZK (3.4.5)
>> running Mesos, Spark, Chronos, Marathon and Storm 0.9.3. All nodes running
>> Ubuntu 14.04.
>>
>> My problem is that i have to start MesosNimbus on currently elected
>> leader, otherwise MesosNimbus get stuck. From log i see it detects
>> currently leading master correctly but then get stuck. When leader changes
>> to node running nimbus it works again.
>>
>> nimbus upstrart.log
>>
>> I0119 12:20:03.289799 10728 detector.cpp:433] A new leading master (UPID=
>> [email protected]:5050) is detected
>> I0119 12:20:03.290081 10733 sched.cpp:234] New master detected at
>> [email protected]:5050
>> I0119 12:20:03.290592 10733 sched.cpp:242] No credentials provided.
>> Attempting to register without authentication
>>
>> nimbus.log
>>
>> 2015-01-19T12:15:40.478+0100 o.m.log [DEBUG] started Server@20e1ceb3
>> 2015-01-19T12:15:40.478+0100 s.m.MesosNimbus [INFO] Started serving
>> config dir under http://192.168.56.10:49202/conf
>> 2015-01-19T12:15:40.535+0100 s.m.MesosNimbus [INFO] Waiting for scheduler
>> to initialize...
>>
>> On leading mesos i see following log (repeated every second)
>>
>> mesos.log
>>
>> I0119 12:40:53.208027 4957 master.cpp:1520] Received re-registration
>> request from framework 20150119-114412-171485376-5050-6660-0002 (Storm
>> 0.9.3) at [email protected]:52310
>> I0119 12:40:53.208860 4957 master.cpp:1573] Re-registering framework
>> 20150119-114412-171485376-5050-6660-0002 (Storm 0.9.3) at
>> [email protected]:52310
>> I0119 12:40:53.209205 4957 master.cpp:1602] Framework
>> 20150119-114412-171485376-5050-6660-0002 (Storm 0.9.3) at
>> [email protected]:52310 failed
>> over
>> I0119 12:40:53.211552 4957 hierarchical_allocator_process.hpp:375]
>> Activated framework 20150119-114412-171485376-5050-6660-0002
>> I0119 12:40:53.211932 4959 master.cpp:789] Framework
>> 20150119-114412-171485376-5050-6660-0002 (Storm 0.9.3) at
>> [email protected]:52310
>> disconnected
>> I0119 12:40:53.212004 4959 master.cpp:1752] Disconnecting framework
>> 20150119-114412-171485376-5050-6660-0002 (Storm 0.9.3) at
>> [email protected]:52310
>> I0119 12:40:53.212198 4959 master.cpp:1768] Deactivating framework
>> 20150119-114412-171485376-5050-6660-0002 (Storm 0.9.3) at
>> [email protected]:52310
>> I0119 12:40:53.212446 4959 master.cpp:811] Giving framework
>> 20150119-114412-171485376-5050-6660-0002 (Storm 0.9.3) at
>> [email protected]:52310 1hrs to
>> failover
>> I0119 12:40:53.212550 4959 hierarchical_allocator_process.hpp:405]
>> Deactivated framework 20150119-114412-171485376-5050-6660-0002
>> I0119 12:40:54.209858 4959 master.cpp:1520] Received re-registration
>> request from framework 20150119-114412-171485376-5050-6660-0002 (Storm
>> 0.9.3) at [email protected]:52310
>>
>>
>> Other frameworks works okay and handles leading masters on another node
>> correctly.
>> From breef look at source code it hangs
>>
>>
>> https://github.com/mesos/storm/blob/master/src/storm/mesos/MesosNimbus.java
>> at line 153
>>
>> when trying to acquire semaphore.
>>
>>
>> Thank you for your great job
>>
>> Ondrej Smola
>>
>
>

