Currently, the Mesos nodes should all be on the same network; don't expose the public IP.

2015-10-16 5:59 GMT+08:00 Ahmet Emre Aladağ <[email protected]>:
> I figured out the reason.
>
> I had configured mesos with internal IPs. I see that Zookeeper broadcasts
> external network IP to the masters/slaves. There was a firewall issue with
> the public IP. So that's the problem.
>
> Is it the correct way for Zookeeper to broadcast the public IPs? That's
> understandable for cases where we extend the cluster out of the network.
>
> On Thu, Oct 15, 2015 at 6:49 PM, Ahmet Emre Aladağ <[email protected]>
> wrote:
>
>> Hi all,
>>
>> I'm trying to build a mesos cluster with mesosphere 0.25.
>>
>> When I run 3 mesos-master node with QUORUM=2, one is elected as the
>> leader, 1 minute later the leader gives the error messages below, then
>> restarts. Upon restart, they make another election. They keep electing one
>> another in a loop, consistently failing, restarting and re-electing. If I
>> set QUORUM=1, leader becomes stable. But slaves can't connect masters. What
>> could be the reason for this connection problem?
>>
>> Marathon console thinks node 1 is the leader although mesos panel shows
>> node 3 is the leader.
>>
>> I also tried running slaves on the same nodes as masters but they
>> encountered the same error and slaves are not recognized by the masters.
>>
>>
>> Thanks,
>>
>> MASTER ERRORS:
>>
>> E1015 11:50:35.539562 19150 socket.hpp:174] Shutdown failed on fd=25:
>> Transport endpoint is not connected [107]
>>
>> E1015 11:50:35.539897 19150 socket.hpp:174] Shutdown failed on fd=24:
>> Transport endpoint is not connected [107]
>>
>>
>> SLAVE ERRORS:
>>
>> E1015 15:17:53.232672 25191 socket.hpp:174] Shutdown failed on fd=10:
>> Transport endpoint is not connected [107]
>>
>> E1015 15:18:01.424705 25191 socket.hpp:174] Shutdown failed on fd=11:
>> Transport endpoint is not connected [107]
>>
>> E1015 15:19:09.392596 25191 socket.hpp:174] Shutdown failed on fd=12:
>> Transport endpoint is not connected [107]
>>
>> W1015 15:19:09.392750 25185 slave.cpp:3187] Master disconnected! Waiting
>> for a new master to be elected
>>
>> E1015 15:21:21.104575 25191 socket.hpp:174] Shutdown failed on fd=10:
>> Transport endpoint is not connected [107]
>>
>> E1015 15:23:31.664559 25191 socket.hpp:174] Shutdown failed on fd=10:
>> Transport endpoint is not connected [107]
>>
>>
>

--
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com

