However, I forgot to mention that Docker is running on these machines: on some machines Docker is running a service, on others it is not. I saw the docker interface when typing ifconfig. I guess this is what you mean, Dick Davies?

On 19 Apr 2016 09:22, "Stefano Bianchi" <[email protected]> wrote:
> Actually, I have already done the majority of these settings, apart from
> /etc/mesos-master/ip: should I write the IP of the master interface there?
> And /etc/mesos-slave/ip: should I write the IP of the slave interface there?
> Your suggestion seems the right one, because if I try to ping machines
> from one network to the other, some are reachable and others are not, and
> the latter are sometimes able to ping, commonly at boot, and after a while
> are not.
> Thanks for your suggestion, I am going to try it.
> On 19 Apr 2016 03:12, "Dick Davies" <[email protected]> wrote:
>
>> On our network a lot of the hosts have multiple interfaces, which let
>> some asymmetric routing issues creep in that prevented our masters
>> replying to slaves, which reminded me of your symptoms.
>>
>> So we set an IP address in /etc/mesos-slave/ip and /etc/mesos-master/ip
>> so that they only listen on one interface, and then check connectivity
>> between those IPs.
>>
>> The Ansible repo we use to build the stack now has a 'signoff' playbook
>> to check network connectivity is correct between the services it deploys
>> to a new environment.
>>
>> It won't be much use to you on its own I'm afraid, but here's a checklist
>> cribbed from that playbook (ports might be different in your setup).
>>
>> You can SSH to the servers and check reachability between them with
>> netcat or telnet.
>>
>>
>> zookeepers:
>>
>> - need to be able to reach each other on the election port (usually
>>   tcp/3888)
>>
>> masters:
>>
>> * must be able to reach zookeepers on tcp/2181
>> * must be able to reach each other on tcp/5050
>> * must be able to reach slaves on tcp/5051
>>
>> mesos slaves:
>>
>> - must be able to reach masters on tcp/5050
>> - must be able to reach zookeepers on tcp/2181
>> - any other connectivity to services your application needs
>>   (database, caches, whatever)
>>
>> I think that's it.
>>
>> On 18 April 2016 at 20:39, Stefano Bianchi <[email protected]> wrote:
>> > Hi Dick Davies
>> >
>> > Could you please share your solution?
>> > How did you set up mesos/Zookeeper to interconnect masters and slaves
>> > among networks?
>> >
>> > Thanks a lot!
>> >
>> > 2016-04-18 20:56 GMT+02:00 Dick Davies <[email protected]>:
>> >>
>> >> +1 for that theory, we had some screwy issues when we tried to span
>> >> subnets until we set every slave and master to listen on a specific
>> >> IP so we could tie down routing correctly.
>> >>
>> >> Saw very similar symptoms to those that have been described.
>> >>
>> >> On 18 April 2016 at 18:35, Alex Rukletsov <[email protected]> wrote:
>> >> > I believe it's because slaves are able to connect to the master, but
>> >> > the master is not able to connect to the slaves. That's why you see
>> >> > them connected for some time and gone afterwards.
>> >> >
>> >> > On Mon, Apr 18, 2016 at 6:47 PM, Stefano Bianchi <[email protected]>
>> >> > wrote:
>> >> >>
>> >> >> Indeed, I don't know why, but I am not able to reach all the machines
>> >> >> from one network to the other; only some machines can connect to some
>> >> >> others across the networks.
>> >> >> On mesos I see that at a certain time all the slaves are connected,
>> >> >> then disconnected, and after a while connected again; it seems like
>> >> >> they are able to connect only for a while.
>> >> >> However, it is an OpenStack issue, I guess.
>> >> >>
>> >> >> Does this also happen when master3 is leading? My guess is that you're
>> >> >> not allowing incoming connections from master1 and master2 to slave3.
>> >> >> Generally, masters should be able to connect to slaves, not just
>> >> >> respond to their requests.
>> >> >>
>> >> >> On 18 Apr 2016 13:17, "Stefano Bianchi" <[email protected]> wrote:
>> >> >>>
>> >> >>> Hi
>> >> >>> On OpenStack I plugged two virtual networks into the same virtual
>> >> >>> router so that the hosts on the 2 networks can communicate with each
>> >> >>> other.
>> >> >>> This is my topology:
>> >> >>>
>> >> >>> -----------------------internet-----------------------
>> >> >>>                            |
>> >> >>>                         Router1
>> >> >>>                            |
>> >> >>> --------------------------------------------------------
>> >> >>>          |                                  |
>> >> >>>         Net1                               Net2
>> >> >>>   Master1 Master2                        Master3
>> >> >>>   Slave1 slave2                          Slave3
>> >> >>>
>> >> >>> I have set zookeeper with this line:
>> >> >>>
>> >> >>> zk://Master1_IP:2181,Master2_IP:2181,Master3_IP:2181/mesos
>> >> >>>
>> >> >>> The 3 masters, even though they are on 2 separate networks, elect the
>> >> >>> leader correctly.
>> >> >>> Now I have started the slaves, and at first I see all 3 correctly
>> >> >>> registered, but after a while slave 3 disconnects, regardless of
>> >> >>> which master is leading.
>> >> >>> I looked in the log and I get the message in the subject line.
>> >> >>> Can you help me to solve this problem?
>> >> >>>
>> >> >>>
>> >> >>> Thanks to all.
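
For reference, the connectivity checklist quoted above (the one Dick suggests checking by hand with netcat or telnet) could be scripted roughly as follows. This is only a sketch under assumptions: the role names, the CHECKS table and the helper functions are made up for illustration and are not part of Mesos or of the Ansible playbook mentioned in the thread. The ports are the ones from the checklist and may differ in your setup.

```python
import socket

# Ports copied from the checklist in the thread; adjust them for your setup.
# CHECKS[role][peer_role] lists the TCP ports that a node with `role`
# must be able to reach on every node with `peer_role`.
CHECKS = {
    "zookeeper": {"zookeeper": [3888]},
    "master": {"zookeeper": [2181], "master": [5050], "slave": [5051]},
    "slave": {"master": [5050], "zookeeper": [2181]},
}


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def targets_for(role, inventory):
    """List the (host, port) pairs a node with the given role must reach."""
    pairs = []
    for peer_role, ports in CHECKS[role].items():
        for host in inventory.get(peer_role, []):
            for port in ports:
                pairs.append((host, port))
    return pairs


def run_checks(role, inventory, probe=can_connect):
    """Probe every target for this role and return {(host, port): reachable}."""
    return {(host, port): probe(host, port)
            for host, port in targets_for(role, inventory)}
```

To use it, run it on each node with that node's role and an inventory of the IPs the services listen on (for example the ones written to /etc/mesos-master/ip and /etc/mesos-slave/ip), e.g. run_checks("slave", {"master": [...], "zookeeper": [...]}), and look for entries that come back False. The list for a master includes the master itself, which is harmless. Since the symptom in this thread is one-directional (slaves reach masters, masters do not reach slaves), it needs to be run from the masters as well as from the slaves.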

