Actually, I have already applied most of these settings, except for /etc/mesos-master/ip and /etc/mesos-slave/ip. In /etc/mesos-master/ip, should I write the IP of the master's interface? And in /etc/mesos-slave/ip, should I write the IP of the slave's interface?

Your suggestion seems to be the right one, because when I try to ping machines from one network to the other, some are reachable and others are not. The unreachable ones can sometimes ping, usually right after boot, and after a while they no longer can.

Thanks for your suggestion, I am going to try it.

On 19 Apr 2016 03:12, "Dick Davies" <[email protected]> wrote:
> On our network a lot of the hosts have multiple interfaces, which let
> some asymmetric routing issues creep in that prevented our masters
> replying to slaves, which reminded me of your symptoms.
>
> So we set an IP address in /etc/mesos-slave/ip and
> /etc/mesos-master/ip so that they only listen on one interface,
> and then check connectivity between those IPs.
>
> The Ansible repo we use to build the stack now has a 'signoff'
> playbook to check network connectivity is correct between the
> services it deploys to a new environment.
>
> It won't be much use to you on its own I'm afraid, but
> here's a checklist cribbed from that playbook (ports might be
> different in your setup).
>
> You can SSH to the servers and check reachability between them with
> netcat or telnet.
>
>
> zookeepers:
>
> - need to be able to reach each other on the election port (usually
>   tcp/3888)
>
> masters:
>
> * must be able to reach zookeepers on tcp/2181
> * must be able to reach each other on tcp/5050
> * must be able to reach slaves on tcp/5051
>
> mesos slaves:
>
> - must be able to reach masters on tcp/5050
> - must be able to reach zookeepers on tcp/2181
> - any other connectivity to services your application needs
>   (database, caches, whatever)
>
> I think that's it.
>
> On 18 April 2016 at 20:39, Stefano Bianchi <[email protected]> wrote:
> > Hi Dick Davies
> >
> > Could you please share your solution?
> > How did you set up mesos/Zookeeper to interconnect masters and slaves
> > among networks?
> >
> > Thanks a lot!
> >
> > 2016-04-18 20:56 GMT+02:00 Dick Davies <[email protected]>:
> >>
> >> +1 for that theory, we had some screwy issues when we tried to span
> >> subnets until we set every slave and master
> >> to listen on a specific IP so we could tie down routing correctly.
> >>
> >> Saw very similar symptoms that have been described.
> >>
> >> On 18 April 2016 at 18:35, Alex Rukletsov <[email protected]> wrote:
> >> > I believe it's because slaves are able to connect to the master, but
> >> > the master is not able to connect to the slaves. That's why you see
> >> > them connected for some time and gone afterwards.
> >> >
> >> > On Mon, Apr 18, 2016 at 6:47 PM, Stefano Bianchi <[email protected]>
> >> > wrote:
> >> >>
> >> >> Indeed, I don't know why, but I am not able to reach all the machines
> >> >> from one network to the other; only some machines can interconnect
> >> >> with some others across the networks.
> >> >> On Mesos I see that at a certain time all the slaves are connected,
> >> >> then disconnected, and after a while connected again. It seems like
> >> >> they are only able to connect for a while.
> >> >> However, it is an OpenStack issue, I guess.
> >> >>
> >> >> Does this also happen when master3 is leading? My guess is that you're
> >> >> not allowing incoming connections from master1 and master2 to slave3.
> >> >> Generally, masters should be able to connect to slaves, not just
> >> >> respond to their requests.
> >> >>
> >> >> On 18 Apr 2016 13:17, "Stefano Bianchi" <[email protected]> wrote:
> >> >>>
> >> >>> Hi
> >> >>> On OpenStack I plugged two virtual networks into the same virtual
> >> >>> router so that the hosts on the 2 networks can communicate with
> >> >>> each other.
> >> >>> This is my topology:
> >> >>>
> >> >>> -----------------------internet-----------------------
> >> >>>                           |
> >> >>>                        Router1
> >> >>>                           |
> >> >>> --------------------------------------------------------
> >> >>>         |                                   |
> >> >>>       Net1                                Net2
> >> >>> Master1 Master2                         Master3
> >> >>> Slave1  slave2                          Slave3
> >> >>>
> >> >>> I have set zookeeper with this line:
> >> >>>
> >> >>> zk://Master1_IP:2181,Master2_IP:2181,Master3_IP:2181/mesos
> >> >>>
> >> >>> The 3 masters, even though on 2 separate networks, elect the leader
> >> >>> correctly.
> >> >>> Now I have started the slaves, and at first I see all 3 correctly
> >> >>> registered, but after a while slave 3 disconnects, regardless of
> >> >>> which master is leading.
> >> >>> I looked in the log and I get the message in the subject line.
> >> >>> Can you help me to solve this problem?
> >> >>>
> >> >>>
> >> >>> Thanks to all.
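
For reference, the connectivity checklist quoted above could be scripted instead of running netcat or telnet by hand. What follows is only a minimal sketch in Python, assuming the default ports listed in the checklist (they may differ in your setup). The names check_tcp, ports_to_check and run_checks are made up for this illustration and are not part of Mesos, ZooKeeper or the Ansible playbook mentioned in the thread.

```python
import socket

# Ports as listed in the checklist in this thread; adjust for your setup.
ZK_CLIENT_PORT = 2181
ZK_ELECTION_PORT = 3888
MASTER_PORT = 5050
SLAVE_PORT = 5051


def check_tcp(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def ports_to_check(role):
    """Return (target_role, port) pairs that a machine of this role must reach.

    Hypothetical helper: it only encodes the checklist from the thread.
    """
    table = {
        "zookeeper": [("zookeeper", ZK_ELECTION_PORT)],
        "master": [
            ("zookeeper", ZK_CLIENT_PORT),
            ("master", MASTER_PORT),
            ("slave", SLAVE_PORT),
        ],
        "slave": [
            ("master", MASTER_PORT),
            ("zookeeper", ZK_CLIENT_PORT),
        ],
    }
    return table[role]


def run_checks(role, hosts, timeout=3.0):
    """Check every target this role needs, from the machine running the script.

    hosts maps a role name to a list of IP addresses (placeholders you supply).
    Returns a list of (host, port, reachable) tuples.
    """
    results = []
    for target_role, port in ports_to_check(role):
        for host in hosts.get(target_role, []):
            results.append((host, port, check_tcp(host, port, timeout)))
    return results
```

To use it, run the script on each machine with that machine's role. For example, on a master you would call run_checks("master", hosts), where hosts is a dict such as {"zookeeper": [...], "master": [...], "slave": [...]} filled with the IPs written in /etc/mesos-master/ip and /etc/mesos-slave/ip. A False on tcp/5051 when run from a master would match the "master cannot reach the slave" theory discussed in the thread.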

