Having 2 masters (even with a quorum of 2) is no more useful than having a single master: if one of your 2 masters goes down you lose quorum, and your cluster will fail to recover, because it can no longer write state changes to both masters.
Setting the quorum to 1 for a cluster with 2 masters would expose you to potential split-brain problems in case of a network partition. You would then have 2 masters that each think they are the leader, and they would be unable to reconcile their differences when the partition heals and they reconnect to each other. You are expected to run an odd number of masters (similar to the requirement of an odd number of ZooKeeper nodes), with quorum = ceiling(numMasters / 2); so you would have 1 master (quorum=1), 3 masters (quorum=2), or 5 masters (quorum=3).

On Wed, Apr 13, 2016 at 8:44 AM, haosdent <[email protected]> wrote:

> It sounds like an issue in 0.28. I created a ticket,
> https://issues.apache.org/jira/browse/MESOS-5207, from this to continue
> the investigation. @suruchi, if you could attach the logs of the Mesos
> masters and your ZooKeeper configuration, that would be helpful for
> investigating.
>
> On Wed, Apr 13, 2016 at 8:24 PM, Stefano Bianchi <[email protected]>
> wrote:
>
>> Thanks for your reply @haosdent.
>> I destroyed my VM and rebuilt Mesos 0.28 with just one master, and now
>> it is working.
>> I will try to add another master, but for the moment, since on OpenStack
>> I don't have many resources, I need to use that VM as a slave.
>> However, in the previous configuration the switch between the two masters
>> worked fine; it was just that when a master was leading, after more or
>> less 30 seconds, there was that "Failed to connect" message.
>>
>> 2016-04-13 13:08 GMT+02:00 haosdent <[email protected]>:
>>
>>> Hi, @Stefano, could you show conf/zoo.cfg? And how many ZooKeeper
>>> nodes do you have? And for "but after a while again Failed to connec",
>>> how long is the interval here? Is it always "a few seconds"?
>>>
>>
>>
>
>
> --
> Best Regards,
> Haosdent Huang
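As a footnote, the quorum arithmetic above can be sketched in a few lines. This uses the majority formula floor(N/2) + 1, which agrees with the ceiling(numMasters / 2) rule for the odd ensemble sizes recommended in the thread; the function name is just an illustration, not anything from Mesos itself:

```python
def majority_quorum(num_masters: int) -> int:
    """Smallest number of masters that forms a strict majority."""
    return num_masters // 2 + 1

# Odd ensemble sizes tolerate (num_masters - quorum) master failures:
for n in (1, 3, 5):
    q = majority_quorum(n)
    print(f"{n} masters -> quorum={q}, tolerates {n - q} failure(s)")

# With 2 masters the majority quorum is 2, so losing either master
# loses quorum: no more fault-tolerant than a single master.
assert majority_quorum(2) == 2
```

Running it shows why 2 masters tolerate 0 failures, the same as 1 master, while 3 masters tolerate 1 and 5 masters tolerate 2.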

