See also http://mesos.apache.org/documentation/latest/operational-guide/

On Wed, Apr 13, 2016 at 2:44 PM, Adam Bordelon <[email protected]> wrote:

> Having 2 masters (even with a quorum of 2) is no more useful than having a
> single master, since if one of your 2 masters goes down you lose quorum,
> and your cluster will fail to recover, since it cannot write state changes
> to both masters.
>
> Setting the quorum to 1 for a cluster with 2 masters would expose you to
> potential split-brain problems, in case of a network partition. You would
> then have 2 masters that each think they are the leader, and they would be
> unable to reconcile their differences if the partition ends and they
> reconnect to each other.
>
> It is intended that you will have an odd number of masters (similar to the
> requirement to have an odd number of ZKs), and quorum = ceiling(numMasters
> / 2); so you would have 1 master (quorum=1), 3 masters (quorum=2), or 5
> masters (quorum=3).
>
> On Wed, Apr 13, 2016 at 8:44 AM, haosdent <[email protected]> wrote:
>
>> It sounds like an issue in 0.28. I created a ticket,
>> https://issues.apache.org/jira/browse/MESOS-5207, to continue
>> investigating. @suruchi, if you could attach the logs of the Mesos masters
>> and your ZooKeeper configuration, I think it would be more helpful for
>> the investigation.
>>
>> On Wed, Apr 13, 2016 at 8:24 PM, Stefano Bianchi <[email protected]>
>> wrote:
>>
>>> Thanks for your reply @haosdent.
>>> I destroyed my VM and rebuilt Mesos 0.28 with just one master, and now
>>> it is working.
>>> I will try to add another master, but for the moment, since I don't have
>>> many resources on OpenStack, I need to use that VM as a slave.
>>> However, in the previous configuration the switch between the two masters
>>> was OK; it was just that, roughly 30 seconds after a master became the
>>> leader, that "Failed to connect" message appeared.
>>>
>>> 2016-04-13 13:08 GMT+02:00 haosdent <[email protected]>:
>>>
>>>> Hi @Stefano, could you show conf/zoo.cfg? And how many ZooKeeper nodes
>>>> do you have? Regarding "but after a while again Failed to connec", how
>>>> long is the interval? Is it always a "few seconds"?
>>>>
>>>
>>>
>>
>>
>> --
>> Best Regards,
>> Haosdent Huang
>>
>
>
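
For what it's worth, the quorum arithmetic Adam describes above can be
sketched in a few lines of Python. This is only an illustration, not anything
taken from Mesos itself: the function names `quorum` and `tolerated_failures`
are made up here, and the quorum is computed as a strict majority
(numMasters // 2 + 1), which gives the same value as ceiling(numMasters / 2)
for the odd master counts he recommends.

```python
import math


def quorum(num_masters: int) -> int:
    """Smallest strict majority of num_masters (hypothetical helper)."""
    if num_masters < 1:
        raise ValueError("need at least one master")
    return num_masters // 2 + 1


def tolerated_failures(num_masters: int) -> int:
    """How many masters can go down while a quorum is still reachable."""
    return num_masters - quorum(num_masters)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"masters={n} quorum={quorum(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
    # For odd counts the strict majority equals ceiling(n / 2).
    assert all(quorum(n) == math.ceil(n / 2) for n in (1, 3, 5))
```

With 2 masters the majority is 2, so no master failure can be tolerated,
which is the same as running a single master; going to 3 masters is the first
step that tolerates one failure.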
