OK, please follow me in this strange story. At the beginning I had set up 3 Mesos masters on the same cluster using Mesos 0.27. I then deleted one of these 3 Mesos 0.27 masters and built a Mesos 0.28 master to join the other 2. I get the problem I described: after 10 seconds I get "Failed to connect", but the election of a new leader works fine, because the new leader is 0.27, which is stable. How can I send you the logs?
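Would collecting them like this be enough? This is a minimal sketch assuming a packaged install that starts the master with --log_dir=/var/log/mesos; the paths are my assumption, and with no --log_dir the master only logs to stderr:

    # Bundle the glog files the master writes under --log_dir (path assumed).
    tar czf mesos-master-logs.tar.gz /var/log/mesos/mesos-master.*
    # Also grab the ZooKeeper configuration haosdent asked for (typical path).
    cp /etc/zookeeper/conf/zoo.cfg .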
2016-04-13 23:53 GMT+02:00 Adam Bordelon <[email protected]>:

> See also http://mesos.apache.org/documentation/latest/operational-guide/
>
> On Wed, Apr 13, 2016 at 2:44 PM, Adam Bordelon <[email protected]> wrote:
>
>> Having 2 masters (even with a quorum of 2) is no more useful than having
>> a single master: if one of your 2 masters goes down you lose quorum, and
>> your cluster will fail to recover, since it cannot write state changes to
>> both masters.
>>
>> Setting the quorum to 1 for a cluster with 2 masters would expose you to
>> potential split-brain problems in case of a network partition. You would
>> then have 2 masters that each think they are the leader, and they would
>> be unable to reconcile their differences if the partition ends and they
>> reconnect to each other.
>>
>> It is intended that you have an odd number of masters (similar to the
>> requirement to have an odd number of ZooKeepers), and quorum =
>> ceiling(numMasters / 2); so you would have 1 master (quorum=1), 3 masters
>> (quorum=2), or 5 masters (quorum=3).
>>
>> On Wed, Apr 13, 2016 at 8:44 AM, haosdent <[email protected]> wrote:
>>
>>> It sounds like an issue in 0.28. I created a ticket,
>>> https://issues.apache.org/jira/browse/MESOS-5207, from this to continue
>>> the investigation. @suruchi, if you could attach the logs of your Mesos
>>> masters and your ZooKeeper configuration, I think it would be more
>>> helpful for investigating.
>>>
>>> On Wed, Apr 13, 2016 at 8:24 PM, Stefano Bianchi <[email protected]>
>>> wrote:
>>>
>>>> Thanks for your reply, @haosdent.
>>>> I destroyed my VM and rebuilt Mesos 0.28 with just one master, and now
>>>> it is working.
>>>> I will try to add another master, but for the moment, since I don't
>>>> have many resources on OpenStack, I need to use that VM as a slave.
>>>> However, in the previous configuration the switch between the two
>>>> masters was fine; it was only after the new master had been leading
>>>> for roughly 30 seconds that the "Failed to connect" message appeared.
>>>>
>>>> 2016-04-13 13:08 GMT+02:00 haosdent <[email protected]>:
>>>>
>>>>> Hi, @Stefano. Could you show conf/zoo.cfg? And how many ZooKeeper
>>>>> nodes do you have? And regarding "but after a while again Failed to
>>>>> connect", how long is the interval here? Is it always "a few seconds"?
>>>
>>> --
>>> Best Regards,
>>> Haosdent Huang
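For reference, here is a minimal sketch of the 3-master layout Adam describes; the ZooKeeper hostnames zk1, zk2, zk3 are placeholders, and each master gets quorum = ceiling(3 / 2) = 2:

    # Run this on each of the 3 masters, pointing at the same ZK ensemble
    # (zk1, zk2, zk3 are assumed hostnames, 2181 is the default ZK port).
    mesos-master \
        --zk=zk://zk1:2181,zk2:2181,zk3:2181/mesos \
        --quorum=2 \
        --work_dir=/var/lib/mesos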

