Ah, that was while I was setting things up. I guess if one master goes 
down, I'm temporarily in that scenario, hence I'd expect it to work, 
which it does. I think the issue was that the apt-get install started the 
service before I changed the settings, so the cached information needed 
clearing.

Date: Tue, 7 Jul 2015 10:33:20 +0200
Subject: Re: Lots of master elections
From: [email protected]
To: [email protected]

Got it. I was confused by your first email where you said you have 2 masters.
On Tue, Jul 7, 2015 at 4:40 AM, Ashic Mahtab <[email protected]> wrote:



Sure, Alex.
3 masters. Quorum is 2.
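
The relationship between the number of masters and the quorum can be sketched as a small calculation. This is only an illustration, assuming the usual majority rule (the quorum must be more than half of the masters); the helper names `quorum_size` and `tolerated_failures` are hypothetical and not part of Mesos.

```python
def quorum_size(masters: int) -> int:
    """Smallest majority of `masters` (assumes the majority rule)."""
    if masters < 1:
        raise ValueError("need at least one master")
    return masters // 2 + 1


def tolerated_failures(masters: int, quorum: int) -> int:
    """How many masters can be down while `quorum` can still be reached."""
    return max(masters - quorum, 0)


# Print the quorum and failure tolerance for a few cluster sizes.
for n in (1, 2, 3, 5):
    q = quorum_size(n)
    print(f"{n} masters: quorum {q}, tolerates {tolerated_failures(n, q)} down")
```

With 3 masters and a quorum of 2, one master can be down. With 2 masters and a quorum of 2, none can, which matches the temporary setup described in the first message of the thread.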

Date: Mon, 6 Jul 2015 19:44:28 +0200
Subject: Re: Lots of master elections
From: [email protected]
To: [email protected]

Ashic,
great that you solved the issue. Could you please clarify what HA configuration 
you have: how many masters and what --quorum you use?
On Sat, Jul 4, 2015 at 5:09 PM, Ashic Mahtab <[email protected]> wrote:



Hi Nikolaos,
I'm using an external zk, so didn't need to restart it. 
I might have jumped the gun slightly in the last email. It seems completely 
omitting hostname in /etc/default/mesos-master is fine. Simply having a file 
called hostname in /etc/mesos-master with the desired hostname as the content 
seems to fix it for the web UI redirects. I had the file on the host I was 
setting up scripts with, but forgot to add that step for the others, hence the 
private ip redirects.
So, to summarise, I did the following:
* Stopped mesos masters.
* cleared /var/lib/mesos/*
* for each node, added /etc/mesos-master/hostname with the content of the file 
being the fully qualified hostname.
* deleted the /mesos node in zk (though I don't know if this is necessary).
* restarted each node.
Again, your guidance has helped greatly.
Cheers,
Ashic.
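
The file-level part of the steps above can be sketched as follows. This is a rough sketch, not a tested procedure: it assumes the master is already stopped, the function name `reset_master_state` is hypothetical, and stopping or restarting the service and deleting the /mesos node in zk are deliberately left out because they depend on the installation.

```python
from pathlib import Path
import shutil


def reset_master_state(work_dir: Path, conf_dir: Path, hostname: str) -> None:
    """Clear the master's work directory and write the hostname file.

    Assumes the master process is stopped before this runs.
    """
    # Remove everything inside work_dir but keep the directory itself.
    for entry in work_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    # One file named "hostname" whose content is the fully qualified hostname.
    conf_dir.mkdir(parents=True, exist_ok=True)
    (conf_dir / "hostname").write_text(hostname)


# Example call with the paths from the message (destructive, so left commented out):
# reset_master_state(Path("/var/lib/mesos"), Path("/etc/mesos-master"), "mesos1.x.net")
```

The paths are passed in as parameters so the function can be tried against scratch directories before it is pointed at a real node.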

From: [email protected]
To: [email protected]
Subject: Re: Lots of master elections
Date: Sat, 4 Jul 2015 14:25:27 +0000

Hi,
in my case, in order to make it work, I do the following:
1) stop zookeeper and mesos
2) clean /var/lib/mesos/* and /var/lib/zookeeper/version-X
3) restart zookeeper nodes
4) restart all mesos masters

Nikolaos Ballas | Software Development Manager
Technology Nexus S.a.r.l.
2-4 Rue Eugene Rupert
2453 Luxembourg
Delivery address: 2-3 Rue Eugene Rupert, Vertigo Polaris Building
Tel: + 3522619113580
[email protected] | nexusgroup.com

On 04 Jul 2015, at 15:04, Ashic Mahtab <[email protected]> wrote:




Hm...will delete everything in /var/lib/mesos (which are replicated logs), and 
retry. Guess I don't need to delete mesos things under /etc, then. Will report 
back. Checking the logs, I see that a master is elected but then writes this to 
FATAL:



F0704 12:52:38.078475  5847 master.cpp:1176] Recovery failed: Failed to recover registrar: Failed to perform fetch within 1mins



Then dies. Guess that's kicking off the new election.



-Ashic.





From: [email protected]
To: [email protected]
Subject: RE: Lots of master elections
Date: Sat, 4 Jul 2015 12:47:53 +0000



Depending on your configuration, mesos creates files under /var/, in a 
directory named mesos. Go into /var and run this on the command line: 
find . -name '*mesos*'

-------- Original message --------
From: Ashic Mahtab <[email protected]>
Date: 04/07/2015 14:34 (GMT+01:00)
To: Apache Mesos <[email protected]>
Subject: RE: Lots of master elections




Thanks for the reply, Nikolaos. Extreme noob question...when you say mesos files, 
which are you referring to? Would I also need to delete the /mesos value in 
Zookeeper?





From: [email protected]
To: [email protected]
Subject: RE: Lots of master elections
Date: Sat, 4 Jul 2015 12:29:44 +0000



You have to clean the mesos files and restart the masters.

-------- Original message --------
From: Ashic Mahtab <[email protected]>
Date: 04/07/2015 14:08 (GMT+01:00)
To: [email protected]
Subject: Lots of master elections




Hello,
Just getting started with Mesos, and in the process of "graduating" from 
Vagrant to a cluster on Azure. Here's what I have:



* 1 Zookeeper node exposing 2181, running as expected.
* 2 Mesos masters - mesos1.x.net, mesos2.x.net. Both exposing 5050. These have 
private and public ips. All nodes are on the same network, and have access to 
each other.



[I'll set up a third master, and add slaves soon.]



It all seems ok, and the web UI works. I can see mesos entries in Zookeeper. 
However, I'm seeing a couple of things:

* A node (say, mesos1.x.net) is elected master, and about a minute later 
another election is held.
* If the other node wins, in the UI, I get the message that this is no longer 
the master and am redirected.
* Sometimes the redirection is to mesos2.x.net, and all is fine (except another 
election soon).
* Sometimes the redirection is to the internal ip of mesos2.x.net, which 
obviously gets a 404.



I should add that all the nodes are the lowest powered crappy Azure instances 
you can get. 



Is this constant re-election "normal"? Should I specify hostnames or public ips 
in /etc/default/mesos-master? I tried the latter, but the symptoms remained. 
Will adding a third master make it work? (I have quorum set to 2).



Any help will be greatly appreciated.



Thanks,
Ashic.