Ok...so I added hostname=mesosn.x.net to each node's /etc/default/mesos-master,
cleared everything in /var/lib/mesos, cleared /mesos in zk, and started the
mesos services.
The elected master is holding steady, but going to a non-leader master in the
browser still redirects to the internal ip address of the master instead of
mesosn.x.net. Is there a way to force it to use the mesosn.x.net address
instead of the internal ip?
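
One way to check the redirect outside the browser is sketched below. It assumes the master serves a redirect endpoint at /master/redirect on port 5050 (treat that path as an assumption for your Mesos version) and that the example address 10.0.0.5 is a stand-in for the internal ip. The helper names are made up for illustration. It only inspects the Location header to tell whether the leader is advertised by name or by ip.

```python
import http.client
import ipaddress
from urllib.parse import urlparse


def fetch_redirect_location(master_host, port=5050, path="/master/redirect"):
    """Ask a master where it redirects to; returns the raw Location header.

    The endpoint path is an assumption; http.client does not follow
    redirects, so the header can be read directly.
    """
    conn = http.client.HTTPConnection(master_host, port, timeout=5)
    try:
        conn.request("GET", path)
        return conn.getresponse().getheader("Location")
    finally:
        conn.close()


def redirect_host(location):
    """Return the host part of a Location value such as '//10.0.0.5:5050'."""
    return urlparse(location).hostname


def is_ip_literal(host):
    """True if host is a literal IPv4/IPv6 address rather than a DNS name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False
```

For example, is_ip_literal(redirect_host(fetch_redirect_location("mesos1.x.net"))) returning True on a non-leader would confirm that the leader is still being advertised by its internal ip.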

From: [email protected]
To: [email protected]
Subject: RE: Lots of master elections
Date: Sat, 4 Jul 2015 14:28:04 +0100

Deleting /var/lib/mesos/* (which is essentially the replicated log) and the
/mesos node in ZK, then restarting everything, seems to have improved things.
There was a single election, with no re-elections or FATAL logs. However, going
to a non-leader master still redirects to the internal ip. Will see what
happens if I add hostnames back in.
From: [email protected]
To: [email protected]
Subject: RE: Lots of master elections
Date: Sat, 4 Jul 2015 14:04:34 +0100

Hm...will delete everything in /var/lib/mesos (which holds the replicated
logs), and retry. Guess I don't need to delete mesos things under /etc, then.
Will report back. Checking the logs, I see that a master is elected but then
writes this to FATAL:
F0704 12:52:38.078475  5847 master.cpp:1176] Recovery failed: Failed to recover registrar: Failed to perform fetch within 1mins
Then it dies. Guess that's kicking off the new election.
-Ashic.
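
To spot this pattern across several masters, the FATAL entries can be pulled out of the log files with a small script. This is a minimal sketch that assumes the logs use the glog line prefix seen in the message above (severity letter, MMDD, time, thread id, file:line]). The regular expression and function name are illustrative, not part of Mesos.

```python
import re

# glog-style prefix, e.g. "F0704 12:52:38.078475  5847 master.cpp:1176] msg"
GLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<tid>\d+) (?P<src>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def fatal_entries(lines):
    """Return (time, source file, message) for every FATAL log line."""
    entries = []
    for line in lines:
        match = GLOG_RE.match(line.rstrip("\n"))
        # Lines that do not match the prefix (wrapped text, stack traces)
        # are skipped rather than guessed at.
        if match and match.group("sev") == "F":
            entries.append(
                (match.group("time"), match.group("src"), match.group("msg"))
            )
    return entries
```

Passing it an open log file, for example fatal_entries(open(path)), gives one tuple per FATAL line, which makes it easy to compare the timestamps against the times of the re-elections.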

From: [email protected]
To: [email protected]
Subject: RE: Lots of master elections
Date: Sat, 4 Jul 2015 12:47:53 +0000

Depending on your configuration, mesos creates files under /var, in a
directory named mesos. Go into /var and run this on the command line:
find . -name '*mesos*'
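
The same search can be done from a script, which helps when it has to run on every master. This is a rough sketch that mirrors the find command above by matching file and directory names against a shell-style pattern. The function name is made up for illustration, and it only lists matches; it deletes nothing.

```python
import fnmatch
import os


def find_by_name(root, pattern="*mesos*"):
    """List files and directories under root whose name matches pattern."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Check directory names as well as file names, as find does.
        for name in dirnames + filenames:
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Calling find_by_name("/var") should report paths such as /var/lib/mesos if they exist on that node.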

-------- Original message --------
From: Ashic Mahtab <[email protected]>
Date: 04/07/2015 14:34 (GMT+01:00)
To: Apache Mesos <[email protected]>
Subject: RE: Lots of master elections

Thanks for the reply, Niklaos. Extreme noob question...when you say mesos
files, which are you referring to? Would I also need to delete the /mesos value
in Zookeeper?

From: [email protected]
To: [email protected]
Subject: RE: Lots of master elections
Date: Sat, 4 Jul 2015 12:29:44 +0000

You have to clean the mesos files and restart the masters.

-------- Original message --------
From: Ashic Mahtab <[email protected]>
Date: 04/07/2015 14:08 (GMT+01:00)
To: [email protected]
Subject: Lots of master elections

Hello,
Just getting started with Mesos, and in the process of "graduating" from 
Vagrant to a cluster on Azure. Here's what I have:



* 1 Zookeeper node exposing 2181, running as expected.
* 2 Mesos masters - mesos1.x.net, mesos2.x.net. Both exposing 5050. These have 
private and public ips. All nodes are on the same network, and have access to 
each other.



[I'll set up a third master, and add slaves soon.]



It all seems ok, and the web UI works. I can see mesos entries in Zookeeper.
However, I'm seeing a couple of things:



* A node (say, mesos1.x.net) is elected master, and about a minute later
another election is held.
* If the other node wins, in the UI, I get the message that this is no longer 
the master and am redirected.
* Sometimes the redirection is to mesos2.x.net, and all is fine (except another 
election soon). 
* Sometimes the redirection is to the internal ip of mesos2.x.net, which 
obviously gets a 404.



I should add that all the nodes are the lowest powered crappy Azure instances 
you can get. 



Is this constant re-election "normal"? Should I specify hostnames or public ips
in /etc/default/mesos-master? I tried the latter, but the symptoms remained.
Will adding a third master make it work? (I have quorum set to 2).
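
On the quorum question, the arithmetic can be sketched as follows. It assumes the usual rule that the quorum must be a strict majority of the masters. The function names are illustrative only.

```python
def majority_quorum(num_masters):
    """Smallest quorum that is a strict majority of num_masters."""
    return num_masters // 2 + 1


def tolerated_failures(num_masters, quorum):
    """How many masters can be down while a quorum is still reachable."""
    return max(num_masters - quorum, 0)
```

With 2 masters the majority quorum is 2, so no master can be lost without blocking the replicated log. With 3 masters the quorum is still 2 and one master can fail.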



Any help will be greatly appreciated.



Thanks,
Ashic.