There is always a gap between checking that you are master and performing an action as master, so you can't guarantee that another node hasn't taken mastership before you perform the action. However, if you are using some shared storage for the state of the system, you can block other masters from writing to it while you are master, which prevents split brain from occurring.
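To make the idea of blocking other masters at the storage layer concrete, here is a minimal sketch in Python. It is only an illustration, not how ZooKeeper or any particular storage system implements it: the class FencedStore, the exception StaleMasterError, and the integer "epoch" are all hypothetical names, and the sketch assumes each newly elected master gets a strictly larger epoch than the previous one.

```python
class StaleMasterError(Exception):
    """Raised when a write carries an epoch older than one the store has seen."""


class FencedStore:
    """Toy in-memory stand-in for shared storage that fences out old masters.

    Hypothetical sketch: a real store would have to make the epoch check and
    the write a single atomic step, and would persist highest_epoch.
    """

    def __init__(self):
        self.highest_epoch = 0  # largest master epoch seen so far
        self.state = {}

    def write(self, epoch, key, value):
        # Reject writers that have been superseded by a newer master.
        if epoch < self.highest_epoch:
            raise StaleMasterError(
                f"epoch {epoch} is older than {self.highest_epoch}"
            )
        # Remember the newest epoch so that older masters are fenced from now on.
        self.highest_epoch = epoch
        self.state[key] = value


store = FencedStore()
store.write(1, "owner", "B")      # old master (in partition B) writes
store.write(2, "owner", "A")      # new master (in partition A) takes over

rejected = False
try:
    store.write(1, "owner", "B")  # old master still believes it is master
except StaleMasterError:
    rejected = True               # the store refuses the stale write
```

The point is that the old master does not need to notice it has lost mastership in time. Its late write is rejected by the storage itself, so the check-then-act gap no longer matters for the shared state.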
The simplest way to do this is to store all your state in ZooKeeper. With this, if B is partitioned away, it will not be able to update the state. This won't scale too far, as ZooKeeper holds everything in memory, but it may work if the system isn't too big.

Another solution is to use BookKeeper (http://zookeeper.apache.org/bookkeeper), or another shared storage with fencing, to sequence the updates to your state. BookKeeper is a distributed write-ahead log, which writes entries to a quorum before responding to the client. It has a fencing mechanism which sends a 'fence' message to at least one node in each quorum, blocking all further writes to that log. In your case, if a node in B is master and is putting all state updates into a BookKeeper log before applying them, and then there is a partition and a node in A becomes master, A will fence B's log before applying any state updates of its own.

Yet another solution, though I don't know how well it would work, is to use locks in NFS. If B is logging to a file on a SAN, it gets an exclusive lock on the file handle. This will block anyone else from logging to it until B's NFS session goes away. I'm not sure how long it takes for sessions to time out, though, or how widely implemented or reliable this part of the NFS spec is.

Hope this helps,
Ivan

On Tue, Nov 26, 2013 at 12:54:44AM -0800, ms209495 wrote:
> Hi,
>
> ZooKeeper is an excellent system but the problem with ensuring only one
> master among clients bothers me.
>
> Let's have a look at the situation when a network partition happens: there is
> part A (majority), and part B (minority).
> Let's assume that before the network partition happened the master was connected
> to part B.
> After the network partition, part A will elect a new ZooKeeper leader, and
> there will be a new master elected among clients connected to part A.
> At this time there are two masters - old in part B, and new in part A.
> The only solution I can think about to this problem, is to ensure that the
> new master is inactive for some time - to ensure that the old master in this
> time will detect that it is not connected to ZooKeeper quorum, and will
> deactivate itself as a master.
> This solution assumes that timers on these machines work correctly.
> Is it possible to ensure only one master using ZooKeeper without timing
> assumptions ?
>
> Thanks,
> Maciej
>
> --
> View this message in context:
> http://zookeeper-user.578899.n2.nabble.com/Ensure-there-is-one-master-tp7579367.html
> Sent from the zookeeper-user mailing list archive at Nabble.com.
