On Mon, Feb 09, 2015 at 04:41:19PM +0100, Lars Ellenberg wrote:
> On Fri, Feb 06, 2015 at 04:15:44PM +0100, Dejan Muhamedagic wrote:
> > Hi,
> >
> > On Thu, Feb 05, 2015 at 09:18:50AM +0100, Digimer wrote:
> > > That is the problem that makes geo-clustering very hard to nearly
> > > impossible. You can look at the Booth option for pacemaker, but that
> > > requires two (or more) full clusters, plus an arbitrator 3rd
> >
> > A full cluster can consist of one node only. Hence, it is
> > possible to have a kind of stretch two-node [multi-site] cluster
> > based on tickets and managed by booth.
>
> In theory.
>
> In practice, we rely on "proper behaviour" of "the other site",
> in case a ticket is revoked, or cannot be renewed.
>
> Relying on a single node for "proper behaviour" does not inspire
> as much confidence as relying on a multi-node HA-cluster at each site,
> which we can expect to ensure internal fencing.
>
> With reliable hardware watchdogs, it still should be ok to do
> "stretched two node HA clusters" in a reliable way.
>
> Be generous with timeouts.

As always.

> And document which failure modes you expect to handle,
> and how to deal with the worst-case scenarios if you end up with some
> failure case that you are not equipped to handle properly.
>
> There are deployments which favor
> "rather online with _potential_ split brain" over
> "rather offline just in case".

There's an arbitrator which should help in case of split brain.

> Document this, print it out on paper,
>
> "I am aware that this may lead to lost transactions,
> data divergence, data corruption, or data loss.
> I am personally willing to take the blame,
> and live with the consequences."
>
> Have some "boss" sign that ^^^
> in the real world using a real pen.

Well, of course running such a "stretch" cluster would be rather
different from a "normal" one.

The essential thing is that there's no fencing, unless configured
as a dead-man switch for the ticket. Given that booth has a
"sanity" program hook, maybe that could be utilized to verify if
this side of the cluster is healthy enough.

Thanks,

Dejan

> Lars
>
> --
> : Lars Ellenberg
> : http://www.LINBIT.com | Your Way to High Availability
> : DRBD, Linux-HA and Pacemaker support and consulting
>
> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org