I have fixed constraints to design for: the cluster must continue when one of two nodes fails, but must be prevented from starting with only one node. To implement this in a non-intrusive way, I'm considering implementing a votequorum device, similar to the proposed qdisk, that would add a vote after a 2->1 transition. The quorum calculation in corosync appears to be ignored by pacemaker, which looks only at the votes and expected votes.

If people on the list can give me feedback on this proposal, I'd appreciate it.

Alan
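To make the idea concrete, here is a minimal sketch of the device logic I have in mind (plain Python, not corosync code; the class and method names are hypothetical): grant an extra vote only after a 2->1 membership transition, so the surviving node keeps quorum, while a lone node booting cold never does.

```python
# Illustrative sketch only -- not the corosync votequorum API.
class VoteDevice:
    def __init__(self, expected_votes=2):
        self.expected_votes = expected_votes
        self.extra_vote = 0
        self.prev_members = 0

    def on_membership_change(self, members):
        # Grant the extra vote only when both nodes were seen and one left;
        # a cold start with a single node never earns it.
        if self.prev_members == 2 and members == 1:
            self.extra_vote = 1
        elif members >= 2:
            self.extra_vote = 0
        self.prev_members = members

    def quorate(self, members):
        votes = members + self.extra_vote
        return votes > self.expected_votes // 2  # simple majority test

d = VoteDevice()
d.on_membership_change(1)
print(d.quorate(1))  # False: single-node cold start is denied quorum
d.on_membership_change(2)
d.on_membership_change(1)
print(d.quorate(1))  # True: surviving node keeps quorum after 2->1
```

This captures the asymmetry I want: the same one-node membership is quorate or not depending on whether the cluster previously had two members.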
On Tue, May 11, 2010 at 1:00 AM, Darren Thompson <[email protected]> wrote:

> I'm familiar with other clustering software, and the more traditional
> approach is to have a quorum requirement (to stop the first node started
> from grabbing all the cluster resources) but to implement a nominal (and
> hopefully configurable) timeout, after which the quorum requirement is
> lifted, allowing the cluster resources to run on whatever nodes are
> available at that point.
>
> This seems a reasonable and pragmatic compromise between these mutually
> exclusive requirements, and I don't imagine it would be difficult to code.
>
> Darren
>
> On Tue, 2010-05-11 at 08:10 +0100, Christine Caulfield wrote:
>
> > On 10/05/10 23:22, Alan Jones wrote:
> >
> > > Putting the expected votes to one in both corosync and pacemaker
> > > allows the cluster to start with one node (not what I want).
> >
> > Sorry, but you can't have it both ways. Either the cluster is allowed
> > to run with 1 node or it isn't. There is no rule that says "I want the
> > cluster to run with only one node ONLY if there were previously 2 nodes
> > and one died, but not if they were booted at different times".
> >
> > Though we do accept patches ;-)
> >
> > Chrissie
> >
> > > Unfortunately, it also does not allow the cluster to continue with
> > > 1 node after a failure, because pacemaker remembers the two-node
> > > cluster and increases its expected votes.
> > >
> > > The idea of quorum does not seem to be closely coupled between
> > > corosync and pacemaker. Running with expected votes of two, I halted
> > > a node and then used corosync-quorumtool to set the surviving node's
> > > votes to two. Now corosync says it has quorum and pacemaker says it
> > > does not; i.e. the resources are not able to run.
> > >
> > > To sum up: as far as pacemaker behavior goes, the two_node option
> > > does not seem to do anything. Further, if I plan to do quorum logic
> > > in corosync for the behavior I want, I will also need to explore how
> > > to get pacemaker to use it.
> > > Any comments are welcome.
> > >
> > > Alan
> > >
> > > On Mon, May 10, 2010 at 12:17 AM, Christine Caulfield
> > > <[email protected]> wrote:
> > >
> > > > On 08/05/10 01:02, Alan Jones wrote:
> > > >
> > > > > I'd like to modify the quorum behavior to require 2 nodes to
> > > > > start the cluster but allow it to continue with only 1 node
> > > > > after a failure. It seemed that the two_node option used with
> > > > > the votequorum provider might provide what I'm looking for
> > > > > (corosync.conf section below). However, I'm getting the first
> > > > > behavior (requiring 2 nodes to start) without the second
> > > > > (continuing with only 1 node).
> > > > > Should I provide a votequorum device to add another vote after
> > > > > a failure? Any other ideas?
> > > > > Alan
> > > > > ---
> > > > > quorum {
> > > > >     provider: corosync_votequorum
> > > > >     expected_votes: 2
> > > > >     votes: 1
> > > > >     two_node: 1
> > > > > }
> > > >
> > > > expected_votes should be set to 1 if you're using the two_node
> > > > option. If you set it to 2, then it will always need both nodes
> > > > to be up ... as you've discovered ;-)
> > > >
> > > > Chrissie
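For reference, applying Chrissie's advice above to the corosync.conf section quoted in the thread would give something like this (illustrative; otherwise the same two-node setup is assumed):

```
quorum {
    provider: corosync_votequorum
    expected_votes: 1
    votes: 1
    two_node: 1
}
```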
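Darren's timeout compromise earlier in the thread could be sketched like this (illustrative Python, not corosync code; the class name, grace period, and injectable clock are my own assumptions):

```python
import time

class TimedQuorum:
    """Require majority quorum at startup, but lift the requirement after
    a configurable grace period, as in the compromise Darren describes."""

    def __init__(self, expected_votes, grace_seconds, clock=time.monotonic):
        self.expected_votes = expected_votes
        self.grace_seconds = grace_seconds
        self.clock = clock          # injectable so the example can fake time
        self.started_at = clock()

    def quorate(self, votes):
        if votes > self.expected_votes // 2:
            return True             # ordinary majority quorum
        # Once the grace period expires, the quorum requirement is lifted.
        return self.clock() - self.started_at >= self.grace_seconds

# Simulated clock so the example does not actually wait 60 seconds.
t = [0.0]
q = TimedQuorum(expected_votes=2, grace_seconds=60, clock=lambda: t[0])
print(q.quorate(1))  # False: one vote, still inside the grace period
t[0] = 61.0
print(q.quorate(1))  # True: requirement lifted after the timeout
```

The design choice here is that the timeout only relaxes quorum; a genuine majority is always accepted immediately, so the grace period never delays a healthy cluster.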
_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
