On Wed, Aug 30, 2023 at 2:34 PM David Dolan <daithido...@gmail.com> wrote:
> Hi All,
>
> I'm running Pacemaker on CentOS 7:
>
> Name        : pcs
> Version     : 0.9.169
> Release     : 3.el7.centos.3
> Architecture: x86_64

Besides the pcs version, the versions of the other cluster stack
components (pacemaker, corosync) would be interesting.

> I'm performing some cluster failover tests in a 3-node cluster. We
> have 3 resources in the cluster.
> I was trying to see if I could get it working if 2 nodes fail at
> different times. I'd like the 3 resources to then run on one node.
>
> The quorum options I've configured are as follows:
>
> [root@node1 ~]# pcs quorum config
> Options:
>   auto_tie_breaker: 1
>   last_man_standing: 1
>   last_man_standing_window: 10000
>   wait_for_all: 1

I'm not sure the combination of auto_tie_breaker and last_man_standing
makes sense. And as you have a cluster with an odd number of nodes,
auto_tie_breaker should be disabled anyway, I guess.

> [root@node1 ~]# pcs quorum status
> Quorum information
> ------------------
> Date:             Wed Aug 30 11:20:04 2023
> Quorum provider:  corosync_votequorum
> Nodes:            3
> Node ID:          1
> Ring ID:          1/1538
> Quorate:          Yes
>
> Votequorum information
> ----------------------
> Expected votes:   3
> Highest expected: 3
> Total votes:      3
> Quorum:           2
> Flags:            Quorate WaitForAll LastManStanding AutoTieBreaker
>
> Membership information
> ----------------------
>     Nodeid      Votes    Qdevice Name
>          1          1         NR node1 (local)
>          2          1         NR node2
>          3          1         NR node3
>
> If I stop the cluster services on nodes 2 and 3, the groups all fail
> over to node 1, since it is the node with the lowest ID.
> But if I stop them on node1 and node2, or node1 and node3, the
> cluster fails.
>
> I tried adding this line to corosync.conf, and I could then bring
> down the services on node1 and node2, or node2 and node3, but if I
> left node2 until last, the cluster failed:
>
> auto_tie_breaker_node: 1 3
>
> This line had the same outcome as using 1 3:
>
> auto_tie_breaker_node: 1 2 3

Giving multiple auto_tie_breaker nodes doesn't make sense to me, and
rather sounds dangerous, if that configuration is possible at all.
Maybe the misbehavior of last_man_standing is due to this (possibly
unrecognized) misconfiguration. Did you wait long enough between
letting the 2 nodes fail?

Klaus

> So I'd like it to fail over when any combination of two nodes fails,
> but I've only had success when the middle node isn't last.
>
> Thanks
> David
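For reference, the component versions asked about above can be read
off with stock commands on CentOS 7 (nothing here is specific to this
cluster):

    # packaged versions of the whole stack
    rpm -q pcs pacemaker corosync

    # or straight from the daemons
    corosync -v
    pacemakerd --version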
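The `pcs quorum config` output quoted above maps onto the quorum
section of /etc/corosync/corosync.conf roughly as follows. This is a
sketch annotated from the votequorum(5) man page, with the values
exactly as shown in the thread:

    quorum {
        provider: corosync_votequorum
        # on an even split, the partition holding the tie-breaker node
        # keeps quorum; the default tie breaker is the lowest node ID,
        # and auto_tie_breaker_node can override that choice
        auto_tie_breaker: 1
        # recalculate expected_votes downwards as nodes leave...
        last_man_standing: 1
        # ...but only after the remaining membership has been quorate
        # and stable for this many milliseconds (10 s here)
        last_man_standing_window: 10000
        # on a cold start, grant quorum only once all nodes
        # have been seen at least once
        wait_for_all: 1
    }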
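As for the timing question: with last_man_standing_window at 10000 ms,
each node loss has to be followed by at least 10 seconds of stable,
quorate membership before expected_votes is recalculated; back-to-back
failures leave expected_votes at 3 and the last node loses quorum. A
test run that respects the window might look like this (a sketch; the
node names are taken from the membership output above, and the sleep
merely has to exceed the window):

    # first failure: 3 nodes -> 2
    pcs cluster stop node3
    sleep 15                  # > last_man_standing_window (10 s)
    corosync-quorumtool -s    # "Expected votes" should now read 2

    # second failure: 2 nodes -> 1 (run the check on the survivor)
    pcs cluster stop node2
    corosync-quorumtool -s

If I read votequorum(5) correctly, it also notes that stepping down
from 2 nodes to 1 requires auto_tie_breaker to be enabled, which
would explain why the identity of the last surviving node matters in
the tests described above.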
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/