Thanks Klaus/Andrei,

So if I understand correctly, what I'm trying probably shouldn't work, and I should instead try setting auto_tie_breaker in corosync and removing last_man_standing. Then I should set up another server with qdevice and configure that to use the LMS algorithm.
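For reference, a rough sketch of what those two steps might look like in the quorum section of corosync.conf. This is untested and only illustrates the options named above: the host name qnetd.example.com is a placeholder for the extra server, and the two variants are shown separately because votequorum(5) and corosync-qdevice(8) should be checked for which options may be combined.

```
# Variant 1: auto_tie_breaker in place of last_man_standing
quorum {
    provider: corosync_votequorum
    # last_man_standing and last_man_standing_window removed
    auto_tie_breaker: 1
    # "lowest" keeps the partition containing the lowest node ID
    auto_tie_breaker_node: lowest
}

# Variant 2: quorum device on a separate server, using the LMS algorithm
quorum {
    provider: corosync_votequorum
    device {
        model: net
        net {
            # placeholder host running corosync-qnetd
            host: qnetd.example.com
            algorithm: lms
        }
    }
}
```

After a change like this, corosync-quorumtool -s on each node should show the flags and vote counts that are actually in effect.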
Thanks
David

On Mon, 4 Sept 2023 at 13:32, Klaus Wenninger <kwenn...@redhat.com> wrote:

>
>
> On Mon, Sep 4, 2023 at 1:50 PM Andrei Borzenkov <arvidj...@gmail.com>
> wrote:
>
>> On Mon, Sep 4, 2023 at 2:18 PM Klaus Wenninger <kwenn...@redhat.com>
>> wrote:
>> >
>> >
>> >
>> > On Mon, Sep 4, 2023 at 12:45 PM David Dolan <daithido...@gmail.com>
>> wrote:
>> >>
>> >> Hi Klaus,
>> >>
>> >> With default quorum options I've performed the following on my 3 node
>> >> cluster
>> >>
>> >> Bring down cluster services on one node - the running services migrate
>> >> to another node
>> >> Wait 3 minutes
>> >> Bring down cluster services on one of the two remaining nodes - the
>> >> surviving node in the cluster is then fenced
>> >>
>> >> Instead of the surviving node being fenced, I hoped that the services
>> >> would migrate and run on that remaining node.
>> >>
>> >> Just looking for confirmation that my understanding is ok and if I'm
>> >> missing something?
>> >
>> >
>> > As said I've never used it ...
>> > Well when down to 2 nodes LMS per definition is getting into trouble as
>> > after another outage any of them is gonna be alone. In case of an
>> > ordered shutdown this could possibly be circumvented though. So I guess
>> > your first attempt to enable auto-tie-breaker was the right idea. Like
>> > this you will have further service at least on one of the nodes.
>> > So I guess what you were seeing is the right - and unfortunately only
>> > possible - behavior.
>>
>> I still do not see where fencing comes from. Pacemaker requests
>> fencing of the missing nodes. It also may request self-fencing, but
>> not in the default settings. It is rather hard to tell what happens
>> without logs from the last remaining node.
>>
>> That said, the default action is to stop all resources, so the end
>> result is not very different :)
>>
>
> But you are of course right. The expected behaviour would be that
> the leftover node stops the resources.
> But maybe we're missing something here. Hard to tell without
> the exact configuration including fencing.
> Again, as already said, I don't know anything about the LMS
> implementation with corosync. In theory there were both arguments
> to either suicide (but that would have to be done by pacemaker) or
> to automatically switch to some 2-node-mode once the remaining
> partition is reduced to just 2 followed by a fence-race (when done
> without the precautions otherwise used for 2-node-clusters).
> But I guess in this case it is none of those 2.
>
> Klaus
>
>> _______________________________________________
>> Manage your subscription:
>> https://lists.clusterlabs.org/mailman/listinfo/users
>>
>> ClusterLabs home: https://www.clusterlabs.org/
>>
>