On Thu, 2018-08-30 at 17:24 +0200, Cesar Hernandez wrote:
> Hi
>
> I have a two-node corosync+pacemaker which, starting only one node,
> it fences the other node. It's ok as the default behaviour as the
> default "startup-fencing" is set to true.
> But, the other node is rebooted 3 times, and then, the remaining node
> starts resources and doesn't fence the node anymore.
>
> How can I change these 3 times, to, for example, 1 reboot , or more,
> 5? I use a custom fencing script so I'm sure these retries are not
> done by the script but pacemaker, and I also see the reboot
> operations on the logs:
>
> Aug 30 17:22:08 [12978] xxxx1 crmd: notice: te_fence_node:
> Executing reboot fencing operation (81) on xxxx2 (timeout=180000)
> Aug 30 17:22:31 [12978] xxxx1 crmd: notice: te_fence_node:
> Executing reboot fencing operation (87) on xxxx2 (timeout=180000)
> Aug 30 17:22:48 [12978] xxxx1 crmd: notice: te_fence_node:
> Executing reboot fencing operation (89) on xxxx2 (timeout=180000)

Do you mean you have a custom fencing agent configured? If so, check
the return value of each attempt. Pacemaker should request fencing only
once as long as it succeeds (returns 0), but if the agent fails
(returns nonzero or times out), it will retry, even if the reboot
worked in reality.

If instead you mean you have a script that can request fencing (e.g.
via stonith_admin), then check the logs before each attempt to see if
the request was initiated by the cluster (which should show a policy
engine transition for it) or your script.

FYI, corosync 2 has a "two_node" setting that includes "wait_for_all"
-- with that, you don't need to ignore quorum in pacemaker, and the
cluster won't start until both nodes have seen each other at least
once.

> Software versions:
>
> corosync-1.4.8
> crmsh-2.1.5
> libqb-0.17.2
> Pacemaker-1.1.14
> resource-agents-3.9.6
> Reusable-Cluster-Components-glue--glue-1.0.12
>
> Some parameters:
>
> property cib-bootstrap-options: \
>     have-watchdog=false \
>     dc-version=1.1.14-70404b0e5e \
>     cluster-infrastructure="classic openais (with plugin)" \
>     expected-quorum-votes=2 \
>     stonith-enabled=true \
>     no-quorum-policy=ignore \
>     default-resource-stickiness=200 \
>     stonith-timeout=180s \
>     last-lrm-refresh=1534489943
>
>
> Thanks
>
> César Hernández Bañó
-- 
Ken Gaillot <kgail...@redhat.com>
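
P.S. To illustrate the return-value point above, here is a rough sketch
of how a custom agent could map the outcome of a reboot onto its exit
status. It is not your script and not a complete agent: the option
names "action" and "nodename" are assumed to arrive as name=value lines
on stdin, reboot_node() is a hypothetical placeholder for your
site-specific logic, and the other actions a real agent has to answer
(metadata, monitor, and so on) are left out.

```python
import sys


def parse_options(lines):
    """Parse name=value lines (assumed to be how options reach the agent)."""
    opts = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        opts[key.strip()] = value.strip()
    return opts


def reboot_node(node):
    """Hypothetical placeholder: reboot `node`, return True once confirmed."""
    raise NotImplementedError("site-specific reboot logic goes here")


def main(lines, reboot=reboot_node):
    """Return the exit status for a reboot request: 0 only on confirmed success."""
    opts = parse_options(lines)
    action = opts.get("action", "reboot")
    node = opts.get("nodename")
    if action != "reboot" or not node:
        # This sketch handles only "reboot"; anything else is reported as failure.
        return 1
    try:
        ok = reboot(node)
    except Exception as exc:
        print(f"fencing {node} failed: {exc}", file=sys.stderr)
        return 1
    # A nonzero status here is what would make the cluster try again,
    # even if the node did in fact go down.
    return 0 if ok else 1


# A real agent script would end with: sys.exit(main(sys.stdin))
```

If your script can exit nonzero (or run past the timeout) after the
node has already been reset, that would be consistent with the repeated
reboot operations in your log.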