On Tue, 2018-04-10 at 07:26 +0000, Stefan Schlösser wrote:
> Hi,
>
> I have a 3 node setup on ubuntu 16.04. Corosync/Pacemaker services
> are not started automatically.
>
> If I put all 3 nodes to offline mode, with 1 node in an „unclean“
> state I get a never ending STONITH.
>
> What happens is that the STONITH causes a reboot of the unclean node.
>
> 1) I would have thought with all nodes in standby no STONITH can
> occur. Why does it?

Standby prevents a node from running resources, but the node still
participates in quorum voting. I suspect *starting* a node in standby
mode would prevent it from using fence devices, but *changing* a node
to standby will have no effect on whether it can fence.

> 2) Why does it keep on killing the unclean node?

Good question. The DC's logs will have the most useful information:
each pengine run should say why fencing is being scheduled.

>
> The only way to stop it is to temporarily disable stonith and bring
> the unclean node back online manually, and then enable it again.
>
> Here is a log extract of node c killing node a:
> Apr 10 09:08:30 [2276] xxx-c stonith-ng: notice: log_operation:
> Operation 'reboot' [2428] (call 5 from crmd.2175) for host 'xxx-a'
> with device 'stonith_a' returned: 0 (OK)
> Apr 10 09:08:30 [2276] xxx-c stonith-ng: notice: remote_op_done:
> Operation reboot of xxx-a by xxx-c for crmd.2175@xxx-b.20531831: OK
> Apr 10 09:08:30 [2275] xxx-c cib: info:
> cib_process_request: Completed cib_modify operation for section
> status: OK (rc=0, origin=xxx-b/crmd/83, version=0.164.37)
> Apr 10 09:08:30 [2275] xxx-c cib: info:
> cib_process_request: Completed cib_delete operation for section
> //node_state[@uname='xxx-a']/lrm: OK (rc=0, origin=xxx-b/crmd/84,
> version=0.164.37)
> Apr 10 09:08:30 [2275] xxx-c cib: info:
> cib_process_request: Completed cib_delete operation for section
> //node_state[@uname='xxx-a']/transient_attributes: OK (rc=0,
> origin=xxx-b/crmd/85, version=0.164.37)
> Apr 10 09:08:30 [2275] xxx-c cib: info:
> cib_process_request: Completed cib_modify operation for section
> status: OK (rc=0, origin=xxx-b/crmd/86, version=0.164.37)
> Apr 10 09:08:30 [2275] xxx-c cib: info:
> cib_process_request: Completed cib_delete operation for section
> //node_state[@uname='xxx-a']/lrm: OK (rc=0, origin=xxx-b/crmd/87,
> version=0.164.37)
> Apr 10 09:08:30 [2275] xxx-c cib: info:
> cib_process_request: Completed cib_delete operation for section
> //node_state[@uname='xxx-a']/transient_attributes: OK (rc=0,
> origin=xxx-b/crmd/88, version=0.164.37)
>
> This then repeats forevermore ...
>
> Thanks for any hints,
>
> cheers,
>
> Stefan

-- 
Ken Gaillot <kgail...@redhat.com>
_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
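
When going through the logs for a fencing loop like the one quoted
above, it can help to first count how often each node is being fenced
and by whom. The following is only a rough sketch in Python: it
assumes the stonith-ng "remote_op_done" lines look exactly like the one
in the extract, and the names FENCE_RE, summarize_fencing and SAMPLE are
made up for illustration. It does not parse the pengine messages that
explain why fencing was scheduled; those still have to be read in the
DC's log.

```python
import re
from collections import Counter

# Assumed format, taken from the quoted log extract:
#   ... remote_op_done: Operation reboot of xxx-a by xxx-c for <client>: OK
FENCE_RE = re.compile(
    r"remote_op_done:\s+Operation (?P<action>\S+) of (?P<target>\S+) "
    r"by (?P<executor>\S+) for (?P<client>\S+): (?P<result>.+)$"
)


def summarize_fencing(lines):
    """Count completed fencing operations per (action, target, executor, result)."""
    counts = Counter()
    for line in lines:
        match = FENCE_RE.search(line)
        if match:
            key = (
                match["action"],
                match["target"],
                match["executor"],
                match["result"].strip(),
            )
            counts[key] += 1
    return counts


# Two lines from the extract, joined back onto single lines.
SAMPLE = [
    "Apr 10 09:08:30 [2276] xxx-c stonith-ng: notice: remote_op_done: "
    "Operation reboot of xxx-a by xxx-c for crmd.2175@xxx-b.20531831: OK",
    "Apr 10 09:08:30 [2275] xxx-c cib: info: cib_process_request: "
    "Completed cib_modify operation for section status: OK "
    "(rc=0, origin=xxx-b/crmd/83, version=0.164.37)",
]

if __name__ == "__main__":
    for (action, target, executor, result), n in summarize_fencing(SAMPLE).items():
        print(f"{n}x {action} of {target} by {executor}: {result}")
```

To run it against a real log, pass an open file object instead of
SAMPLE. A count that keeps growing for the same target would confirm
the loop, and the timestamps of those lines show which pengine runs in
the DC's log to read.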