On 11/30/2017 01:41 PM, Ulrich Windl wrote:
> >>>> "Gao,Yan" <y...@suse.com> wrote on 30.11.2017 at 11:48 in message
> <e71afccc-06e3-97dd-c66a-1b4bac550...@suse.com>:
>> On 11/22/2017 08:01 PM, Andrei Borzenkov wrote:
>>> SLES12 SP2 with pacemaker 1.1.15-21.1-e174ec8; two-node cluster of
>>> VMs on vSphere using a shared VMDK as SBD. During basic tests by
>>> killing corosync and forcing STONITH, pacemaker was not started after
>>> reboot. In the logs I see during boot:
>>>
>>> Nov 22 16:04:56 sapprod01s crmd[3151]: crit: We were allegedly
>>> just fenced by sapprod01p for sapprod01p
>>> Nov 22 16:04:56 sapprod01s pacemakerd[3137]: warning: The crmd
>>> process (3151) can no longer be respawned,
>>> Nov 22 16:04:56 sapprod01s pacemakerd[3137]: notice: Shutting down
>>> Pacemaker
>>>
>>> SBD timeouts are 60s for watchdog and 120s for msgwait. It seems that
>>> stonith with SBD always takes msgwait (at least, visually the host is
>>> not declared OFFLINE until 120s have passed). But the VM reboots
>>> lightning fast and is up and running long before the timeout expires.
>
> As msgwait was intended for the message to arrive, and not for the reboot
> time (I guess), this just shows a fundamental problem in the SBD design:
> receipt of the fencing command is not confirmed (other than by seeing the
> consequences of its execution).

The 2 x msgwait is not for confirmations but for writing the poison pill
and for having it read by the target side. Thus it is assumed that within
a single msgwait the data is written and confirmed. And if the target side
doesn't manage to do the read within that time, it will suicide via the
watchdog. Thus a working watchdog is a fundamental precondition for sbd
to work properly, and storage solutions that do caching, replication and
the like without proper syncing are just not suitable for sbd.

Regards,
Klaus
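To make the timeout mechanics above concrete, here is a minimal shell
sketch of how the two timeouts from the report (60s watchdog, 120s
msgwait) are written to and read back from an SBD device; the device
path is illustrative, not taken from this thread:

    # initialize the SBD device; -1 sets the watchdog timeout,
    # -4 sets msgwait (conventionally at least twice the watchdog timeout)
    sbd -d /dev/disk/by-id/scsi-EXAMPLE -1 60 -4 120 create

    # read the on-disk header back to verify what the cluster will use
    sbd -d /dev/disk/by-id/scsi-EXAMPLE dump

Pacemaker's stonith-timeout must then exceed msgwait, since a poison-pill
fencing action cannot be considered complete any earlier.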
>
> So the fencing node will see the other host is down (on the network), but
> it won't believe it until SBD msgwait is over. OTOH if your msgwait is
> very low, and the storage has a problem (exceeding msgwait), the node will
> assume a successful fencing when in fact it didn't complete.
>
> So maybe there should be two timeouts: one for the command to be delivered
> (without needing a confirmation, though a confirmation could shorten the
> wait), and another for executing the command (how long it will take from
> receipt of the command until the host is definitely down). Again, a
> confirmation could stop the waiting before the timeout is reached.
>
> Regards,
> Ulrich
>
>>> I think I have seen a similar report already. Is it something that can
>>> be fixed by SBD/pacemaker tuning?
>> SBD_DELAY_START=yes in /etc/sysconfig/sbd is the solution.
>>
>> Regards,
>> Yan
>>
>>> I can provide full logs tomorrow if needed.
>>>
>>> TIA
>>>
>>> -andrei
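Yan's fix above is a one-line change. A minimal sketch of the relevant
/etc/sysconfig/sbd entries, with device paths illustrative rather than
taken from this thread:

    # /etc/sysconfig/sbd -- only the lines relevant to this thread
    SBD_DEVICE="/dev/disk/by-id/scsi-EXAMPLE"
    SBD_WATCHDOG_DEV="/dev/watchdog"
    # Delay sbd (and with it the cluster stack) at boot so a node that
    # reboots faster than msgwait cannot rejoin while its own fencing is
    # still pending. "yes" delays by the msgwait timeout in disk-based
    # setups; an integer value gives the delay in seconds instead.
    SBD_DELAY_START="yes"

This matches the symptom in the original report: the VM was back up long
before the 120s msgwait expired, so crmd concluded it had just been
fenced and pacemakerd shut down.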