On 2013-11-26T12:09:41, Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de> wrote:
> I saw that I don't have an SBD device any more (it's stopped). Unfortunately I
> could not start it (crm resource start prm_stonith_sbd).
> I guess it's due to the fact that the cluster won't start resources until the
> UNCLEAN node has been fenced. The dog bites its tail, it seems...

No, this is a harmless visibility change. Pacemaker no longer needs the
device resource to be "started" before it can fence - stonithd now
directly reads the configuration from the CIB. The device resource only
signifies where the monitor ops will be run from, but doesn't really
impact the fencing.

(Though you can still disable a fence device by setting
target-role="Stopped".)

> The cluster is refusing to work:
> cib: [12243]: info: cib_process_diff: Diff 0.620.155 -> 0.621.1 not applied to
> 0.617.0: current "epoch" is less than required

This should trigger a full resync of the CIB. It seems you diverged
during a split-brain scenario. That it can't apply a diff is normal in
this case.

> Unfortunately and despite the fact that o2 was shot, the cluster got a
> stonith timeout and retried the stonith!

This is a very detailed analysis, but it doesn't share any of the facts
(such as the configured timeouts).

> Could this (on the DC) be the reason?
> o4 stonith-ng[17787]: error: crm_abort: call_remote_stonith: Triggered
> assert at remote.c:973 : op->state < st_done
> stonith-ng[17787]: notice: remote_op_timeout: Action reboot
> (97a0476a-7f1d-4986-ba68-0f0d88aeb764) for o2 (crmd.17791) timed out

Yeah, that looks annoying.

As always, the best way to actually get support would be to raise a
support call. (Mailing list activity does take a backseat to customer
calls.)


Regards,
    Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes."
-- Oscar Wilde
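
For anyone reading the cib_process_diff message quoted above: a CIB
version is written as admin_epoch.epoch.num_updates, and the three
fields are compared in that order. The following is only a rough Python
sketch of that comparison, not Pacemaker's actual code; the function
names are made up for illustration. It shows why a diff based on
0.620.155 cannot be applied to a local copy at 0.617.0.

```python
# Illustrative only: a simplified model of comparing CIB versions.
# parse_cib_version and first_stale_field are hypothetical helper names.

FIELDS = ("admin_epoch", "epoch", "num_updates")


def parse_cib_version(text):
    """Split 'admin_epoch.epoch.num_updates' into a tuple of three ints."""
    admin_epoch, epoch, num_updates = (int(part) for part in text.split("."))
    return admin_epoch, epoch, num_updates


def first_stale_field(local, diff_source):
    """Return the first field in which the local version is older than the
    version the diff starts from, or None if the local copy is not older."""
    pairs = zip(FIELDS, parse_cib_version(local), parse_cib_version(diff_source))
    for name, have, need in pairs:
        if have < need:
            return name
        if have > need:
            return None
    return None


# Values taken from the quoted log line.
print(first_stale_field("0.617.0", "0.620.155"))
```

With the values from the log, the local "epoch" (617) is lower than the
one the diff expects (620), which matches the field named in the
message.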