On 2013-11-26T12:09:41, Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de> wrote:

> I saw that I don't have an SBD device any more (it's stopped). Unfortunately I
> could not start it (crm resource start prm_stonith_sbd).
> I guess it's due to the fact that the cluster won't start resources until the
> UNCLEAN node has been fenced. The dog bites its tail, it seems...

No, this is a harmless visibility change. Pacemaker no longer needs the
device resource to be "started" before it can fence - stonithd now
directly reads the configuration from the CIB.

The device resource only signifies where the monitor ops will be run
from, but doesn't really impact the fencing. (Though you can still
disable a fence device by setting target-role="Stopped".)


> The cluster is refusing to work:
> cib: [12243]: info: cib_process_diff: Diff 0.620.155 -> 0.621.1 not applied to
> 0.617.0: current "epoch" is less than required

This should trigger a full resync of the CIB. It seems you diverged
during a split-brain scenario. That it can't apply a diff is normal in
this case.
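
To illustrate why the diff is rejected, here is a rough sketch in Python.
It is a simplified model and not Pacemaker's actual code: the function
names are made up for this example, and it only assumes that a CIB
version reads as admin_epoch.epoch.num_updates and that a diff applies
only when the local CIB is at the version the diff starts from.

```python
# Simplified model of the CIB version check; NOT Pacemaker's actual code.
# Assumption: a CIB version is admin_epoch.epoch.num_updates, and the
# fields are compared in that order.

FIELDS = ("admin_epoch", "epoch", "num_updates")


def parse_version(text):
    """Turn '0.620.155' into the tuple (0, 620, 155)."""
    parts = tuple(int(p) for p in text.split("."))
    if len(parts) != 3:
        raise ValueError("expected admin_epoch.epoch.num_updates: %r" % text)
    return parts


def check_diff(current, diff_source):
    """Return None if a diff starting at diff_source applies to current.

    Otherwise return a short reason naming the first field that differs.
    (Hypothetical helper, for illustration only.)
    """
    cur = parse_version(current)
    req = parse_version(diff_source)
    for name, have, need in zip(FIELDS, cur, req):
        if have < need:
            return 'current "%s" is less than required' % name
        if have > need:
            return 'current "%s" is greater than required' % name
    return None


# The case from the log: local CIB at 0.617.0, diff starts at 0.620.155.
print(check_diff("0.617.0", "0.620.155"))
```

With the versions from the log, the local epoch (617) is behind the one
the diff expects (620), so the diff cannot be applied and the node needs
the full CIB instead.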

> Unfortunately, and despite the fact that o2 was shot, the cluster got a
> stonith timeout and retried the stonith!

This is a very detailed analysis, but it doesn't share any of the facts
(such as the configured timeouts).

> Could this (on the DC) be the reason?
> o4 stonith-ng[17787]:    error: crm_abort: call_remote_stonith: Triggered
> assert at remote.c:973 : op->state < st_done
> stonith-ng[17787]:   notice: remote_op_timeout: Action reboot
> (97a0476a-7f1d-4986-ba68-0f0d88aeb764) for o2 (crmd.17791) timed out

Yeah, that looks annoying. As always, the best way to actually get
support would be to raise a support call.

(Mailing list activity does take a backseat to customer calls.)


Regards,
    Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer,
HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
