To follow up on this matter, I've solved the problem and it was not a
bug in pacemaker.
fencing/commands.c checks to see if it can fence the node with
fence_legacy. If this fails, commands.c will disable checking for the
remainder of the pacemaker session and throw an error to syslog when
the stoni
I do not know where this timeout is coming from. It appears to be
independent of any cluster settings.
I've tested causing a node to be marked UNCLEAN by messing with a
VirtualDomain RA.
Expected results: STONITH
Observed results: Timeout. No STONITH.
sdgxen-2 logs:
Nov 30 15:26:06 sdgxen-2 peng
On 11/29/2011 12:14 AM, Hal Martin wrote:
> Sorry; they were included in the previous email but it appears it was
> not properly spaced to be noticeable in the wall of text.
Indeed ... already there, sorry for the noise.
Strange ... where does this timeout come from? I don't see any evidence
this
Sorry; they were included in the previous email but it appears it was
not properly spaced to be noticeable in the wall of text.
Syslog from sdgxen-3:
Nov 28 15:01:20 sdgxen-3 attrd: [455]: notice: attrd_ais_dispatch:
Update relayed from sdgxen-2
Nov 28 15:01:20 sdgxen-3 attrd: [455]: notice: attrd
On 11/28/2011 08:07 PM, Hal Martin wrote:
> Thank you for the updated link.
>
> I have recompiled pacemaker from checkout b9889764 and stonith still
> fails to shoot nodes.
Maybe posting also the logs from sdgxen-3 can help.
Regards,
Andreas
--
Need help with Pacemaker?
http://www.hastexo.com/
Thank you for the updated link.
I have recompiled pacemaker from checkout b9889764 and stonith still
fails to shoot nodes.
sdgxen-2:/ # crm node fence sdgxen-3
Do you really want to shoot sdgxen-3? y
Syslog from sdgxen-2:
Nov 28 15:01:20 sdgxen-2 pengine: [456]: WARN: pe_fence_node: Node
sdgxen-
On Mon, Nov 28, 2011 at 4:35 PM, Hal Martin wrote:
> Looking at the mercurial repository for pacemaker
> (http://hg.clusterlabs.org/pacemaker/) I do not see any check-ins
> since 1.1.6 was tagged two months ago.
Pacemaker has since moved to GitHub:
https://github.com/ClusterLabs/pacemaker
Hope
Looking at the mercurial repository for pacemaker
(http://hg.clusterlabs.org/pacemaker/) I do not see any check-ins
since 1.1.6 was tagged two months ago.
Does this mean the timeout bug has been fixed outside of the mercurial
repository? This bug is a huge show-stopper for me, and if there is
fixe
>>> Andrew Beekhof wrote on 24.11.2011 at 01:14 in message:
> On Thu, Nov 24, 2011 at 1:42 AM, Hal Martin wrote:
>
[...]
> > # stonith -t external/sbd sbd_device=/dev/mapper/qa-test-sbd -S
> > info: external/sbd device OK.
> >
> > Relevant portions of crm config:
> > primitive stonith-