Re: [Pacemaker] Cluster goes to (unmanaged) Failed state when both nodes are rebooted together

ihjaz Mohamed Mon, 24 Oct 2011 08:27:29 -0700

It's part of the requirement given to me to support this solution on servers 
without stonith devices, so I cannot enable stonith.




________________________________
From: Alan Robertson <al...@unix.sh>
To: ihjaz Mohamed <ihjazmoha...@yahoo.co.in>; The Pacemaker cluster resource 
manager <pacemaker@oss.clusterlabs.org>
Sent: Monday, 24 October 2011 8:22 PM
Subject: Re: [Pacemaker] Cluster goes to (unmanaged) Failed state when both 
nodes are rebooted together


 
Setting no-quorum-policy to ignore and disabling stonith is not a good idea.  
You're sort of inviting the cluster to do screwed up things.
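
To make the combination concrete, here is a minimal Python sketch that scans `crm configure show` style text for these two properties and reports which of them carry the values discussed here. The function name `risky_settings` and the sample text are assumptions for illustration only; this is not a Pacemaker tool.

```python
import re

# Property values that this thread warns about when they appear together.
RISKY = {"stonith-enabled": "false", "no-quorum-policy": "ignore"}


def risky_settings(config_text):
    """Return the subset of RISKY properties found with the risky value."""
    # Collect name="value" pairs such as stonith-enabled="false".
    pairs = dict(re.findall(r'([\w-]+)="([^"]*)"', config_text))
    return {name: value for name, value in RISKY.items()
            if pairs.get(name) == value}


if __name__ == "__main__":
    # Hypothetical sample modelled on the property section quoted in this thread.
    sample = '''property $id="cib-bootstrap-options" \\
        expected-quorum-votes="2" \\
        stonith-enabled="false" \\
        no-quorum-policy="ignore"
'''
    found = risky_settings(sample)
    if len(found) == len(RISKY):
        print("both settings present:", sorted(found))
```

The check only reads text; it does not query or change the cluster.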


On 10/24/2011 08:23 AM, ihjaz Mohamed wrote: 
>Hi All,
>
>
>I have Pacemaker running with Corosync. The following is my CRM configuration.
>
>
>node soalaba56
>node soalaba63
>primitive FloatingIP ocf:heartbeat:IPaddr2 \
>        params ip="<floating_ip>" nic="eth0:0"
>primitive acestatus lsb:acestatus
>primitive pingd ocf:pacemaker:ping \
>        params host_list="<gateway_ip>" multiplier="100" \
>        op monitor interval="15s" timeout="5s"
>group HAService FloatingIP acestatus \
>        meta target-role="Started"
>clone pingdclone pingd \
>        meta globally-unique="false"
>location ip1_location FloatingIP \
>        rule $id="ip1_location-rule" pingd: defined pingd
>property $id="cib-bootstrap-options" \
>        dc-version="1.1.5-5.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
>        cluster-infrastructure="openais" \
>        expected-quorum-votes="2" \
>        stonith-enabled="false" \
>        no-quorum-policy="ignore" \
>        last-lrm-refresh="1305736421"
>----------------------------------------------------------------------
>
>
>When I reboot both nodes together, the cluster goes into an (unmanaged) Failed 
>state, as shown below.
>
>
>
>
>============
>Last updated: Mon Oct 24 08:10:42 2011
>Stack: openais
>Current DC: soalaba63 - partition with quorum
>Version: 1.1.5-5.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
>2 Nodes configured, 2 expected votes
>2 Resources configured.
>============
>
>Online: [ soalaba56 soalaba63 ]
>
> Resource Group: HAService
>     FloatingIP (ocf::heartbeat:IPaddr2) Started (unmanaged) FAILED[   soalaba63       soalaba56 ]
>     acestatus  (lsb:acestatus):        Stopped
> Clone Set: pingdclone [pingd]
>     Started: [ soalaba56 soalaba63 ]
>
>Failed actions:
>    FloatingIP_stop_0 (node=soalaba63, call=7, rc=1, status=complete): unknown error
>    FloatingIP_stop_0 (node=soalaba56, call=7, rc=1, status=complete): unknown error
>
>------------------------------------------------------------------------------
>
>
>
>This happens only when both nodes are rebooted simultaneously. If the reboots 
>are separated by some interval, the problem is not seen. Looking at the logs, 
>I see that when the nodes come up, the resources are started on both nodes; 
>the cluster then tries to stop the started resources, and the stop fails. 
>
>
>I've attached the logs.
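
For comparing runs like the one above, the "Failed actions" entries in the status output can be turned into structured records. The following is a minimal Python sketch; the function name `parse_failed_actions` and the regular expression are assumptions made for illustration, and it assumes each entry sits on a single line in the form shown in the status output.

```python
import re

# One "Failed actions" entry as it appears in the status text, for example:
#   FloatingIP_stop_0 (node=soalaba63, call=7, rc=1, status=complete): unknown error
FAILED_RE = re.compile(
    r"(?P<resource>\S+)_(?P<op>[a-z]+)_(?P<interval>\d+)\s+"
    r"\(node=(?P<node>[^,]+),\s*call=(?P<call>\d+),\s*rc=(?P<rc>\d+),\s*"
    r"status=(?P<status>[^)]+)\):\s*(?P<reason>.*)"
)


def parse_failed_actions(text):
    """Return one dict per failed-action line found in text."""
    entries = []
    for line in text.splitlines():
        # Drop leading email quote markers and indentation before matching.
        match = FAILED_RE.search(line.lstrip("> "))
        if match:
            entry = match.groupdict()
            entry["rc"] = int(entry["rc"])
            entry["reason"] = entry["reason"].strip()
            entries.append(entry)
    return entries


if __name__ == "__main__":
    sample = """Failed actions:
    FloatingIP_stop_0 (node=soalaba63, call=7, rc=1, status=complete): unknown error
    FloatingIP_stop_0 (node=soalaba56, call=7, rc=1, status=complete): unknown error
"""
    for e in parse_failed_actions(sample):
        print(e["node"], e["resource"], e["op"], e["rc"], e["reason"])
```

Entries that are wrapped across two lines will not match; joining them first is left to the caller.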
>
>
>
>
>
>
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


-- 
Alan Robertson <al...@unix.sh>
"Openness is the foundation and preservative of friendship...  Let me claim 
from you at all times your undisguised opinions." - William Wilberforce

