On Tue, Apr 3, 2012 at 2:06 PM, Rainer Krienke <[email protected]> wrote:
> Am 03.04.2012 11:44, schrieb Lars Marowsky-Bree:
>
>>> property $id="cib-bootstrap-options" \
>>>         dc-version="1.1.6-b988976485d15cb702c9307df55512d323831a5e" \
>>>         cluster-infrastructure="openais" \
>>>         expected-quorum-votes="2" \
>>>         stonith-timeout="30s" \
>>>         no-quorum-policy="ignore" \
>>>         stonith-enabled="false"
>>
>> stonith-enabled="true" and all shall be well.
>>
>> (Though you may want to use multiple SBD devices to protect against loss
>> of a single device.)
>
> Hi to all,
>
> thanks for the hint to enable the stonith resource. I did so and checked
> that it is now set to true, but the cluster still behaves the same way
> when I do a halt -f on one node: access to the cluster filesystem on the
> still-running node simply hangs.
>
> crm_mon -1 in this case shows this (note: the node names are rzinstal4
> and rzinstal5):
>
> Last updated: Tue Apr  3 13:58:10 2012
> Last change: Tue Apr  3 13:41:56 2012 by root via cibadmin on rzinstal4
> Stack: openais
> Current DC: rzinstal4 - partition WITHOUT quorum
> Version: 1.1.6-b988976485d15cb702c9307df55512d323831a5e
> 2 Nodes configured, 2 expected votes
> 7 Resources configured.
> ============
>
> Node rzinstal5: UNCLEAN (offline)
> Online: [ rzinstal4 ]
>
>  Clone Set: base-clone [base-group]
>     Started: [ rzinstal4 ]
>     Stopped: [ base-group:1 ]
>  stonith_sbd    (stonith:external/sbd): Started rzinstal4
>
>
> crm_verify -V -L says this:
>
> crm_verify[10218]: 2012/04/03_14:00:00 WARN: pe_fence_node: Node rzinstal5 will be fenced because it is un-expectedly down
> crm_verify[10218]: 2012/04/03_14:00:00 WARN: determine_online_status: Node rzinstal5 is unclean
> crm_verify[10218]: 2012/04/03_14:00:00 WARN: custom_action: Action dlm:1_stop_0 on rzinstal5 is unrunnable (offline)
> crm_verify[10218]: 2012/04/03_14:00:00 WARN: custom_action: Marking node rzinstal5 unclean
> crm_verify[10218]: 2012/04/03_14:00:00 WARN: custom_action: Action o2cb:1_stop_0 on rzinstal5 is unrunnable (offline)
> crm_verify[10218]: 2012/04/03_14:00:00 WARN: custom_action: Marking node rzinstal5 unclean
> crm_verify[10218]: 2012/04/03_14:00:00 WARN: custom_action: Action ocfs2-1:1_stop_0 on rzinstal5 is unrunnable (offline)
> crm_verify[10218]: 2012/04/03_14:00:00 WARN: custom_action: Marking node rzinstal5 unclean
> crm_verify[10218]: 2012/04/03_14:00:00 WARN: stage6: Scheduling Node rzinstal5 for STONITH
> Warnings found during check: config may not be valid
>
>
> No idea what the reason might be ...

Ahem. A Google search for "Pacemaker unclean node" or even just
"Pacemaker unclean" would have turned up the answer in about one
second.

Although your STONITH is now configured, and your node is correctly
being scheduled for fencing, the fence operation is not succeeding.
You need to troubleshoot the root cause of your failed fencing action.
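
As a possible starting point for that troubleshooting, here is a minimal
shell sketch. It pulls the UNCLEAN node names out of crm_mon -1 output and
prints sbd commands you could run by hand against the device. The helper
names (unclean_nodes, print_checks) and the device path are assumptions
made for illustration; substitute the SBD device that your stonith_sbd
resource actually uses.

```shell
#!/bin/sh
# Sketch only: the helper names and the device path below are hypothetical,
# not taken from the cluster configuration discussed in this thread.

# Print the names of nodes that crm_mon reports as UNCLEAN.
# Reads `crm_mon -1` output on stdin and matches lines such as
# "Node rzinstal5: UNCLEAN (offline)".
unclean_nodes() {
    awk '$1 == "Node" && /UNCLEAN/ { sub(/:$/, "", $2); print $2 }'
}

# Hypothetical SBD device path; override it via the environment.
SBD_DEVICE="${SBD_DEVICE:-/dev/disk/by-id/EXAMPLE-sbd}"

# Print (do not run) the manual sbd checks for each node given as argument:
#   list          shows the message slots allocated on the device
#   message test  sends a harmless test message to the node's slot
print_checks() {
    for node in "$@"; do
        echo "sbd -d $SBD_DEVICE list"
        echo "sbd -d $SBD_DEVICE message $node test"
    done
}
```

Typical use would be something like print_checks $(crm_mon -1 | unclean_nodes),
followed by running the printed commands one at a time. If the list command
shows no slot for the dead node, or the device cannot be read at all, that
would be a likely reason why the fence operation never completes.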

Cheers,
Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
