On Thu, Apr 7, 2011 at 4:12 AM, Richard <[email protected]> wrote:
> Hi,
>    I'm rather new to openais and have run into some issues with the order of
> fencing plus refusal to failover once one fencing method fails. Any help
> would be much appreciated.
>    Even though I've set a lower priority on my fence_node2_ipmi device, it is
> not used first; fence_node2_apc is picked instead (I also tried setting the
> priorities the other way round, with no effect). Only when I delete
> fence_node2_ipmi and add it again does it get used first.

stonith in 1.1.x doesn't strictly observe the priorities.
It's one of the things we need to fix soon.

> The second issue I'm running into is
> that if fence_node2_ipmi fails (or fence_node2_apc, for that matter), it just
> keeps retrying that same fencing device over and over again.

It shouldn't do that - can you file a bug and include a crm_report
archive please?
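
Something along these lines should produce the archive, assuming
crm_report is installed on node1. The time window and the destination
name are placeholders I picked to cover the log excerpt below; adjust
them to whenever the repeated fencing attempts actually happened.

```shell
# Placeholder window around the failed reboot attempts (19:05 on Apr 06)
FROM="2011-04-06 19:00:00"
TO="2011-04-06 19:30:00"
# Placeholder destination; crm_report writes its archive under this name
DEST="/tmp/fencing-retry-report"

if command -v crm_report >/dev/null 2>&1; then
    # -f / -t bound the period whose logs and cluster state are collected
    crm_report -f "$FROM" -t "$TO" "$DEST"
else
    echo "crm_report not found on this node" >&2
fi
```

Attach the resulting archive to the bug.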

> Also, every time it executes the reboot, the physical IPMI card issues a
> restart and the server ends up rebooting endlessly.
> Log output:
> Apr 06 19:05:39 node1 stonith-ng: [2591]: info: log_data_element:
> process_remote_stonith_exec: ExecResult <st-reply
> st_origin="stonith_construct_async_reply" t="stonith-ng" st_op="st_notify"
> st_remote_op="2311741c-fc3b-4094-badd-0ac9e10a209b" st_callid="0"
> st_callopt="0" st_rc="1" st_output="Rebooting machine @
> IPMI:192.168.1.161...Failed
> " src="node1" seq="268" />
> Apr 06 19:05:49 node1 stonith-ng: [2591]: ERROR: remote_op_timeout: Action
> reboot (2311741c-fc3b-4094-badd-0ac9e10a209b) for node2 timed out
> Apr 06 19:05:49 node1 stonith-ng: [2591]: info: remote_op_done: Notifing
> clients of 2311741c-fc3b-4094-badd-0ac9e10a209b (reboot of node2 from
> fc25a065-3355-455d-937f-360b07f9dda9 by (null)): 1, rc=-7
> Apr 06 19:05:49 node1 stonith-ng: [2591]: info: stonith_notify_client:
> Sending st_fence-notification to client
> 2596/17cafaec-7078-4972-937e-1cf5636c8523
> Apr 06 19:05:50 node1 stonith-ng: [2591]: info: initiate_remote_stonith_op:
> Initiating remote operation reboot for node2:
> e5e1a936-c038-42bc-acff-18c2a41e9ae2
> Apr 06 19:05:50 node1 stonith-ng: [2591]: info: log_data_element:
> stonith_query: Query <stonith_command t="stonith-ng"
> st_async_id="e5e1a936-c038-42bc-acff-18c2a41e9ae2" st_op="st_query"
> st_callid="0" st_callopt="0"
> st_remote_op="e5e1a936-c038-42bc-acff-18c2a41e9ae2" st_target="node2"
> st_device_action="reboot" st_clientid="fc25a065-3355-455d-937f-360b07f9dda9"
> src="node1" seq="269" />
> Apr 06 19:05:50 node1 stonith-ng: [2591]: info: can_fence_host_with_device:
> fence_node2_ipmi can fence node2: static-list
> Apr 06 19:05:50 node1 stonith-ng: [2591]: info: can_fence_host_with_device:
> fence_node2_apc can fence node2: static-list
> Apr 06 19:05:50 node1 stonith-ng: [2591]: info: stonith_query: Found 2
> matching devices for 'node2'
> Apr 06 19:05:50 node1 stonith-ng: [2591]: info: call_remote_stonith:
> Requesting that node1 perform op reboot node2
> Apr 06 19:05:50 node1 stonith-ng: [2591]: info: log_data_element:
> stonith_fence: Exec <stonith_command t="stonith-ng"
> st_async_id="e5e1a936-c038-42bc-acff-18c2a41e9ae2" st_op="st_fence"
> st_callid="0" st_callopt="0"
> st_remote_op="e5e1a936-c038-42bc-acff-18c2a41e9ae2" st_target="node2"
> st_device_action="reboot" src="node1" seq="271" />
> Apr 06 19:05:50 node1 stonith-ng: [2591]: info: can_fence_host_with_device:
> fence_node2_ipmi can fence node2: static-list
> Apr 06 19:05:50 node1 stonith-ng: [2591]: info: can_fence_host_with_device:
> fence_node2_apc can fence node2: static-list
> Apr 06 19:05:50 node1 stonith-ng: [2591]: info: stonith_fence: Found 2
> matching devices for 'node2'
>
>    I'm running version: 1.1.2.
>
>   Here is the relevant part of my cluster config:
> node node1 \
>         attributes standby="off"
> node node2 \
>         attributes standby="off"
> primitive fence_node1 stonith:fence_ipmilan \
>         params action="reboot" ipaddr="192.168.1.160" login="ADMIN"
> passwd="ADMIN" pcmk_host_check="static-list" pcmk_host_list="node1"
> primitive fence_node1_apc stonith:fence_apc_snmp \
>         params ipaddr="192.168.1.180" action="reboot" port="node1"
> community="private" pcmk_host_check="static-list" pcmk_host_list="node1"
> priority="20"
> primitive fence_node2_apc stonith:fence_apc_snmp \
>         params ipaddr="192.168.1.180" action="reboot" port="node2"
> community="private" pcmk_host_check="static-list" pcmk_host_list="node2"
> priority="100"
> primitive fence_node2_ipmi stonith:fence_ipmilan \
>         params action="reboot" ipaddr="192.168.1.161" login="ADMIN"
> passwd="ADMIN" pcmk_host_check="static-list" pcmk_host_list="node2"
> priority="10"
> location fence-node1_apc-on-node2 fence_node1_apc -inf: node1
> location fence_node1-on-node2 fence_node1 -inf: node1
> location fence_node2-on-node1 fence_node2_apc -inf: node2
> location fence_node2_ipmi-on-node1 fence_node2_ipmi -inf: node2
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.2-f059ec7ced7a86f18e5490b67ebf4a0b963bccfe" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2" \
>         stonith-enabled="true" \
>         no-quorum-policy="ignore" \
>         stonith-timeout="30s"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="100"
>
>
>
> Best Regards,
> Richard Cernava
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais
>