Hi,

   I'm rather new to openais and have run into some issues with the order of 
fencing, plus a refusal to fail over once one fencing method fails. Any help 
would be much appreciated.

   Even though I've set a lower priority on my fence_node2_ipmi device, it is 
not used first; fence_node2_apc is picked instead (I also tried setting the 
priorities the other way round, with no effect). Only when I delete 
fence_node2_ipmi and add it again does it get used first. The second issue I'm 
running into is that if fence_node2_ipmi fails (or fence_node2_apc, for that 
matter), the cluster just keeps reattempting that same fencing device over and 
over again instead of moving on to the other one.

Also, every time it executes the reboot, the physical IPMI card issues a 
restart, so the server ends up rebooting endlessly.
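
To make the ordering and fallback I expected concrete, here is a rough Python 
sketch. It is not how stonith-ng is implemented; FenceDevice and fence_node are 
hypothetical names used only for illustration, and I am assuming that the 
device with the lowest priority value should be tried first.

```python
from typing import Callable, List, NamedTuple, Optional


class FenceDevice(NamedTuple):
    """Hypothetical stand-in for a configured fencing device."""
    name: str
    priority: int                  # assumption: lower value is tried first
    fence: Callable[[str], bool]   # returns True if the target was fenced


def fence_node(target: str, devices: List[FenceDevice]) -> Optional[str]:
    """Try each device once, in priority order, until one succeeds.

    Returns the name of the device that fenced the target, or None if
    every device failed.
    """
    for device in sorted(devices, key=lambda d: d.priority):
        if device.fence(target):
            return device.name
        # This device failed: fall through to the next one rather than
        # retrying the same device again.
    return None


if __name__ == "__main__":
    devices = [
        FenceDevice("fence_node2_apc", 100, lambda target: True),
        FenceDevice("fence_node2_ipmi", 10, lambda target: False),  # simulated failure
    ]
    print(fence_node("node2", devices))  # prints "fence_node2_apc"
```

In other words, I expected IPMI to be attempted first and, once it failed, the 
APC device to be attempted next, with the operation reported as failed only 
after both had been tried.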

Log output:

Apr 06 19:05:39 node1 stonith-ng: [2591]: info: log_data_element: 
process_remote_stonith_exec: ExecResult <st-reply 
st_origin="stonith_construct_async_reply" t="stonith-ng" st_op="st_notify" 
st_remote_op="2311741c-fc3b-4094-badd-0ac9e10a209b" st_callid="0" 
st_callopt="0" st_rc="1" st_output="Rebooting machine @ 
IPMI:192.168.1.161...Failed
" src="node1" seq="268" />
Apr 06 19:05:49 node1 stonith-ng: [2591]: ERROR: remote_op_timeout: Action 
reboot (2311741c-fc3b-4094-badd-0ac9e10a209b) for node2 timed out
Apr 06 19:05:49 node1 stonith-ng: [2591]: info: remote_op_done: Notifing 
clients of 2311741c-fc3b-4094-badd-0ac9e10a209b (reboot of node2 from 
fc25a065-3355-455d-937f-360b07f9dda9 by (null)): 1, rc=-7
Apr 06 19:05:49 node1 stonith-ng: [2591]: info: stonith_notify_client: Sending 
st_fence-notification to client 2596/17cafaec-7078-4972-937e-1cf5636c8523
Apr 06 19:05:50 node1 stonith-ng: [2591]: info: initiate_remote_stonith_op: 
Initiating remote operation reboot for node2: 
e5e1a936-c038-42bc-acff-18c2a41e9ae2
Apr 06 19:05:50 node1 stonith-ng: [2591]: info: log_data_element: 
stonith_query: Query <stonith_command t="stonith-ng" 
st_async_id="e5e1a936-c038-42bc-acff-18c2a41e9ae2" st_op="st_query" 
st_callid="0" st_callopt="0" 
st_remote_op="e5e1a936-c038-42bc-acff-18c2a41e9ae2" st_target="node2" 
st_device_action="reboot" st_clientid="fc25a065-3355-455d-937f-360b07f9dda9" 
src="node1" seq="269" />
Apr 06 19:05:50 node1 stonith-ng: [2591]: info: can_fence_host_with_device: 
fence_node2_ipmi can fence node2: static-list
Apr 06 19:05:50 node1 stonith-ng: [2591]: info: can_fence_host_with_device: 
fence_node2_apc can fence node2: static-list
Apr 06 19:05:50 node1 stonith-ng: [2591]: info: stonith_query: Found 2 matching 
devices for 'node2'
Apr 06 19:05:50 node1 stonith-ng: [2591]: info: call_remote_stonith: Requesting 
that node1 perform op reboot node2
Apr 06 19:05:50 node1 stonith-ng: [2591]: info: log_data_element: 
stonith_fence: Exec <stonith_command t="stonith-ng" 
st_async_id="e5e1a936-c038-42bc-acff-18c2a41e9ae2" st_op="st_fence" 
st_callid="0" st_callopt="0" 
st_remote_op="e5e1a936-c038-42bc-acff-18c2a41e9ae2" st_target="node2" 
st_device_action="reboot" src="node1" seq="271" />
Apr 06 19:05:50 node1 stonith-ng: [2591]: info: can_fence_host_with_device: 
fence_node2_ipmi can fence node2: static-list
Apr 06 19:05:50 node1 stonith-ng: [2591]: info: can_fence_host_with_device: 
fence_node2_apc can fence node2: static-list
Apr 06 19:05:50 node1 stonith-ng: [2591]: info: stonith_fence: Found 2 matching 
devices for 'node2'

  
   I'm running Pacemaker version 1.1.2.

   Here is the relevant part of my cluster config:

node node1 \
        attributes standby="off"
node node2 \
        attributes standby="off"
primitive fence_node1 stonith:fence_ipmilan \
        params action="reboot" ipaddr="192.168.1.160" login="ADMIN" \
        passwd="ADMIN" pcmk_host_check="static-list" pcmk_host_list="node1"
primitive fence_node1_apc stonith:fence_apc_snmp \
        params ipaddr="192.168.1.180" action="reboot" port="node1" \
        community="private" pcmk_host_check="static-list" pcmk_host_list="node1" \
        priority="20"
primitive fence_node2_apc stonith:fence_apc_snmp \
        params ipaddr="192.168.1.180" action="reboot" port="node2" \
        community="private" pcmk_host_check="static-list" pcmk_host_list="node2" \
        priority="100"
primitive fence_node2_ipmi stonith:fence_ipmilan \
        params action="reboot" ipaddr="192.168.1.161" login="ADMIN" \
        passwd="ADMIN" pcmk_host_check="static-list" pcmk_host_list="node2" \
        priority="10"
location fence-node1_apc-on-node2 fence_node1_apc -inf: node1
location fence_node1-on-node2 fence_node1 -inf: node1
location fence_node2-on-node1 fence_node2_apc -inf: node2
location fence_node2_ipmi-on-node1 fence_node2_ipmi -inf: node2
property $id="cib-bootstrap-options" \
        dc-version="1.1.2-f059ec7ced7a86f18e5490b67ebf4a0b963bccfe" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="true" \
        no-quorum-policy="ignore" \
        stonith-timeout="30s"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"




Best Regards,
Richard Cernava
_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
