I'm currently trying to use the stonith device external/ipmi through pacemaker and I'm having some problems.
Running the stonith command directly reports success:
stonith -St external/ipmi -p "pcw3572.see.ed.ac.uk 129.215.215.144 root password lanplus"
stonith: external/ipmi device OK
(and I get errors if I put in a wrong password etc, so the connection is
definitely working).
I use the following cib config:
primitive stonithipmidisk1 stonith:external/ipmi \
    params hostname="pcw3572.see.ed.ac.uk" ipaddr="129.215.215.144" userid="root" passwd="password" interface="lanplus"
and the stonith device starts correctly.
When I 'kill' the node (killall -9 corosync), the node state switches to
UNCLEAN (offline). However, the stonith device never powers the node
off.
I see the following in the logs:
Apr 7 16:00:47 pcw3571 crmd: [8318]: info: ais_dispatch_message: Membership 2568: quorum retained
Apr 7 16:00:47 pcw3571 cib: [8314]: info: ais_dispatch_message: Membership 2568: quorum retained
Apr 7 16:00:47 pcw3571 cib: [8314]: info: crm_update_peer: Node pcw3572.see.ed.ac.uk: id=2379732865 state=lost (new) addr=r(0) ip(129.215.215.141) r(1) ip(192.168.140.2) votes=1 born=2564 seen=2564 proc=00000000000000000000000000111312
Apr 7 16:00:47 pcw3571 crmd: [8318]: info: ais_status_callback: status: pcw3572.see.ed.ac.uk is now lost (was member)
Apr 7 16:00:47 pcw3571 crmd: [8318]: info: crm_update_peer: Node pcw3572.see.ed.ac.uk: id=2379732865 state=lost (new) addr=r(0) ip(129.215.215.141) r(1) ip(192.168.140.2) votes=1 born=2564 seen=2564 proc=00000000000000000000000000111312
Apr 7 16:00:47 pcw3571 stonith-ng: [8313]: info: stonith_queryQuery <stonith_command t="stonith-ng" st_async_id="b8ed0eda-b271-4e1f-84c6-cd2d99a1e7c9" st_op="st_query" st_callid="0" st_callopt="0" st_remote_op="b8ed0eda-b271-4e1f-84c6-cd2d99a1e7c9" st_target="pcw3572.see.ed.ac.uk" st_device_action="reboot" st_clientid="d98c3433-910b-48c7-b542-6abab4deb0b1" st_timeout="6000" src="pcw3574.see.ed.ac.uk" seq="51" />
Apr 7 16:00:48 pcw3571 stonith-ng: [8313]: info: can_fence_host_with_device: Refreshing port list for stonithipmidisk1
Apr 7 16:00:48 pcw3571 stonith-ng: [8313]: WARN: parse_host_line: Could not parse (0 0):
Apr 7 16:00:48 pcw3571 stonith-ng: [8313]: info: can_fence_host_with_device: stonithipmidisk1 can not fence pcw3572.see.ed.ac.uk: dynamic-list
Apr 7 16:00:48 pcw3571 stonith-ng: [8313]: info: stonith_query: Found 0 matching devices for 'pcw3572.see.ed.ac.uk'
Apr 7 16:00:48 pcw3571 stonith-ng: [8313]: info: stonith_command: Processed st_query from pcw3574.see.ed.ac.uk: rc=0
Apr 7 16:00:53 pcw3571 stonith-ng: [8313]: info: stonith_queryQuery <stonith_command t="stonith-ng" st_async_id="5432b828-847f-4f7b-9a91-420f90a86e61" st_op="st_query" st_callid="0" st_callopt="0" st_remote_op="5432b828-847f-4f7b-9a91-420f90a86e61" st_target="pcw3572.see.ed.ac.uk" st_device_action="reboot" st_clientid="d98c3433-910b-48c7-b542-6abab4deb0b1" st_timeout="6000" src="pcw3574.see.ed.ac.uk" seq="53" />
Apr 7 16:00:53 pcw3571 stonith-ng: [8313]: info: can_fence_host_with_device: stonithipmidisk1 can not fence pcw3572.see.ed.ac.uk: dynamic-list
Apr 7 16:00:53 pcw3571 stonith-ng: [8313]: info: stonith_query: Found 0 matching devices for 'pcw3572.see.ed.ac.uk'
Apr 7 16:00:53 pcw3571 stonith-ng: [8313]: info: stonith_command: Processed st_query from pcw3574.see.ed.ac.uk: rc=0
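The lines that matter in the log above are the "can not fence" and "Found 0 matching devices" messages. A rough Python sketch for pulling them out of a longer log follows. It is not part of pacemaker: the function name summarize, the sample LOG string, and both regular expressions are assumptions based only on the message formats seen above, and it expects each log entry on a single line.

```python
import re

# Sample entries copied from the log above, one entry per line.
LOG = """\
Apr 7 16:00:48 pcw3571 stonith-ng: [8313]: info: can_fence_host_with_device: Refreshing port list for stonithipmidisk1
Apr 7 16:00:48 pcw3571 stonith-ng: [8313]: WARN: parse_host_line: Could not parse (0 0):
Apr 7 16:00:48 pcw3571 stonith-ng: [8313]: info: can_fence_host_with_device: stonithipmidisk1 can not fence pcw3572.see.ed.ac.uk: dynamic-list
Apr 7 16:00:48 pcw3571 stonith-ng: [8313]: info: stonith_query: Found 0 matching devices for 'pcw3572.see.ed.ac.uk'
"""

# Assumed format: "<device> can not fence <target>: <trailing token>"
CANNOT_FENCE = re.compile(
    r"can_fence_host_with_device: (\S+) can not fence (\S+): (\S+)"
)
# Assumed format: "Found <n> matching devices for '<target>'"
FOUND = re.compile(r"stonith_query: Found (\d+) matching devices for '([^']+)'")


def summarize(log_text):
    """Return (refusals, matches) extracted from stonith-ng log text.

    refusals: list of (device, target, trailing token) tuples
    matches:  list of (target, number of matching devices) tuples
    """
    refusals = []
    matches = []
    for line in log_text.splitlines():
        m = CANNOT_FENCE.search(line)
        if m:
            refusals.append(m.groups())
            continue
        m = FOUND.search(line)
        if m:
            matches.append((m.group(2), int(m.group(1))))
    return refusals, matches


if __name__ == "__main__":
    refusals, matches = summarize(LOG)
    for device, target, token in refusals:
        print(f"{device} can not fence {target} ({token})")
    for target, count in matches:
        print(f"{target}: {count} matching device(s)")
```

Run against the full log, this should list every query where stonithipmidisk1 was rejected for pcw3572.see.ed.ac.uk, which makes it easier to see whether any device was ever matched.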
Can anyone give me some pointers on what is going wrong here?
Thanks,
Matthew
