I'm currently trying to use the stonith device external/ipmi through
pacemaker and I'm having some problems.

Running the stonith command directly reports success:

stonith -St external/ipmi -p "pcw3572.see.ed.ac.uk 129.215.215.144 root password lanplus"
stonith: external/ipmi device OK

(I get errors if I supply a wrong password, etc., so the connection is
definitely working.)

I use the following cib config:

primitive stonithipmidisk1 stonith:external/ipmi \
        params hostname="pcw3572.see.ed.ac.uk" ipaddr="129.215.215.144" \
        userid="root" passwd="password" interface="lanplus"

and the stonith device starts correctly.

When I 'kill' the node (killall -9 corosync), the node state switches to
UNCLEAN (offline).  However, the stonith device never powers the node
off.

I see the following in the logs:

Apr  7 16:00:47 pcw3571 crmd: [8318]: info: ais_dispatch_message:
Membership 2568: quorum retained
Apr  7 16:00:47 pcw3571 cib: [8314]: info: ais_dispatch_message:
Membership 2568: quorum retained
Apr  7 16:00:47 pcw3571 cib: [8314]: info: crm_update_peer: Node
pcw3572.see.ed.ac.uk: id=2379732865 state=lost (new) addr=r(0)
ip(129.215.215.141) r(1) ip(192.168.140.2)  votes=1 born=2564 seen=2564
proc=00000000000000000000000000111312
Apr  7 16:00:47 pcw3571 crmd: [8318]: info: ais_status_callback: status:
pcw3572.see.ed.ac.uk is now lost (was member)
Apr  7 16:00:47 pcw3571 crmd: [8318]: info: crm_update_peer: Node
pcw3572.see.ed.ac.uk: id=2379732865 state=lost (new) addr=r(0)
ip(129.215.215.141) r(1) ip(192.168.140.2)  votes=1 born=2564 seen=2564
proc=00000000000000000000000000111312
Apr  7 16:00:47 pcw3571 stonith-ng: [8313]: info: stonith_queryQuery
<stonith_command t="stonith-ng"
st_async_id="b8ed0eda-b271-4e1f-84c6-cd2d99a1e7c9" st_op="st_query"
st_callid="0" st_callopt="0"
st_remote_op="b8ed0eda-b271-4e1f-84c6-cd2d99a1e7c9"
st_target="pcw3572.see.ed.ac.uk" st_device_action="reboot"
st_clientid="d98c3433-910b-48c7-b542-6abab4deb0b1" st_timeout="6000"
src="pcw3574.see.ed.ac.uk" seq="51" />
Apr  7 16:00:48 pcw3571 stonith-ng: [8313]: info:
can_fence_host_with_device: Refreshing port list for stonithipmidisk1
Apr  7 16:00:48 pcw3571 stonith-ng: [8313]: WARN: parse_host_line: Could
not parse (0 0):
Apr  7 16:00:48 pcw3571 stonith-ng: [8313]: info:
can_fence_host_with_device: stonithipmidisk1 can not fence
pcw3572.see.ed.ac.uk: dynamic-list
Apr  7 16:00:48 pcw3571 stonith-ng: [8313]: info: stonith_query: Found 0
matching devices for 'pcw3572.see.ed.ac.uk'
Apr  7 16:00:48 pcw3571 stonith-ng: [8313]: info: stonith_command:
Processed st_query from pcw3574.see.ed.ac.uk: rc=0
Apr  7 16:00:53 pcw3571 stonith-ng: [8313]: info: stonith_queryQuery
<stonith_command t="stonith-ng"
st_async_id="5432b828-847f-4f7b-9a91-420f90a86e61" st_op="st_query"
st_callid="0" st_callopt="0"
st_remote_op="5432b828-847f-4f7b-9a91-420f90a86e61"
st_target="pcw3572.see.ed.ac.uk" st_device_action="reboot"
st_clientid="d98c3433-910b-48c7-b542-6abab4deb0b1" st_timeout="6000"
src="pcw3574.see.ed.ac.uk" seq="53" />
Apr  7 16:00:53 pcw3571 stonith-ng: [8313]: info:
can_fence_host_with_device: stonithipmidisk1 can not fence
pcw3572.see.ed.ac.uk: dynamic-list
Apr  7 16:00:53 pcw3571 stonith-ng: [8313]: info: stonith_query: Found 0
matching devices for 'pcw3572.see.ed.ac.uk'
Apr  7 16:00:53 pcw3571 stonith-ng: [8313]: info: stonith_command:
Processed st_query from pcw3574.see.ed.ac.uk: rc=0
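
For anyone skimming the log above, the lines that matter are the
can_fence_host_with_device ones. Below is a minimal sketch, in Python, of
how they could be pulled out of a pasted log whose records have been
wrapped across lines. The function names rejoin_records and
fence_decisions are hypothetical, and the "can fence" wording of a
positive result is an assumption (only the "can not fence" form appears
in my logs).

```python
import re

# A syslog record starts with a timestamp such as "Apr  7 16:00:53 ".
RECORD_START = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} ")

# "<device> can not fence <target>: <check type>"; the positive
# "can fence" form is assumed, it does not appear in the log above.
FENCE_LINE = re.compile(
    r"can_fence_host_with_device: (\S+) (can not|can) fence (\S+): (\S+)"
)


def rejoin_records(lines):
    """Merge wrapped continuation lines back into whole syslog records."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if RECORD_START.match(line) or not records:
            records.append(line)          # a new record begins here
        else:
            records[-1] += " " + line.strip()  # continuation of the previous one
    return records


def fence_decisions(records):
    """Yield (device, verdict, target, check_type) for each fencing decision."""
    for record in records:
        match = FENCE_LINE.search(record)
        if match:
            yield match.groups()


# Two records copied from the log above, still wrapped.
SAMPLE = [
    "Apr  7 16:00:53 pcw3571 stonith-ng: [8313]: info:",
    "can_fence_host_with_device: stonithipmidisk1 can not fence",
    "pcw3572.see.ed.ac.uk: dynamic-list",
    "Apr  7 16:00:53 pcw3571 stonith-ng: [8313]: info: stonith_query: Found 0",
    "matching devices for 'pcw3572.see.ed.ac.uk'",
]

if __name__ == "__main__":
    for device, verdict, target, check in fence_decisions(rejoin_records(SAMPLE)):
        print(device, verdict, "fence", target, "via", check)
```

In the log above, every such line reports that stonithipmidisk1 can not
fence pcw3572.see.ed.ac.uk with check type dynamic-list, which matches
the "Found 0 matching devices" result that follows each query.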


Can anyone give me some pointers on what is going wrong here?

Thanks,

Matthew

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
