Hi,

On Thu, Apr 07, 2011 at 04:23:17PM +0100, Matthew Richardson wrote:
> I'm currently trying to use the stonith device external/ipmi through
> pacemaker and I'm having some problems.
> 
> Running the stonith command directly reports success:
> 
> stonith -St external/ipmi -p "pcw3572.see.ed.ac.uk 129.215.215.144 root password lanplus"
> stonith: external/ipmi device OK
> 
> (and I get errors if I put in a wrong password etc, so the connection is
> definitely working).
> 
> I use the following cib config:
> 
> primitive stonithipmidisk1 stonith:external/ipmi \
>       params hostname="pcw3572.see.ed.ac.uk" ipaddr="129.215.215.144" \
>       userid="root" passwd="password" interface="lanplus"
> 
> and the stonith device starts correctly.
> 
> When I 'kill' the node (killall -9 corosync) the node state switches to
> UNCLEAN (offline).  However, the stonith device never causes the node to
> be powered off.
> 
> I see the following in the logs:
> 
> Apr  7 16:00:47 pcw3571 crmd: [8318]: info: ais_dispatch_message:
> Membership 2568: quorum retained
> Apr  7 16:00:47 pcw3571 cib: [8314]: info: ais_dispatch_message:
> Membership 2568: quorum retained
> Apr  7 16:00:47 pcw3571 cib: [8314]: info: crm_update_peer: Node
> pcw3572.see.ed.ac.uk: id=2379732865 state=lost (new) addr=r(0)
> ip(129.215.215.141) r(1) ip(192.168.140.2)  votes=1 born=2564 seen=2564
> proc=00000000000000000000000000111312
> Apr  7 16:00:47 pcw3571 crmd: [8318]: info: ais_status_callback: status:
> pcw3572.see.ed.ac.uk is now lost (was member)
> Apr  7 16:00:47 pcw3571 crmd: [8318]: info: crm_update_peer: Node
> pcw3572.see.ed.ac.uk: id=2379732865 state=lost (new) addr=r(0)
> ip(129.215.215.141) r(1) ip(192.168.140.2)  votes=1 born=2564 seen=2564
> proc=00000000000000000000000000111312
> Apr  7 16:00:47 pcw3571 stonith-ng: [8313]: info: stonith_queryQuery
> <stonith_command t="stonith-ng"
> st_async_id="b8ed0eda-b271-4e1f-84c6-cd2d99a1e7c9" st_op="st_query"
> st_callid="0" st_callopt="0"
> st_remote_op="b8ed0eda-b271-4e1f-84c6-cd2d99a1e7c9"
> st_target="pcw3572.see.ed.ac.uk" st_device_action="reboot"
> st_clientid="d98c3433-910b-48c7-b542-6abab4deb0b1" st_timeout="6000"
> src="pcw3574.see.ed.ac.uk" seq="51" />
> Apr  7 16:00:48 pcw3571 stonith-ng: [8313]: info:
> can_fence_host_with_device: Refreshing port list for stonithipmidisk1
> Apr  7 16:00:48 pcw3571 stonith-ng: [8313]: WARN: parse_host_line: Could
> not parse (0 0):
> Apr  7 16:00:48 pcw3571 stonith-ng: [8313]: info:
> can_fence_host_with_device: stonithipmidisk1 can not fence
> pcw3572.see.ed.ac.uk: dynamic-list
> Apr  7 16:00:48 pcw3571 stonith-ng: [8313]: info: stonith_query: Found 0
> matching devices for 'pcw3572.see.ed.ac.uk'
> Apr  7 16:00:48 pcw3571 stonith-ng: [8313]: info: stonith_command:
> Processed st_query from pcw3574.see.ed.ac.uk: rc=0
> Apr  7 16:00:53 pcw3571 stonith-ng: [8313]: info: stonith_queryQuery
> <stonith_command t="stonith-ng"
> st_async_id="5432b828-847f-4f7b-9a91-420f90a86e61" st_op="st_query"
> st_callid="0" st_callopt="0"
> st_remote_op="5432b828-847f-4f7b-9a91-420f90a86e61"
> st_target="pcw3572.see.ed.ac.uk" st_device_action="reboot"
> st_clientid="d98c3433-910b-48c7-b542-6abab4deb0b1" st_timeout="6000"
> src="pcw3574.see.ed.ac.uk" seq="53" />
> Apr  7 16:00:53 pcw3571 stonith-ng: [8313]: info:
> can_fence_host_with_device: stonithipmidisk1 can not fence
> pcw3572.see.ed.ac.uk: dynamic-list
> Apr  7 16:00:53 pcw3571 stonith-ng: [8313]: info: stonith_query: Found 0
> matching devices for 'pcw3572.see.ed.ac.uk'
> Apr  7 16:00:53 pcw3571 stonith-ng: [8313]: info: stonith_command:
> Processed st_query from pcw3574.see.ed.ac.uk: rc=0
> 
> 
> Can anyone give me some pointers on what is going wrong here?

For whatever reason, stonith-ng doesn't think that
stonithipmidisk1 can manage this node: the dynamic-list check in
your log finds 0 matching devices. Which version of Pacemaker are
you running? This may have been fixed in the meantime; I cannot
recall such a problem right now, but it's possible. You can also
turn debug on and see if there are more clues.
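
One thing that may be worth trying, as a sketch rather than a
confirmed fix: "dynamic-list" in the log is the mode in which
stonith-ng asks the device itself which hosts it controls. Assuming
your Pacemaker version supports the pcmk_host_check and
pcmk_host_list parameters (an assumption; check the stonith-ng
metadata for your version), the primitive could name the target
statically:

```
# Sketch only: pcmk_host_check and pcmk_host_list are assumed to be
# supported by the installed stonith-ng. All other values are taken
# from the original configuration.
primitive stonithipmidisk1 stonith:external/ipmi \
      params hostname="pcw3572.see.ed.ac.uk" ipaddr="129.215.215.144" \
      userid="root" passwd="password" interface="lanplus" \
      pcmk_host_check="static-list" pcmk_host_list="pcw3572.see.ed.ac.uk"
```

With a static list, stonith-ng should not need to parse the host
list returned by the device, which is where the parse_host_line
warning appears. Whether this helps on your version is untested.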

Thanks,

Dejan

> Thanks,
> 
> Matthew
> 
> -- 
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
> 



> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems