On 09/10/2013, at 1:53 PM, Xiaomin Zhang <zhangxiao...@gmail.com> wrote:

> I think I found the cause after enabling 'verbose' for fence_ipmilan. 
> When I first configured stonith, I set lanplus to true. However, my machine 
> is not an HP one, so lanplus is not supported. When I noticed this, I used 
> 'crm configure load update' to change the stonith resource to lanplus=false, 
> and Pacemaker seemed to accept it. I assumed this meant stonith-ng would use 
> the new ipmitool command line from then on.
> However, the strange part is that this configuration never took 
> effect, even after I restarted the pacemaker service several times.

That's quite odd; I've never heard of that before.
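
One way to rule the agent out is to compare the two ipmitool invocations by hand. The sketch below is only an illustration, assuming lanplus maps to ipmitool's "-I lanplus" interface and that "-I lan" is used otherwise. The helper name ipmitool_status_cmd is made up for this example, and the address and credentials are the ones from the quoted configuration.

```python
import shlex


def ipmitool_status_cmd(host, user, password, lanplus):
    """Build an ipmitool 'chassis power status' command as an argument list."""
    # Assumption: lanplus selects the IPMI v2.0 "lanplus" interface,
    # anything else the IPMI v1.5 "lan" interface.
    interface = "lanplus" if lanplus else "lan"
    return [
        "ipmitool", "-I", interface,
        "-H", host, "-U", user, "-P", password,
        "chassis", "power", "status",
    ]


if __name__ == "__main__":
    for lanplus in (True, False):
        cmd = ipmitool_status_cmd("192.168.170.1", "root", "123", lanplus)
        # Print a copy-and-paste friendly command line; nothing is executed.
        print(" ".join(shlex.quote(part) for part in cmd))
```

Running the two printed commands from the node that is expected to do the fencing should show which interface the BMC actually answers on.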

> I finally resolved this by deleting all configured resources 
> one by one and then configuring everything again.
> P.S. The pacemaker version is pacemaker-cli-1.1.6-3.el6.x86_64, and 
> fence-agents is fence-agents-3.1.5-10.el6.x86_64.
> Is this a bug that has been resolved in a newer version?

Highly likely
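
If it happens again, it may help to confirm what the cluster actually stored before rebuilding everything. A rough sketch, assuming the output of 'crm configure show' has been saved as text in the same form as the primitive quoted below. parse_primitive_params is a hypothetical helper for illustration, not part of Pacemaker or crmsh.

```python
import shlex

# Example input: the primitive from the original message, with line continuations.
EXAMPLE = r'''primitive node2-stonith stonith:fence_ipmilan \
        params pcmk_host_list="node2" pcmk_host_check="static-list" \
        ipaddr="192.168.170.1" login="root" passwd="123" lanplus="false" \
        power_wait="1"'''


def parse_primitive_params(definition):
    """Return the name=value pairs that follow 'params' in a crm shell primitive."""
    # Join backslash-continued lines, then split shell-style so quotes are stripped.
    tokens = shlex.split(definition.replace("\\\n", " "))
    params = {}
    in_params = False
    for token in tokens:
        if token == "params":
            in_params = True
        elif token in ("op", "meta", "utilization"):
            # A new section keyword ends the params list.
            in_params = False
        elif in_params and "=" in token:
            name, value = token.split("=", 1)
            params[name] = value
    return params


if __name__ == "__main__":
    params = parse_primitive_params(EXAMPLE)
    print("lanplus is configured as:", params.get("lanplus"))
```

If the value shown here is already "false" while the agent still behaves as if lanplus were enabled, the stale setting is more likely held by the running fencing daemon than by the stored configuration.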

> Thanks.
> 
> 
> 
> On Wed, Oct 9, 2013 at 5:09 AM, Xiaomin Zhang <zhangxiao...@gmail.com> wrote:
> Hi:
> I configured stonith on CentOS 6.2 with the fence_ipmilan agent:
> primitive node2-stonith stonith:fence_ipmilan \
>         params pcmk_host_list="node2" pcmk_host_check="static-list" \
>         ipaddr="192.168.170.1" login="root" passwd="123" lanplus="false" \
>         power_wait="1"
> 
> The IPMI IP address and credentials were verified to be correct with a raw 
> ipmitool command.
> 
> While testing stonith, I found that node1-stonith does not seem to be 
> working at all, and I also found some strange log entries on another node, 
> which is expected to kill node1:
> 
> Oct  9 04:39:05 node1 stonith-ng: [3705]: info: stonith_fence: Exec 
> <stonith_command t="stonith-ng" 
> st_async_id="4ca92d0e-9a2a-4fdd-8968-c91eb89e8cbe" st_op="st_fence" 
> st_callid="0" st_callopt="0" 
> st_remote_op="4ca92d0e-9a2a-4fdd-8968-c91eb89e8cbe" st_target="node2" 
> st_device_action="reboot" st_timeout="54000" src="node3" seq="12" />
> Oct  9 04:39:05 node1 stonith-ng: [3705]: info: can_fence_host_with_device: 
> node2-stonith can fence node2: static-list
> Oct  9 04:39:05 node1 stonith-ng: [3705]: info: stonith_fence: Found 1 
> matching devices for 'node2'
> Oct  9 04:39:05 node1 stonith-ng: [3705]: info: stonith_command: Processed 
> st_fence from node3: rc=-1
> Oct  9 04:39:05 node1 stonith-ng: [3705]: info: make_args: reboot-ing node 
> 'node2' as 'port=node2'
> Oct  9 04:39:05 node1 crmd: [3710]: info: send_direct_ack: ACK'ing resource 
> op drbd_hadoop:1_notify_0 from 77:4:0:ee8de687-92c9-4123-8efb-befd45814a3b: 
> lrm_invoke-lrmd-1381264745-30
> Oct  9 04:39:05 node1 crmd: [3710]: info: process_lrm_event: LRM operation 
> drbd_hadoop:1_notify_0 (call=20, rc=0, cib-update=0, confirmed=true) ok
> Oct  9 04:39:05 node1 stonith-ng: [3705]: ERROR: log_operation: Operation 
> 'reboot' [22346] (call 0 from (null)) for host 'node2' with device 
> 'node2-stonith' returned: -2
> Oct  9 04:39:05 node1 stonith-ng: [3705]: ERROR: log_operation: 
> node2-stonith: Rebooting machine @ IPMI:192.168.170.1...Failed
> 
> The log shows that stonith failed with return value -2. What does this 
> mean? Is there a configuration issue?
> Thanks.
> 
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
