On 09/10/2013, at 1:53 PM, Xiaomin Zhang <zhangxiao...@gmail.com> wrote:
> I think I know why this happened, after enabling 'verbose' for fence_ipmilan.
> When I first configured stonith, I set lanplus to true. However, my machine
> is not an HP one, so lanplus is not supported. When I noticed this, I used
> 'crm configure load update' to update the stonith resource and set lanplus
> to false, and pacemaker seemed to accept this. I think this means stonith-ng
> should just use the new ipmitool command line from then on.
> However, the strange behavior is that this configuration never took
> effect, even after I restarted the pacemaker service several times.

That's quite odd, I've never heard of that before.

> What finally resolved this was deleting all configured resources
> one by one and then configuring everything again.
> P.S. The pacemaker version is pacemaker-cli-1.1.6-3.el6.x86_64, with
> fence-agents-3.1.5-10.el6.x86_64.
> Is it a resolved bug in newer versions?

Highly likely.

> Thanks.
>
> On Wed, Oct 9, 2013 at 5:09 AM, Xiaomin Zhang <zhangxiao...@gmail.com> wrote:
> Hi:
> I configured stonith on CentOS 6.2 with the fence_ipmilan agent:
>
> primitive node2-stonith stonith:fence_ipmilan \
>     params pcmk_host_list="node2" pcmk_host_check="static-list" \
>         ipaddr="192.168.170.1" login="root" passwd="123" lanplus="false" \
>         power_wait="1"
>
> The IP address for IPMI and the credentials are verified to be correct with
> the raw ipmitool command.
>
> While testing stonith, I found that node1-stonith does not seem to be
> working at all, and I also found some strange log entries on another node,
> which is expected to kill node1:
>
> Oct 9 04:39:05 node1 stonith-ng: [3705]: info: stonith_fence: Exec <stonith_command t="stonith-ng" st_async_id="4ca92d0e-9a2a-4fdd-8968-c91eb89e8cbe" st_op="st_fence" st_callid="0" st_callopt="0" st_remote_op="4ca92d0e-9a2a-4fdd-8968-c91eb89e8cbe" st_target="node2" st_device_action="reboot" st_timeout="54000" src="node3" seq="12" />
> Oct 9 04:39:05 node1 stonith-ng: [3705]: info: can_fence_host_with_device: node2-stonith can fence node2: static-list
> Oct 9 04:39:05 node1 stonith-ng: [3705]: info: stonith_fence: Found 1 matching devices for 'node2'
> Oct 9 04:39:05 node1 stonith-ng: [3705]: info: stonith_command: Processed st_fence from node3: rc=-1
> Oct 9 04:39:05 node1 stonith-ng: [3705]: info: make_args: reboot-ing node 'node2' as 'port=node2'
> Oct 9 04:39:05 node1 crmd: [3710]: info: send_direct_ack: ACK'ing resource op drbd_hadoop:1_notify_0 from 77:4:0:ee8de687-92c9-4123-8efb-befd45814a3b: lrm_invoke-lrmd-1381264745-30
> Oct 9 04:39:05 node1 crmd: [3710]: info: process_lrm_event: LRM operation drbd_hadoop:1_notify_0 (call=20, rc=0, cib-update=0, confirmed=true) ok
> Oct 9 04:39:05 node1 stonith-ng: [3705]: ERROR: log_operation: Operation 'reboot' [22346] (call 0 from (null)) for host 'node2' with device 'node2-stonith' returned: -2
> Oct 9 04:39:05 node1 stonith-ng: [3705]: ERROR: log_operation: node2-stonith: Rebooting machine @ IPMI:192.168.170.1...Failed
>
> The log shows that stonith failed with return value -2. What does
> this mean? Is there a configuration issue?
> Thanks.
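
For anyone hitting the same symptom: since the root cause reported in this thread was a stale lanplus setting, it may help to write down the two ipmitool invocations you would expect for lanplus="true" and lanplus="false" and compare them with what the verbose fence_ipmilan output actually shows. The sketch below is only that, a sketch: ipmi_status_cmd is a hypothetical helper name, the address and credentials are the ones from the configuration quoted above, and the exact command line the agent builds may differ (it may add further options).

```shell
# Hypothetical helper (not part of pacemaker or fence-agents): print the
# ipmitool command line one would expect for a given lanplus setting.
# Assumption: lanplus="true" maps to the "lanplus" interface and anything
# else maps to "lan". Address and credentials are taken from the primitive
# quoted in this thread.
ipmi_status_cmd() {
    lanplus="$1"
    if [ "$lanplus" = "true" ]; then
        iface="lanplus"
    else
        iface="lan"
    fi
    printf 'ipmitool -I %s -H %s -U %s -P %s chassis power status\n' \
        "$iface" "192.168.170.1" "root" "123"
}

# Print both variants for comparison with the verbose agent output.
ipmi_status_cmd false
ipmi_status_cmd true
```

If the command seen in the verbose output still uses the lanplus interface after the configuration was changed to lanplus="false", that would match the behaviour described at the top of this thread, where the updated parameter never took effect.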
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org