Hi,

On Mon, May 10, 2010 at 09:21:03AM +0200, Nicola Sabatelli wrote:
> Hi,
>
> I have solved my problem.
>
> I found a small problem in the script
> '/usr/lib64/stonith/plugins/external/sbd' when it retrieves the host list.
>
> I replaced these lines:
>
> nodes=$(
>     if is_heartbeat; then
>         crm_node -H -p
>     else
>         crm_node -p
>     fi)
>
> with these:
>
> if is_heartbeat; then
>     nodes=$(crm_node -H -p)
> else
>     nodes=$(crm_node -p)
> fi
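For anyone who wants to exercise the corrected host-list logic outside the plugin, here is a minimal, self-contained sketch. The is_heartbeat stub below is illustrative only (the real external/sbd plugin ships its own stack detection); the loop just prints whatever crm_node reports:

#!/bin/sh
# Sketch only: exercises the corrected host-list retrieval from the excerpt
# above. The is_heartbeat stub is an assumption for illustration; the real
# external/sbd plugin has its own helper.
is_heartbeat() {
    [ -f /etc/ha.d/ha.cf ]    # hypothetical check for a Heartbeat stack
}

if is_heartbeat; then
    nodes=$(crm_node -H -p)   # member list via the Heartbeat stack
else
    nodes=$(crm_node -p)      # member list via the default stack
fi

# If $nodes is empty here, you get exactly the "hostlist is empty"
# condition that stonithd complains about further down in this thread.
for node in $nodes; do
    echo "$node"
done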
Fixed now.

Cheers,

Dejan

> and now the resource 'external/sbd' works very well.
>
> Best regards,
> Nicola.
>
> _____
>
> From: Michael Brown [mailto:mich...@netdirect.ca]
> Sent: Thursday, 29 April 2010 16:53
> To: n.sabate...@ct.rupar.puglia.it
> Subject: Re: R: [Pacemaker] R: R: Stonith external/sbd problem
>
> Hrm, my limited knowledge is exhausted. Good luck!
>
> M.
>
> _____
>
> From: Nicola Sabatelli
> To: 'Michael Brown'
> Sent: Thu Apr 29 10:36:15 2010
> Subject: R: [Pacemaker] R: R: Stonith external/sbd problem
>
> The response to the query
>
> /usr/sbin/sbd -d /dev/mapper/mpath1p1 list
>
> is
>
> 0 clover-a.rsr.rupar.puglia.it clear
> 1 clover-h.rsr.rupar.puglia.it clear
>
> Ciao, Nicola.
>
> _____
>
> From: Michael Brown [mailto:mich...@netdirect.ca]
> Sent: Thursday, 29 April 2010 16:33
> To: The Pacemaker cluster resource manager
> Cc: Nicola Sabatelli
> Subject: Re: [Pacemaker] R: R: Stonith external/sbd problem
>
> FWIW, here's my setup for sbd on shared storage:
>
> in /etc/init.d/boot.local:
> sbd -d /dev/disk/by-id/dm-uuid-part2-mpath-3600a0b8000266f7e000035414bd00428 -D -W watch
>
> xenhost1:~ # sbd -d /dev/disk/by-id/dm-uuid-part2-mpath-3600a0b8000266f7e000035414bd00428 list
> 0 xenhost1 clear
> 1 xenhost2 clear
>
> excerpt from 'crm configure show':
> primitive sbd stonith:external/sbd \
>     operations $id="sbd-operations" \
>     op monitor interval="15" timeout="15" start-delay="15" \
>     params sbd_device="/dev/disk/by-id/dm-uuid-part2-mpath-3600a0b8000266f7e000035414bd00428"
> clone sbd-clone sbd \
>     meta interleave="true"
>
> What do you see if you run '/usr/sbin/sbd -d /dev/mapper/mpath1p1 list'?
>
> M.
>
> On 04/29/2010 10:23 AM, Nicola Sabatelli wrote:
> Yes, I created the disk and allocated the node, and I created a resource on the cluster in this way:
>
> <clone id="cl_external_sbd_1">
>   <meta_attributes id="cl_external_sbd_1-meta_attributes">
>     <nvpair id="cl_external_sbd_1-meta_attributes-clone-max" name="clone-max" value="2"/>
>   </meta_attributes>
>   <primitive class="stonith" type="external/sbd" id="stonith_external_sbd_LOCK_LUN">
>     <instance_attributes id="stonith_external_sbd_LOCK_LUN-instance_attributes">
>       <nvpair id="nvpair-stonith_external_sbd_LOCK_LUN-sbd_device" name="sbd_device" value="/dev/mapper/mpath1p1"/>
>     </instance_attributes>
>     <operations id="stonith_external_sbd_LOCK_LUN-operations">
>       <op id="op-stonith_external_sbd_LOCK_LUN-stop" interval="0" name="stop" timeout="60"/>
>       <op id="op-stonith_external_sbd_LOCK_LUN-monitor" interval="60" name="monitor" start-delay="0" timeout="60"/>
>       <op id="op-stonith_external_sbd_LOCK_LUN-start" interval="0" name="start" timeout="60"/>
>     </operations>
>     <meta_attributes id="stonith_external_sbd_LOCK_LUN-meta_attributes">
>       <nvpair name="target-role" id="stonith_external_sbd_LOCK_LUN-meta_attributes-target-role" value="stopped"/>
>     </meta_attributes>
>   </primitive>
> </clone>
>
> Ciao, Nicola.
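For readers more used to the crm shell, the XML clone above corresponds roughly to the following untested sketch; the resource and clone names and all values are taken from the XML, nothing here is a copy of either cluster's actual configuration:

primitive stonith_external_sbd_LOCK_LUN stonith:external/sbd \
    params sbd_device="/dev/mapper/mpath1p1" \
    op start interval="0" timeout="60" \
    op monitor interval="60" timeout="60" start-delay="0" \
    op stop interval="0" timeout="60" \
    meta target-role="stopped"
clone cl_external_sbd_1 stonith_external_sbd_LOCK_LUN \
    meta clone-max="2"

Note that target-role is "stopped" in the XML, so even a correctly configured clone stays down until that meta attribute is set to "Started" or removed.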
>
> _____
>
> From: Vit Pelcak [mailto:vpel...@suse.cz]
> Sent: Thursday, 29 April 2010 16:08
> To: pacemaker@oss.clusterlabs.org
> Subject: Re: [Pacemaker] R: Stonith external/sbd problem
>
> Also, you need to add the stonith resource to the CIB:
>
> crm configure primitive sbd_stonith stonith:external/sbd \
>     meta target-role="Started" \
>     op monitor interval="15" timeout="15" start-delay="15" \
>     params sbd_device="/dev/sda1"
>
> On 29.4.2010 15:46, Nicola Sabatelli wrote:
> I have done exactly the configuration from the SBD_Fencing documentation,
> that is, in /etc/sysconfig/sbd:
>
> SBD_DEVICE="/dev/mapper/mpath1p1"
> SBD_OPTS="-W"
>
> And I start the daemon in this manner:
>
> /usr/sbin/sbd -d /dev/mapper/mpath1p1 -D -W watch
>
> Is that correct?
>
> Ciao, Nicola.
>
> _____
>
> From: Vit Pelcak [mailto:vpel...@suse.cz]
> Sent: Thursday, 29 April 2010 15:02
> To: pacemaker@oss.clusterlabs.org
> Subject: Re: [Pacemaker] Stonith external/sbd problem
>
> cat /etc/sysconfig/sbd
>
> SBD_DEVICE="/dev/sda1"
> SBD_OPTS="-W"
>
> sbd -d /dev/shared_disk create
> sbd -d /dev/shared_disk allocate your_machine
>
> On 29.4.2010 14:55, Michael Brown wrote:
> Oh, I forgot a piece: I had similar trouble until I actually started sbd properly, and then it worked.
>
> M.
>
> _____
>
> From: Michael Brown
> To: pacemaker@oss.clusterlabs.org
> Sent: Thu Apr 29 08:53:32 2010
> Subject: Re: [Pacemaker] Stonith external/sbd problem
>
> I just set this up myself and it worked fine for me.
>
> Did you follow the guide? You need to configure the sbd daemon to run on boot with appropriate options before external/sbd can use it.
>
> M.
>
> _____
>
> From: Nicola Sabatelli
> To: pacemaker@oss.clusterlabs.org
> Sent: Thu Apr 29 08:47:04 2010
> Subject: [Pacemaker] Stonith external/sbd problem
>
> I have a problem with the STONITH plugin external/sbd.
>
> I have configured the system according to the directions I found at
> http://www.linux-ha.org/wiki/SBD_Fencing, and the device I use is set up
> with the multipath software because the disk resides on a storage system.
>
> I have created a resource on my cluster using the clone directive.
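Putting the pieces from this thread together, the bring-up order would look roughly like this; it is a sketch based on the SBD_Fencing guide, using the device path and node names from this thread, not a literal transcript of either cluster:

# 1. Initialise the shared SBD partition once, from one node only.
sbd -d /dev/mapper/mpath1p1 create

# 2. Allocate a slot for each cluster node.
sbd -d /dev/mapper/mpath1p1 allocate clover-a.rsr.rupar.puglia.it
sbd -d /dev/mapper/mpath1p1 allocate clover-h.rsr.rupar.puglia.it

# 3. Verify the slots; both nodes should show up as "clear".
sbd -d /dev/mapper/mpath1p1 list

# 4. Start the watcher on every node (e.g. from boot.local), before the
#    cluster tries to start the stonith:external/sbd resource.
sbd -d /dev/mapper/mpath1p1 -D -W watch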
>
> But when I try to start the resource I get these errors:
>
> From the ha-log file:
>
> Apr 29 14:37:51 clover-h stonithd: [16811]: info: external_run_cmd: Calling '/usr/lib64/stonith/plugins/external/sbd status' returned 256
> Apr 29 14:37:51 clover-h stonithd: [16811]: CRIT: external_status: 'sbd status' failed with rc 256
> Apr 29 14:37:51 clover-h stonithd: [10615]: WARN: start stonith_external_sbd_LOCK_LUN:0 failed, because its hostlist is empty
>
> From crm_verify:
>
> crm_verify[18607]: 2010/04/29_14:39:27 info: main: =#=#=#=#= Getting XML =#=#=#=#=
> crm_verify[18607]: 2010/04/29_14:39:27 info: main: Reading XML from: live cluster
> crm_verify[18607]: 2010/04/29_14:39:27 notice: unpack_config: On loss of CCM Quorum: Ignore
> crm_verify[18607]: 2010/04/29_14:39:27 info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
> crm_verify[18607]: 2010/04/29_14:39:27 info: determine_online_status: Node clover-a.rsr.rupar.puglia.it is online
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: unpack_rsc_op: Processing failed op stonith_external_sbd_LOCK_LUN:1_start_0 on clover-a.rsr.rupar.puglia.it: unknown error (1)
> crm_verify[18607]: 2010/04/29_14:39:27 info: find_clone: Internally renamed stonith_external_sbd_LOCK_LUN:0 on clover-a.rsr.rupar.puglia.it to stonith_external_sbd_LOCK_LUN:2 (ORPHAN)
> crm_verify[18607]: 2010/04/29_14:39:27 info: determine_online_status: Node clover-h.rsr.rupar.puglia.it is online
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: unpack_rsc_op: Processing failed op stonith_external_sbd_LOCK_LUN:0_start_0 on clover-h.rsr.rupar.puglia.it: unknown error (1)
> crm_verify[18607]: 2010/04/29_14:39:27 notice: clone_print: Master/Slave Set: ms_drbd_1
> crm_verify[18607]: 2010/04/29_14:39:27 notice: short_print: Stopped: [ res_drbd_1:0 res_drbd_1:1 ]
> crm_verify[18607]: 2010/04/29_14:39:27 notice: native_print: res_Filesystem_TEST (ocf::heartbeat:Filesystem): Stopped
> crm_verify[18607]: 2010/04/29_14:39:27 notice: native_print: res_IPaddr2_ip_clover (ocf::heartbeat:IPaddr2): Stopped
> crm_verify[18607]: 2010/04/29_14:39:27 notice: clone_print: Clone Set: cl_external_sbd_1
> crm_verify[18607]: 2010/04/29_14:39:27 notice: native_print: stonith_external_sbd_LOCK_LUN:0 (stonith:external/sbd): Started clover-h.rsr.rupar.puglia.it FAILED
> crm_verify[18607]: 2010/04/29_14:39:27 notice: native_print: stonith_external_sbd_LOCK_LUN:1 (stonith:external/sbd): Started clover-a.rsr.rupar.puglia.it FAILED
> crm_verify[18607]: 2010/04/29_14:39:27 info: get_failcount: cl_external_sbd_1 has failed 1000000 times on clover-h.rsr.rupar.puglia.it
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: common_apply_stickiness: Forcing cl_external_sbd_1 away from clover-h.rsr.rupar.puglia.it after 1000000 failures (max=1000000)
> crm_verify[18607]: 2010/04/29_14:39:27 info: get_failcount: cl_external_sbd_1 has failed 1000000 times on clover-a.rsr.rupar.puglia.it
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: common_apply_stickiness: Forcing cl_external_sbd_1 away from clover-a.rsr.rupar.puglia.it after 1000000 failures (max=1000000)
> crm_verify[18607]: 2010/04/29_14:39:27 info: native_merge_weights: ms_drbd_1: Rolling back scores from res_Filesystem_TEST
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource res_drbd_1:0 cannot run anywhere
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource res_drbd_1:1 cannot run anywhere
> crm_verify[18607]: 2010/04/29_14:39:27 info: native_merge_weights: ms_drbd_1: Rolling back scores from res_Filesystem_TEST
> crm_verify[18607]: 2010/04/29_14:39:27 info: master_color: ms_drbd_1: Promoted 0 instances of a possible 1 to master
> crm_verify[18607]: 2010/04/29_14:39:27 info: master_color: ms_drbd_1: Promoted 0 instances of a possible 1 to master
> crm_verify[18607]: 2010/04/29_14:39:27 info: native_merge_weights: res_Filesystem_TEST: Rolling back scores from res_IPaddr2_ip_clover
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource res_Filesystem_TEST cannot run anywhere
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource res_IPaddr2_ip_clover cannot run anywhere
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource stonith_external_sbd_LOCK_LUN:0 cannot run anywhere
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource stonith_external_sbd_LOCK_LUN:1 cannot run anywhere
> crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Leave resource res_drbd_1:0 (Stopped)
> crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Leave resource res_drbd_1:1 (Stopped)
> crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Leave resource res_Filesystem_TEST (Stopped)
> crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Leave resource res_IPaddr2_ip_clover (Stopped)
> crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Stop resource stonith_external_sbd_LOCK_LUN:0 (clover-h.rsr.rupar.puglia.it)
> crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Stop resource stonith_external_sbd_LOCK_LUN:1 (clover-a.rsr.rupar.puglia.it)
> Warnings found during check: config may not be valid
>
> And from crm_mon:
>
> ============
> Last updated: Thu Apr 29 14:39:57 2010
> Stack: Heartbeat
> Current DC: clover-h.rsr.rupar.puglia.it (e39bb201-2a6f-457a-a308-be6bfe71309c) - partition with quorum
> Version: 1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7
> 2 Nodes configured, unknown expected votes
> 4 Resources configured.
> ============
>
> Online: [ clover-h.rsr.rupar.puglia.it clover-a.rsr.rupar.puglia.it ]
>
> Clone Set: cl_external_sbd_1
>     stonith_external_sbd_LOCK_LUN:0 (stonith:external/sbd): Started clover-h.rsr.rupar.puglia.it FAILED
>     stonith_external_sbd_LOCK_LUN:1 (stonith:external/sbd): Started clover-a.rsr.rupar.puglia.it FAILED
>
> Operations:
> * Node clover-a.rsr.rupar.puglia.it:
>     stonith_external_sbd_LOCK_LUN:1: migration-threshold=1000000 fail-count=1000000
>       + (24) start: rc=1 (unknown error)
> * Node clover-h.rsr.rupar.puglia.it:
>     stonith_external_sbd_LOCK_LUN:0: migration-threshold=1000000 fail-count=1000000
>       + (25) start: rc=1 (unknown error)
>
> Failed actions:
>     stonith_external_sbd_LOCK_LUN:1_start_0 (node=clover-a.rsr.rupar.puglia.it, call=24, rc=1, status=complete): unknown error
>     stonith_external_sbd_LOCK_LUN:0_start_0 (node=clover-h.rsr.rupar.puglia.it, call=25, rc=1, status=complete): unknown error
>
> Ciao, Nicola.
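The "returned 256" and "hostlist is empty" messages above can be checked by hand. External stonith plugins take their parameters as environment variables, so the following sketch (device path taken from this thread) runs the plugin roughly the same way stonithd does; gethosts is the standard action external plugins use to report their host list:

# Run the plugin directly, as stonithd does in the ha-log lines above.
export sbd_device=/dev/mapper/mpath1p1

/usr/lib64/stonith/plugins/external/sbd status
echo "status rc=$?"      # an exit code of 1 is what the log shows as "returned 256"

/usr/lib64/stonith/plugins/external/sbd gethosts
echo "gethosts rc=$?"    # no output here is the "hostlist is empty" case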
>
> --
> Michael Brown              | `One of the main causes of the fall of
> Systems Consultant         | the Roman Empire was that, lacking zero,
> Net Direct Inc.            | they had no way to indicate successful
> ☎: +1 519 883 1172 x5106   | termination of their C programs.' - Firth

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf