Dear list,

I am trying to set up a simple 3-node cluster with a SplitBrainDetector partition (which temporarily resides on a USB stick).

ietd.conf (iscsitarget) looks like this (/dev/sda is the USB stick):

Target iqn.2009-09.unibe.ch:myhost.unibe.ch
        Lun 1 Path=/dev/sda,Type=fileio,ScsiId=WeissDochNich

On "ctdb-1", which is an iSCSI initiator, I did the following:

ctdb-1:~# sbd -v -d /dev/SplitBrainDetector/sbd create
<no output>
ctdb-1:~# sbd -v -d /dev/SplitBrainDetector/sbd list
<no output>
ctdb-1:~# sbd -v -d /dev/SplitBrainDetector/sbd dump
Header version     : 2
Number of slots    : 255
Sector size        : 512
Timeout (watchdog) : 5
Timeout (allocate) : 2
Timeout (loop)     : 1
Timeout (msgwait)  : 10

But I am not able to use the command below:

ctdb-1:~# sbd -v -d /dev/SplitBrainDetector/sbd message ctdb-1 test

No error message appears; sbd just prints its usage syntax again.
I followed the instructions on http://www.linux-ha.org/SBD_Fencing

What am I doing wrong?

The iSCSI target is hosted on Debian GNU/Linux 5.0.7 (lenny):

# ietd -v
iscsid version 0.4.16

The iSCSI initiator nodes are running Ubuntu 10.10.

The consequence of the above seems to be:

ctdb-1:~# crm configure
crm(live)configure# primitive STONED stonith:external/sbd params sbd_device=/dev/SplitBrainDetector/sbd
WARNING: STONED: default timeout 20s for start is smaller than the advised 60
crm(live)configure# end
There are changes pending. Do you want to commit them? y
WARNING: CIB changed in the meantime: won't touch it!
WARNING: STONED: default timeout 20s for start is smaller than the advised 60
Do you still want to commit? y
crm(live)#

ctdb-1:~$ sudo crm configure show
node ctdb-1
node ctdb-2
node ctdb-3 \
        attributes standby="off"
primitive STONED stonith:external/sbd \
        params sbd_device="/dev/SplitBrainDetector/sbd"
property $id="cib-bootstrap-options" \
        dc-version="1.0.9-unknown" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="3" \
        stonith-enabled="true" \
        stonith-timeout="30s"

Now a bit of troubleshooting:

ctdb-1:~$ sudo crm_verify -LV
crm_verify[3623]: 2011/01/12_16:15:31 ERROR: unpack_rsc_op: Hard error - STONED_start_0 failed with rc=2: Preventing STONED from re-starting on ctdb-1
crm_verify[3623]: 2011/01/12_16:15:31 WARN: unpack_rsc_op: Processing failed op STONED_start_0 on ctdb-1: invalid parameter (2)
crm_verify[3623]: 2011/01/12_16:15:31 ERROR: unpack_rsc_op: Hard error - STONED_start_0 failed with rc=2: Preventing STONED from re-starting on ctdb-2
crm_verify[3623]: 2011/01/12_16:15:31 WARN: unpack_rsc_op: Processing failed op STONED_start_0 on ctdb-2: invalid parameter (2)
crm_verify[3623]: 2011/01/12_16:15:31 WARN: common_apply_stickiness: Forcing STONED away from ctdb-1 after 1000000 failures (max=1000000)
crm_verify[3623]: 2011/01/12_16:15:31 WARN: common_apply_stickiness: Forcing STONED away from ctdb-2 after 1000000 failures (max=1000000)
crm_verify[3623]: 2011/01/12_16:15:31 WARN: stage6: Scheduling Node ctdb-3 for STONITH
Warnings found during check: config may not be valid

ctdb-1:~$ sudo crm_mon --one-shot --operations
============
Last updated: Wed Jan 12 16:25:27 2011
Stack: openais
Current DC: ctdb-1 - partition with quorum
Version: 1.0.9-unknown
3 Nodes configured, 3 expected votes
1 Resources configured.
============

Node ctdb-3: UNCLEAN (offline)
Online: [ ctdb-1 ctdb-2 ]

STONED  (stonith:external/sbd): Started ctdb-2 FAILED

Operations:
* Node ctdb-1:
   STONED: migration-threshold=1000000 fail-count=1000000
    + (3) start: rc=2 (invalid parameter)
    + (4) stop: rc=0 (ok)
* Node ctdb-2:
   STONED: migration-threshold=1000000 fail-count=1000000
    + (3) start: rc=2 (invalid parameter)

Failed actions:
    STONED_start_0 (node=ctdb-1, call=3, rc=2, status=complete): invalid parameter
    STONED_start_0 (node=ctdb-2, call=3, rc=2, status=complete): invalid parameter

ctdb-1:~$ sudo ptest --live-check -VVV
ptest[3902]: 2011/01/12_16:26:13 ERROR: unpack_rsc_op: Hard error - STONED_start_0 failed with rc=2: Preventing STONED from re-starting on ctdb-1
ptest[3902]: 2011/01/12_16:26:13 WARN: unpack_rsc_op: Processing failed op STONED_start_0 on ctdb-1: invalid parameter (2)
ptest[3902]: 2011/01/12_16:26:13 ERROR: unpack_rsc_op: Hard error - STONED_start_0 failed with rc=2: Preventing STONED from re-starting on ctdb-2
ptest[3902]: 2011/01/12_16:26:13 WARN: unpack_rsc_op: Processing failed op STONED_start_0 on ctdb-2: invalid parameter (2)
ptest[3902]: 2011/01/12_16:26:13 notice: native_print: STONED (stonith:external/sbd): Started ctdb-2 FAILED
ptest[3902]: 2011/01/12_16:26:13 WARN: common_apply_stickiness: Forcing STONED away from ctdb-1 after 1000000 failures (max=1000000)
ptest[3902]: 2011/01/12_16:26:13 WARN: common_apply_stickiness: Forcing STONED away from ctdb-2 after 1000000 failures (max=1000000)
ptest[3902]: 2011/01/12_16:26:13 WARN: stage6: Scheduling Node ctdb-3 for STONITH
ptest[3902]: 2011/01/12_16:26:13 notice: LogActions: Stop resource STONED (ctdb-2)

ctdb-1:~# ocf-tester -n STONED /usr/lib/stonith/plugins/external/sbd
Beginning tests for /usr/lib/stonith/plugins/external/sbd...
* rc=1: Your agent has too restrictive permissions: should be 755
-:1: parser error : Document is empty

^
-:1: parser error : Start tag expected, '<' not found

^
I/O error : Invalid seek
* rc=1: Your agent produces meta-data which does not conform to ra-api-1.dtd
* rc=1: The meta-data action cannot fail and must return 0
* rc=1: Validation failed.  Did you supply enough options with -o ?
Aborting tests

What am I doing wrong?

Kind regards,
---
Janosh
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
