On 2 Nov 2010 10:59, Dejan Muhamedagic wrote:
>
>> Then, I said 'kill -9 <corosync_pid>' on node2, and stonith on node1
>> really initiated a REBOOT of node2!
>>
>> BUT in /var/log/messages of node1, stonith-ng thinks that the operation
>> failed:
>>
>> Oct 29 16:06:55 node1 stonith-ng: [31449]: WARN: parse_host_line: Could
>> not parse (0 2): ** (process:12139): DEBUG: rcd_serial_set_config:called
>> Oct 29 16:06:55 node1 stonith-ng: [31449]: WARN: parse_host_line: Could
>> not parse (3 19): (process:12139): DEBUG: rcd_serial_set_config:called
>> Oct 29 16:06:55 node1 stonith-ng: [31449]: WARN: parse_host_line: Could
>> not parse (0 0):
>> Oct 29 16:06:55 node1 stonith-ng: [31449]: WARN: parse_host_line: Could
>> not parse (0 2): ** (process:12141): DEBUG: rcd_serial_set_config:called
>> Oct 29 16:06:55 node1 stonith-ng: [31449]: WARN: parse_host_line: Could
>> not parse (3 19): (process:12141): DEBUG: rcd_serial_set_config:called
>> Oct 29 16:06:55 node1 stonith-ng: [31449]: WARN: parse_host_line: Could
>> not parse (0 0):
>> Oct 29 16:06:55 node1 pengine: [31454]: WARN: process_pe_message:
>> Transition 29: WARNINGs found during PE processing. PEngine Input stored
>> in: /var/lib/pengine/pe-warn-10.bz2
>> Oct 29 16:06:55 node1 stonith: rcd_serial device not accessible.
>>
> Can't recall seeing this in the logs you posted earlier. This
> seems to be a genuine error, perhaps due to some particular
> circumstances.
>
>
>> Oct 29 16:06:55 node1 stonith-ng: [31449]: notice: log_operation:
>> Operation 'monitor' [12143] for device 'stonith2' returned: 1
>> Oct 29 16:06:55 node1 crmd: [31455]: WARN: status_from_rc: Action 118
>> (stonith2_monitor_60000) on node1 failed (target: 0 vs. rc: 1): Error
>> Oct 29 16:06:55 node1 crmd: [31455]: WARN: update_failcount: Updating
>> failcount for stonith2 on node1 after failed monitor: rc=1
>> (update=value++, time=1288361215)
>> Oct 29 16:06:57 node1 kernel: [23312.814010] r8169 0000:02:00.0: eth0:
>> link down
>> Oct 29 16:06:57 node1 stonith-ng: [31449]: ERROR: log_operation:
>> Operation 'reboot' [12142] for host 'node2' with device 'stonith1'
>> returned: 1 (call 0 from (null))
>>
> When you ran -T reset on the command line, did you pay attention
> to the exit code returned by the command? Did it exit with code 1
> or 0? To me it looked like it exited with 0, but can you please
> check that (echo $?). Please also check that the "stonith ... -S"
> exits with 0.
>
> If both exit with 0 and report the right thing, then please run
> again the test with the cluster. Make sure first that the monitor
> operation on the stonith resources succeeds. Then try the fencing
> operation. If either of them fails again, then please open a bug
> report and include hb_report.
>
> Thanks,
>
> Dejan
>

Hi,
here is what you requested:

TEST 1:

stonith -t rcd_serial -p "test /dev/ttyS0 rts 2000" test
** (process:2928): DEBUG: rcd_serial_set_config:called
Alarm clock
# echo $?
142

TEST 2:

stonith -t rcd_serial hostlist="node2" ttydev="/dev/ttyS0" dtr_rts="rts" msduration="2000" -S
** (process:6851): DEBUG: rcd_serial_set_config:called
stonith: rcd_serial device OK.
# echo $?
0

TEST 3:

stonith -t rcd_serial hostlist="node2" ttydev="/dev/ttyS0" dtr_rts="rts" msduration="2000" -T reset node2
** (process:8142): DEBUG: rcd_serial_set_config:called
Alarm clock
# echo $?
142

TEST 1 as well as TEST 2 caused a reboot of node2!

Thanks,
Eberhard
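
A note on the exit status 142 from TEST 1 and TEST 3: a shell reports a status above 128 when the command was terminated by a signal, and the signal number is the status minus 128. Here 142 - 128 = 14, which is SIGALRM on Linux, and that matches the "Alarm clock" message printed before the prompt. So these runs did not exit with 0 or 1 at all; they were killed by an alarm signal. Below is a minimal Python sketch of that decoding. The helper name describe_exit_status is made up for illustration and is not part of stonith or Pacemaker, and the signal numbering assumes Linux.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status (the value of $?).

    Assumption: the usual shell convention that a status above 128
    means "terminated by signal (status - 128)".
    """
    if status > 128:
        signum = status - 128
        try:
            # Map the number to its symbolic name, e.g. 14 -> SIGALRM on Linux.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"unknown signal {signum}"
        return f"killed by {name}"
    return f"exited with code {status}"


if __name__ == "__main__":
    # The statuses reported in TEST 1/3 and TEST 2 above.
    for status in (142, 0):
        print(status, "->", describe_exit_status(status))
```

Read this way, the "returned: 1" that stonith-ng logged for the reboot operation could well be the cluster's view of the same abnormal termination, though that is only a guess from the outside.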