Hi,

On Mon, Oct 22, 2012 at 11:06:07AM +0200, Robbert Muller wrote:
> Hello,
>
> While testing a new cluster we found the following behavior, which I
> discussed on #linux-ha with "andreask" afterwards, and we both agree the
> behavior was wrong.
>
> Bug scenario:
> 3-node cluster: 1 standby node just to have 3 nodes, and 2 active nodes.
> When we powered off a machine (similar to pulling the power
> cable from a machine), the cluster failed to fail over to the next node.
>
> This is because of the following setting:
> RESETPOWERON was set to 0, so a machine that is powered off stays powered off

Just to make sure: RESETPOWERON was set to 0 in the configuration?

> With the current code path, a machine in the poweroff state is
> considered a failure for the stonith reset operation, which results in
> no resources being started on the second node, and the machine stays in an
> unclean state.
>
> The analogy with real hardware and a powerbar, and imho the correct behavior:
> ---
> If I pull the plug of node1, node2 will fence it with the powerbar. The
> powerbar will power-cycle the socket without any result, because I pulled
> the plug. But the fencing operation is a success and all resources are
> started on the second node.
> ---
>
> A patch to fix this, with I hope a minimal change, is attached.

Thanks for the patch. But we'll need to rework it a bit.

> After finding this bug I got ill and have to stay at home for a few
> days, so I don't have access to an environment to test this patch atm.

Get better soon!

Cheers,

Dejan

> Regards
>
> Robbert Müller
>
> diff -r 66f7442698e6 lib/plugins/stonith/external/vcenter
> --- a/lib/plugins/stonith/external/vcenter Mon Oct 15 15:59:57 2012 +0200
> +++ b/lib/plugins/stonith/external/vcenter Mon Oct 22 10:38:09 2012 +0200
> @@ -199,6 +199,8 @@
>      if ($powerState eq "poweredOff" && (! exists $ENV{'RESETPOWERON'} || $ENV{'RESETPOWERON'} ne 0)) {
>          $vm->PowerOnVM();
>          system("ha_log.sh", "info", "Machine $esx:$vm->{'name'} has been powered on");
> +    } elsif( $powerState eq "poweredOff" ) {
> +        system("ha_log.sh", "info", "Machine $esx:$vm->{'name'} is poweredoff and RESETPOWERON was disabled");
>      } else {
>          dielog("Could not complete $esx:$vm->{'name'} power cycle");
>      }
> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/