Hi,

On Mon, Oct 22, 2012 at 11:06:07AM +0200, Robbert Muller wrote:
> Hello,
> 
> While testing a new cluster we found the following behavior, which I
> discussed on #linux-ha with "andreask" afterwards. We both agree the
> behavior is wrong.
> 
> Bug scenario:
> A 3-node cluster: 2 active nodes plus 1 standby node that is only there
> to bring the node count to 3. When we powered off one machine (similar
> to pulling its power cable), the cluster failed to fail over to the
> next node.
> 
> This is because of the following setting:
> RESETPOWERON was set to 0, so a machine that is powered off stays
> powered off.

Just to make sure: RESETPOWERON was set to 0 in the configuration?

> With the current code path, a machine in the powered-off state is
> treated as a failure of the stonith reset operation. As a result, no
> resources are started on the second node, and the machine stays in an
> unclean state.
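
To make the case concrete, the branch in question can be modelled as a
small standalone Perl function. This is only a sketch: reset_action and
its return strings are hypothetical names made up for illustration, they
are not part of external/vcenter, and the model covers only the branch
the patch touches. It assumes the behavior proposed in the patch, where a
powered-off VM with RESETPOWERON=0 is left off without hitting the
failure branch. With the current code, that same case ends up in the
failure branch instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in external/vcenter): decide what the reset
# path should do for a VM observed in $power_state.
# $reset_power_on mirrors the RESETPOWERON setting; undef means "not set".
sub reset_action {
    my ($power_state, $reset_power_on) = @_;

    # Powering on is wanted unless RESETPOWERON is explicitly set to 0.
    my $want_power_on = !defined($reset_power_on) || $reset_power_on ne '0';

    if ($power_state eq 'poweredOff') {
        # Proposed behavior: a VM that is already off and must stay off
        # is logged and left alone instead of being reported as a failure.
        return $want_power_on ? 'power_on' : 'leave_off';
    }

    # Any other state at this point means the power cycle did not complete.
    return 'fail';
}

# Walk through the three cases discussed in this thread.
for my $case (['poweredOff', undef], ['poweredOff', '0'], ['poweredOn', '0']) {
    my ($state, $rpo) = @$case;
    printf "%-10s RESETPOWERON=%-5s -> %s\n",
        $state, $rpo // 'unset', reset_action($state, $rpo);
}
```

The point of keeping the decision in one function is that the "already
off, must stay off" case becomes an explicit outcome rather than falling
through to the generic failure branch.
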
> 
> The analogy with real hardware and a powerbar, which IMHO shows the
> correct behavior:
> ---
> If I pull the plug of node1, node2 will fence it with the powerbar. The
> powerbar will power-cycle the socket without any effect, because I
> pulled the plug. But the fencing operation is a success and all
> resources are started on the second node.
> ---
> 
> A patch to fix this, with what I hope is a minimal change, is attached.

Thanks for the patch. But we'll need to rework it a bit.

> After finding this bug I got ill and have to stay at home for a few
> days, so I don't have access to an environment to test this patch atm.

Get better soon!

Cheers,

Dejan

> Regards
> 
> Robbert Müller
> 

> diff -r 66f7442698e6 lib/plugins/stonith/external/vcenter
> --- a/lib/plugins/stonith/external/vcenter    Mon Oct 15 15:59:57 2012 +0200
> +++ b/lib/plugins/stonith/external/vcenter    Mon Oct 22 10:38:09 2012 +0200
> @@ -199,6 +199,8 @@
>                                               if ($powerState eq "poweredOff" && (! exists $ENV{'RESETPOWERON'} || $ENV{'RESETPOWERON'} ne 0)) {
>                                                       $vm->PowerOnVM();
>                                                       system("ha_log.sh", "info", "Machine $esx:$vm->{'name'} has been powered on");
> +                                             } elsif( $powerState eq "poweredOff" ) {
> +                                                     system("ha_log.sh", "info", "Machine $esx:$vm->{'name'} is poweredoff and RESETPOWERON was disabled");
>                                               } else {
>                                                       dielog("Could not complete $esx:$vm->{'name'} power cycle");
>                                               }

> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/
