Yes, fencing must be working, otherwise HA does not work. So to cover a power supply failure, do we need servers with redundant power supplies?
----- Original Message -----
From: "René Koch" <[email protected]>
To: [email protected], "Gianluca Cecchi" <[email protected]>
Cc: "users" <[email protected]>
Sent: Tuesday, 16 April 2013 13:31:48
Subject: RE: [Users] High Availability

-----Original message-----
> From: [email protected] <[email protected]>
> Sent: Tuesday 16th April 2013 14:03
> To: Gianluca Cecchi <[email protected]>
> Cc: René Koch <[email protected]>; users <[email protected]>
> Subject: Re: [Users] High Availability
>
> Well, we also disconnected the iLO NIC cable. We did another test and
> disconnected all the NIC cables except the iLO NIC cable, and voilà, HA took
> about 3 minutes to migrate the VM to the other host. We also noticed that the
> manager rebooted the failed host. For a more realistic scenario we
> disconnected the power cable from the host, and after about 2 or 3 minutes the
> manager put the host in non-responsive state and the VM in unknown state. Is
> this the correct behavior?

Fencing means that the non-responsive host gets reset (powered off and on). If fencing isn't working (you disconnected the power cable, so iLO can't send a success message), the VMs won't get started on another host.

This may seem strange in your example, but let's have a look at the following scenario:

- You have 2 datacenters with 1 hypervisor in DC 1 and 1 hypervisor in DC 2; ovirt-engine is running in DC 1
- The connection between the DCs is lost
- Fencing isn't working
- The VM is running on the host in DC 2
- If the VM were started on the host in DC 1 without successful fencing, your VM disk would be corrupted (the hosts in DC 2 and DC 1 would both be writing to the same storage file)

Maybe there are better examples than this one (it would be interesting to know what your storage metro-cluster does in this split-brain situation), but I hope it's clear why fencing works the way it does and what could happen if it were less restrictive...

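To make the rule above concrete, here is a minimal sketch in Python. It is only an illustration of the decision described in this mail, not oVirt engine code; the names FenceResult and may_restart_vm are hypothetical.

```python
from enum import Enum


class FenceResult(Enum):
    """Outcome of a fencing attempt (hypothetical names, for illustration only)."""
    CONFIRMED = "confirmed"      # fence device reported the host off / rebooted
    UNCONFIRMED = "unconfirmed"  # no answer from the fence device (e.g. iLO has no power)


def may_restart_vm(host_responsive: bool,
                   fence: FenceResult,
                   manually_confirmed: bool = False) -> bool:
    """Return True only when it is safe to start the HA VM on another host."""
    if host_responsive:
        return False  # host is fine, nothing to recover
    # Safe only if we know the old host can no longer write to the shared disk.
    return fence is FenceResult.CONFIRMED or manually_confirmed


# NIC cables pulled, iLO still reachable: fencing succeeds, VM may be restarted.
print(may_restart_vm(False, FenceResult.CONFIRMED))
# Power cable pulled: iLO is dead too, fencing cannot be confirmed.
print(may_restart_vm(False, FenceResult.UNCONFIRMED))
# Admin confirms by hand that the host has been rebooted.
print(may_restart_vm(False, FenceResult.UNCONFIRMED, manually_confirmed=True))
```

The second case is the power-cable test: without a confirmed fence or a manual confirmation, the VM stays where it is, because starting it elsewhere could put two writers on the same disk.
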
Regards,
René

>
> Regards
> Jose
>
> ----- Original message -----
> From: "Gianluca Cecchi" <[email protected]>
> To: [email protected]
> Cc: "René Koch (ovido)" <[email protected]>, "users" <[email protected]>
> Sent: Tuesday, 16 April 2013 12:12:43
> Subject: Re: [Users] High Availability
>
> On Tue, Apr 16, 2013 at 12:56 PM, suporte wrote:
> > Hi,
> >
> > We have 2 Fujitsu servers and one iSCSI storage domain. The servers have
> > power management configured with ilo3.
> > We can live migrate a VM, and when rebooting the host of that VM it
> > migrates to the other host.
> >
> > To test high availability we disconnected all NIC cables of the VM
> > host. The VM did not migrate to the other host; we had to manually confirm
> > that the host had been rebooted, and then the migration happened.
> >
> > Is this the correct behavior? Do we have to manually confirm that the host
> > has been rebooted for HA to happen?
> >
> > Regards
> > Jose
>
> Hello,
> when you say "we disconnected all NIC cables" you mean "we
> disconnected all NIC cables but the ones connected to the iLO
> interface", correct?
> Because to know that one host has successfully fenced the problematic
> one, it has to send a get status message and see that it is off or
> that it has been successfully rebooted.
>
> For example, in RHCS, if you configure iLO as a fencing device it
> remains indefinitely in a state similar to
>
> wait for fence to complete
>
> if the "fencer" is not able to get an acknowledgement of the operation
> or to reach the other node's iLO.
> Probably you can find something in your logs...
>
> Gianluca
>
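Gianluca's point about the "get status" check can be sketched as a polling loop. This is a rough Python illustration under assumptions, not the RHCS or oVirt implementation: wait_for_fence and get_power_status are hypothetical names, and the status callable is assumed to return "off" or "on", or None when the fence device does not answer.

```python
import time
from typing import Callable, Optional


def wait_for_fence(get_power_status: Callable[[], Optional[str]],
                   timeout_s: float = 180.0,
                   interval_s: float = 5.0,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll the fence device until it reports "off"; return False if never confirmed."""
    waited = 0.0
    while waited < timeout_s:
        status = get_power_status()  # None means the device did not answer
        if status == "off":
            return True              # fence acknowledged
        sleep(interval_s)
        waited += interval_s
    return False                     # no acknowledgement within the timeout


# Example with a fake device that answers "on", then nothing, then "off".
answers = iter(["on", None, "off"])
print(wait_for_fence(lambda: next(answers), sleep=lambda s: None))
```

A device that never answers (for instance an iLO whose host has lost power) keeps the loop waiting until the timeout, which matches the "wait for fence to complete" state described above.
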

