Yes, fencing must be working, otherwise HA does not work. So to cover a
power supply failure, do we need servers with redundant power supplies?

----- Original Message -----

From: "René Koch" <[email protected]> 
To: [email protected], "Gianluca Cecchi" <[email protected]> 
Cc: "users" <[email protected]> 
Sent: Tuesday, 16 April 2013 13:31:48
Subject: RE: [Users] High Availability 



-----Original message----- 
> From:[email protected] <[email protected]> 
> Sent: Tuesday 16th April 2013 14:03 
> To: Gianluca Cecchi <[email protected]> 
> Cc: René Koch <[email protected]>; users <[email protected]> 
> Subject: Re: [Users] High Availability 
> 
> Well, we also disconnected the iLO NIC cable. We did another test and
> disconnected all NIC cables except the iLO NIC cable, and voilà, HA took
> about 3 minutes to migrate the VM to the other host. We also noticed that the
> manager rebooted the failed host. For a more realistic scenario we
> disconnected the power cable from the host, and after about 2 or 3 minutes the
> manager put the host in the non-responsive state and the VM in the unknown
> state. Is this the correct behavior?


Fencing means that the non-responsive host gets reset (powered off and on).
If fencing isn't working (you disconnected the power cable, so iLO can't
send a success message), the VMs won't get started on another host.
In your example this may seem strange, but let's have a look at the following
scenario:
- You have 2 datacenters with 1 hypervisor in DC 1 and 1 hypervisor in DC 2;
ovirt-engine is running in DC 1
- The connection between the DCs is lost
- Fencing isn't working
- The VM is running on the host in DC 2
- If the VM were started on the host in DC 1 without successful fencing, your
VM disk would be corrupted (the hosts in DC 2 and DC 1 would both be writing
to the same storage file)
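
The rule behind this scenario can be sketched in a few lines of Python. This is
only an illustration of the reasoning, not ovirt-engine code: PowerStatus and
may_restart_vm are made-up names, and treating "off" as the only confirmed
state is a simplifying assumption.

```python
from enum import Enum


class PowerStatus(Enum):
    OFF = "off"          # power management confirmed the host is down
    ON = "on"            # host still powered, may still be writing to storage
    UNKNOWN = "unknown"  # fence device unreachable, e.g. it lost power too


def may_restart_vm(status: PowerStatus, manually_confirmed: bool = False) -> bool:
    """Return True only when it is safe to start the VM on another host."""
    if manually_confirmed:
        # An administrator has confirmed that the host was rebooted.
        return True
    # Without a positive confirmation the old host may still hold the disk.
    return status is PowerStatus.OFF


# Power cable pulled: the management board is dead too, so no answer comes back.
print(may_restart_vm(PowerStatus.UNKNOWN))                           # False
print(may_restart_vm(PowerStatus.UNKNOWN, manually_confirmed=True))  # True
```

The point is that "no answer" is handled like "still running", so the VM
stays where it is until fencing succeeds or somebody confirms the reboot by hand.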

Maybe there are better examples than this one (it would be interesting to know
what your storage metro-cluster does in this split-brain situation), but I
hope it's clear to you why fencing works as it does and what could happen if it
were less restrictive...


Regards, 
René 


> 
> Regards 
> Jose 
> 
> ----- Original message -----
> From: "Gianluca Cecchi" <[email protected]>
> To: [email protected]
> Cc: "René Koch (ovido)" <[email protected]>, "users" <[email protected]>
> Sent: Tuesday, 16 April 2013 12:12:43
> Subject: Re: [Users] High Availability
> 
> On Tue, Apr 16, 2013 at 12:56 PM, suporte wrote: 
> > Hi, 
> > 
> > We have 2 Fujitsu servers and one iSCSI storage domain. The servers have
> > power management configured with ilo3.
> > We can live migrate a VM, and when we reboot the host of that VM it
> > migrates to the other host.
> >
> > To test high availability we disconnected all NIC cables of the VM
> > host. The VM did not migrate to the other host; we had to manually confirm
> > that the host had been rebooted, and then the migration happened.
> >
> > Is this the correct behavior? Do we have to manually confirm that the host
> > has been rebooted for HA to happen?
> > 
> > Regards 
> > Jose 
> 
> Hello, 
> when you say "we disconnected all NIC cables" you mean "we 
> disconnected all NIC cables but the ones connected to the iLO 
> interface", correct? 
> Because to know that one host has successfully fenced the problematic 
> one, it has to send a get status message and see that it is off or 
> that it has been successfully rebooted..... 
> 
> For example, in RHCS, if you configure iLO as a fencing device, it
> remains indefinitely in a state similar to
>
> wait for fence to complete
>
> if the "fencer" is not able to get an acknowledgement of the operation
> or to reach the other node's iLO.
> Probably you can find something in your logs... 
> 
> Gianluca 
> 
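
The status check described in the quoted message above can be sketched as a
polling loop. This is a rough illustration only: wait_for_fence and the
query_status callback are hypothetical names, not part of oVirt or RHCS, and
the timeout and interval defaults are arbitrary. Unlike the RHCS behaviour
mentioned above, this sketch gives up after a timeout instead of waiting
indefinitely.

```python
import time
from typing import Callable, Optional


def wait_for_fence(query_status: Callable[[], Optional[str]],
                   timeout_s: float = 180.0,
                   interval_s: float = 5.0,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll the fence device until it reports "off" or the timeout expires.

    query_status is a hypothetical callback returning "on", "off", or None
    when the management interface does not answer.
    """
    waited = 0.0
    while waited < timeout_s:
        if query_status() == "off":
            return True   # acknowledgement received: host is fenced
        sleep(interval_s)
        waited += interval_s
    return False          # no acknowledgement: keep the VM where it is
```

If the management interface never answers (for example because the host lost
power completely), the function returns False and the VM must not be started
elsewhere.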

_______________________________________________
Users mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/users