I am just updating as I continue testing -

When I pulled the power lead as discussed below, the host went from Suspect to 
Fencing but never got to Fenced.  But when I put the power lead back into the 
server, CS almost immediately put that server into maintenance mode and then 
did migrate the VM.


Not sure of the logic but at least I got to see a VM failover :)


________________________________
From: Jon Marshall <[email protected]>
Sent: 27 March 2018 10:42
To: [email protected]
Subject: Re: Failover for VMs

Just as an update to this before I forget what I did :) -


I used "echo c > /proc/sysrq-trigger" on one of the compute nodes and there was 
no VM failover.  Instead HA reported Suspect and then IPMI rebooted the 
machine; it came back and the VM started responding to pings again.  IPMI is 
out of band so that seems to be reasonable behaviour, but it is no use in 
testing HA.


Next I just pulled all 3 NIC cables from the same compute node and again HA 
reported Suspect.  Again IPMI rebooted the node, but then the HA state changed 
to "Recovered", which I don't understand as the NIC cables were still 
disconnected, so the VM was not reachable and there was no failover.


I don't understand how it can think the node is recovered as apart from the 
IPMI out of band connection there are no network connections to this server.


Finally I pulled the power lead and this time HA went from Suspect to Fencing 
and then stayed that way.  This makes sense as no power means IPMI cannot 
reboot the server, so I assume it never moves to Fenced.  Again no VM 
failover.
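
To keep the three results side by side, here is a rough Python sketch that 
only records what was observed in the tests above.  It is not CloudStack's HA 
logic; the names HaTest, TESTS, reached_fenced and summary are made up for 
illustration, and the state lists contain only the states mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HaTest:
    """One manual failure test and what was observed (illustrative only)."""
    action: str            # what was done to the compute node
    states: tuple          # HA states seen, in the order reported
    ipmi_rebooted: bool    # whether IPMI rebooted the node
    vm_failed_over: bool   # whether the VM moved to another node


# Observations copied from the description above, nothing more.
TESTS = (
    HaTest("kernel panic via sysrq-trigger", ("Suspect",), True, False),
    HaTest("pulled all 3 NIC cables", ("Suspect", "Recovered"), True, False),
    HaTest("pulled power lead", ("Suspect", "Fencing"), False, False),
)


def reached_fenced(test):
    """True if the host was ever reported as Fenced during the test."""
    return "Fenced" in test.states


def summary(tests=TESTS):
    """Return one human-readable line per test."""
    return [
        f"{t.action}: {' -> '.join(t.states)}, "
        f"fenced={reached_fenced(t)}, failover={t.vm_failed_over}"
        for t in tests
    ]


if __name__ == "__main__":
    print("\n".join(summary()))
```

In none of the three tests did the host reach Fenced, and in none of them did 
a VM fail over.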


I am wondering if it is to do with out of band IPMI or the way I have the NICs 
set up.  The management node only has one NIC in the management network but I 
assume this is okay.


I may try reloading with CS v4.9 and just try failover without the new HA KVM 
to see if I see anything different.



Jon


________________________________
From: Jon Marshall <[email protected]>
Sent: 27 March 2018 10:10
To: [email protected]
Subject: Re: Failover for VMs


Thanks Paul, will pick up after Easter break.

I am doing some more testing with HA KVM at the moment, so I will post any 
progress to this thread.


________________________________
From: Paul Angus <[email protected]>
Sent: 27 March 2018 10:07
To: [email protected]
Subject: RE: Failover for VMs

Jon,

I've been updating the Ansible to move our physical hosts from CentOS 6 to 
CentOS 7. Now that's done, I'll run through an HA setup and post answers 
(probably after the Easter break).

[email protected]
www.shapeblue.com<http://www.shapeblue.com>

Shapeblue - The CloudStack Company<http://www.shapeblue.com/>
www.shapeblue.com
Rapid deployment framework for Apache CloudStack IaaS Clouds. CSForge is a 
framework developed by ShapeBlue to deliver the rapid deployment of a 
standardised ...



53 Chandos Place, Covent Garden, London  WC2N 4HS, UK
@shapeblue




-----Original Message-----
From: Jon Marshall <[email protected]>
Sent: 27 March 2018 09:19
To: [email protected]
Subject: Failover for VMs

After 3 weeks of trying multiple different setups I still have not managed to 
get a VM to failover between compute nodes and am just running out of ideas.


I have 3 compute nodes each with 3 NICs (management, VM traffic, storage), one 
management node with just a single NIC connection in the management network and 
a separate NFS server.


I have tried with and without the new Host HA KVM in CS v4.11 as from what I 
have read even without enabling the new Host HA KVM when you power off or 
reboot a compute node your VMs should still migrate.


I have tried powering off a compute node, pulling the power lead, removing the 
management and NFS network cables and the management server just seems to carry 
on as if nothing has happened.


Could someone explain exactly how HA is meant to work so I can look at where it 
is going wrong?
