The management server doesn't ping the host through IPMI. However, if IPMI is
not available, you will not be able to use Host HA, as there is no way for
CloudStack to 'fence' the host, that is, shut it down to be sure that a VM
cannot start again on that host.

I can explain why that is necessary if you wish.
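
For illustration only, the reasoning above can be sketched as a small decision
function. This is a simplified model, not CloudStack's actual code: the class
HostObservation, the function decide_ha_action and the returned labels are all
hypothetical names, and the disk-activity rule is taken from the storage NIC
note quoted further down in this thread.

```python
from dataclasses import dataclass


@dataclass
class HostObservation:
    """Hypothetical snapshot of what the management server can see for a host."""
    agent_responding: bool    # is the CloudStack agent on the host answering?
    disk_activity_seen: bool  # are the VMs' disks on shared storage still active?
    oobm_available: bool      # is IPMI / out-of-band management usable for this host?


def decide_ha_action(obs: HostObservation) -> str:
    """Return a label for the action this simplified model would take."""
    if obs.agent_responding:
        return "none"  # host looks healthy, nothing to do
    if obs.disk_activity_seen:
        # VMs are still running, so starting them elsewhere would be unsafe
        return "mark-degraded"
    if not obs.oobm_available:
        # without IPMI there is no way to power the host off for certain
        return "cannot-fence"
    # power the host off through OOBM first, then start the VMs elsewhere
    return "fence-then-recover-vms"


if __name__ == "__main__":
    print(decide_ha_action(HostObservation(False, False, False)))
```

With the agent down, no disk activity and no OOBM, the sketch returns
"cannot-fence", which is the case where Host HA cannot be used.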


Kind regards,

Paul Angus

paul.an...@shapeblue.com
www.shapeblue.com
53 Chandos Place, Covent Garden, London WC2N 4HS, UK
@shapeblue
  
 


-----Original Message-----
From: Parth Patel <parthpatel2...@gmail.com> 
Sent: 13 March 2018 16:57
To: users@cloudstack.apache.org
Cc: Jon Marshall <jms....@hotmail.co.uk>
Subject: Re: KVM HostHA

Hi Jon and Victor,

I think the management server pings your host using IPMI (I really hope
this is not the case).
In my case, I did not have OOBM enabled at all (my hardware didn't support
it).
I think you could disable OOBM and/or Host HA and give that a try :)

On Tue, 13 Mar 2018 at 20:40 victor <vic...@ihnetworks.com> wrote:

> Hello Guys,
>
> I have tried the following two cases.
>
> 1, "echo c > /proc/sysrq-trigger"
>
> 2, Pulled the network cable of one of the hosts
>
> In both cases, the following happened.
>
> =====
> 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other nodes of to disconnect
> 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is disconnecting with event AgentDisconnected
> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already Alert
> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link for 4 with state Alert
> 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
> =====
>
> But nothing happened to the VMs on that node. I waited for one hour
> and the VMs on that node have not been migrated to the other
> available hosts. I think the issue is that the management server still
> thinks that the VMs on that host are running. Please check the
> following logs:
>
> =======
> 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl] (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host 4
> 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl] (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not running on host 4
> =======
>
>
> On 03/13/2018 04:20 PM, Jon Marshall wrote:
> > I tried "echo c > /proc/sysrq-trigger" which stopped me getting into
> > the server but it did not stop the server responding to an ipmitool
> > request on the manager, e.g. -
> >
> >
> > "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis status"
> >
> >
> > from the management server got an answer saying the chassis power
> > was on, so CS never registered the compute node as down.
> >
> >
> > I am obviously doing something wrong but cannot work it out.
> >
> >
> > The management server has one NIC - 172.16.7.4
> >
> >
> > Each compute node has 3 NICs -
> >
> >
> >                          cnode1           cnode2
> >
> > management NIC           172.16.7.5       172.16.7.6
> > vm NIC                   172.16.6.130     172.16.6.131
> > storage                  172.16.250.4     172.16.250.5
> > Dell LOM (for iDRAC)     172.16.7.29      172.16.7.30
> >
> >
> > The Dell LOM IPs are the ones used to configure OOBM in the UI
> >
> >
> >
> > If I pull the storage NIC presumably nothing will happen, as the
> > ipmitool check is running across the management NIC, so I need to
> > pull both?
> >
> > My understanding of host HA was that the management server monitored
> > the compute nodes using ipmitool and, if it did not get a response
> > because the host was down, it would fence off that host and move the
> > VMs to an active compute node.
> >
> > This is obviously too simplistic, so could someone explain how it is
> > meant to work and what it is protecting against?
> >
> > ________________________________
> > From: Paul Angus <paul.an...@shapeblue.com>
> > Sent: 13 March 2018 07:01
> > To: users@cloudstack.apache.org
> > Subject: RE: KVM HostHA
> >
> > Hi all,
> >
> > One small note: unplugging the management NIC will only cause an HA
> > event if the storage is also running over that NIC.
> >
> > If the storage is over a separate NIC, then the guest VMs will
> > continue to run when the mgmt. NIC is unplugged. Host HA will detect
> > the disk activity and conclude that, as the VMs are still running,
> > there is nothing it can do other than mark the hosts as degraded.
> >
> >
> > Kind regards,
> >
> > Paul Angus
> >
> > paul.an...@shapeblue.com
> >
> > -----Original Message-----
> > From: Parth Patel <parthpatel2...@gmail.com>
> > Sent: 12 March 2018 17:35
> > To: users@cloudstack.apache.org
> > Subject: Re: KVM HostHA
> >
> >> Hi Jon,
> >>
> >> As I said, in my case, making the host HA didn't work, but by just
> >> having an HA VM running on the host and executing - (WARNING) "echo c >
> >> /proc/sysrq-trigger" to simulate a kernel crash on the host, the
> >> management server registered it as down and started the VM on another
> >> host. I know I've suggested this before but I insist you give this a
> >> try. Also, you don't need to completely power off the machine manually;
> >> just unplugging the network cable works fine. After losing its
> >> connection to the management server, the CloudStack agent auto reboots
> >> because of the KVM heartbeat check shell script that Rohit Yadav
> >> mentioned in reply to one of my earlier queries in another thread.
> >>
> >> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jms....@hotmail.co.uk> wrote:
> >> Hi Paul
> >>
> >>
> >> Thanks for the response.
> >>
> >>
> >> I think I am not understanding how it was meant to work then. My
> >> understanding was that the manager used ipmitool to just keep querying
> >> the compute nodes as to their status so I assumed it didn't matter how
> >> you shut the node down, once it was down the manager would get no
> >> response and mark it as down (which it does).
> >>
> >>
> >> I am in testing mode so I think I will just go and pull the power and
> >> see what happens :)
> >>
> >>
> >> Thanks
> >>
> >>
> >> Jon
> >>
> >>
> >> ________________________________
> >> From: Paul Angus <paul.an...@shapeblue.com>
> >> Sent: 12 March 2018 15:31
> >> To: users@cloudstack.apache.org
> >> Subject: RE: KVM HostHA
> >> Hi Jon,
> >>
> >> I think that what you guys are finding is that a controlled host
> >> shutdown, which causes the agent to shut down cleanly, is not
> >> considered an HA event. I wouldn't expect CloudStack to take any
> >> action if you shut down a host, only if the host (agent) stops
> >> responding.
> >>
> >>
> >>
> >>
> >> Kind regards,
> >>
> >> Paul Angus
> >>
> >> paul.an...@shapeblue.com
> >>
> >>
> >>
> >>
> >> -----Original Message-----
> >> From: Jon Marshall <jms....@hotmail.co.uk>
> >> Sent: 12 March 2018 15:15
> >> To: users@cloudstack.apache.org
> >> Subject: Re: KVM HostHA
> >>
> >> I have the same issue here and am not entirely sure what the behaviour
> >> should be.
> >>
> >>
> >> I have one manager node and 2 compute nodes running 4.11 with ipmi
> >> working correctly.
> >>
> >>
> >>  From the UI under HA -
> >>
> >>
> >> HA Enabled Yes
> >> HA State Available
> >> HA Provider kvmhaprovider
> >>
> >>
> >> although interestingly from the "Details" tab it shows -
> >>
> >>
> >> HA enabled No
> >>
> >>
> >> which I assume is a cosmetic issue ?
> >>
> >>
> >> On each compute node I have one HA enabled VM and one non HA enabled VM.
> >>
> >>
> >> I power off a compute node and the UI updates the host status and the
> >> VMs on that node stop responding but they never fail over to the other
> >> node.
> >>
> >>
> >> Couple of things I noticed -
> >>
> >>
> >> 1) as soon as I power off the compute node the HA state on the other
> >> node shows "Ineligible"
> >>
> >>
> >> 2) In the UI the instances all still show as green even though two of
> >> them are not available
> >>
> >>
> >> Any help much appreciated
> >>
> >>
> >>
> >>
> >> ________________________________
> >> From: victor <vic...@ihnetworks.com>
> >> Sent: 07 March 2018 17:01
> >> To: users@cloudstack.apache.org
> >> Subject: KVM HostHA
> >>
> >> Hello Guys,
> >>
> >> I have installed CloudStack 4.11. I have enabled HA for each host I
> >> have added. I have also added ipmi successfully (using ipmi driver).
> >> The hosts are showing like the following.
> >>
> >> =======
> >>
> >> HA Enabled Yes
> >> HA State Available
> >> HA Provider kvmhaprovider
> >>
> >> ======
> >>
> >> Also the host is showing the following correctly
> >>
> >> Resource state --> Enabled
> >> State --> UP
> >> Power state --> On
> >>
> >> So I shut down one of the hosts to see how KVM Host HA works. I
> >> waited for half an hour, but nothing happened. What will happen to
> >> the VMs on that host if the host fails to come back up?
> >> There isn't much in the logs.
> >>
> >> Regards
> >> Victor
> >>
>
>
