Hi Paul,

Thanks for the clarification. I don't currently have any IPMI-enabled
hardware in my test environment, but it would help if you could clear up
a few basic concepts for me:
- If HA-enabled VMs are automatically started on another host when the
current host goes down, what is the purpose of an HA host (other than
letting the management server remotely control its power interface)?
- I understand the "Shoot-the-other-node-in-the-head" (STONITH) approach
ACS uses to fence the host, but I couldn't find which mechanism or
events trigger it.

Thanks and regards,
Parth Patel

On Wed, 14 Mar 2018 at 02:22 Paul Angus <paul.an...@shapeblue.com> wrote:

> The management server doesn't ping the host through IPMI. However, if
> IPMI is not available, you will not be able to use Host HA, as there is no
> way for CloudStack to 'fence' the host, that is, shut it down to be sure
> that a VM cannot start again on that host.
>
> I can explain why that is necessary if you wish.
>
>
> Kind regards,
>
> Paul Angus
>
> paul.an...@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HS UK
> @shapeblue
>
>
>
>
> -----Original Message-----
> From: Parth Patel <parthpatel2...@gmail.com>
> Sent: 13 March 2018 16:57
> To: users@cloudstack.apache.org
> Cc: Jon Marshall <jms....@hotmail.co.uk>
> Subject: Re: KVM HostHA
>
> Hi Jon and Victor,
>
> I think the management server pings your host using ipmi (I really hope
> this isn't the case).
> In my case, I did not have OOBM enabled at all (my hardware didn't support
> it).
> I think you could disable OOBM and/or HA-Host and give that a try :)
>
> On Tue, 13 Mar 2018 at 20:40 victor <vic...@ihnetworks.com> wrote:
>
> > Hello Guys,
> >
> > I have tried the following two cases.
> >
> > 1, "echo c > /proc/sysrq-trigger"
> >
> > 2, Pulled the network cable of one of the hosts
> >
> > In both cases, the following happened.
> >
> > =====
> > 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> > (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other nodes
> > of to disconnect
> > 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is
> > disconnecting with event AgentDisconnected
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already
> > Alert
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link
> > for
> > 4 with state Alert
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
> > =====
> >
> > But nothing happened to the VMs on that node. I have waited for one
> > hour and the VMs on that node have not been migrated to the other
> > available hosts. I think the issue is that the management server still
> > thinks that the VMs on that host are running. Please check the
> > following logs
> >
> > =======
> > 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
> > (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host 4
> > 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
> > (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not
> > running on host 4
> > ========
> >
> >
> > On 03/13/2018 04:20 PM, Jon Marshall wrote:
> > > > I tried "echo c > /proc/sysrq-trigger" which stopped me getting into
> > > > the server but it did not stop the server responding to an ipmitool
> > > > request on the manager eg -
> > > >
> > > >
> > > > "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis status"
> > > >
> > > >
> > > > from the management server got an answer saying the chassis power
> > > > was on so CS never registered the compute node as down.
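
The result Jon describes is what one would expect: the BMC that answers IPMI requests runs independently of the host operating system, so a kernel panic leaves the chassis power state at "on". As a rough illustration only, here is a minimal Python sketch that builds the same ipmitool query and parses its output. The function names are hypothetical, and the parser assumes the output contains a line of the form "System Power : on"; none of this is CloudStack code.

```python
import subprocess


def chassis_status_cmd(bmc_host, user, password):
    """Build the ipmitool query from the message above as an argv list."""
    return ["ipmitool", "-I", "lanplus", "-H", bmc_host,
            "-U", user, "-P", password, "chassis", "status"]


def parse_power_state(output):
    """Return the value of the 'System Power' line, lowercased, or None.

    Assumption: the output has a line shaped like 'System Power : on'.
    """
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "system power":
            return value.strip().lower()
    return None


def bmc_power_state(bmc_host, user, password, timeout=10):
    """Ask the BMC for its power state (needs ipmitool and a reachable BMC)."""
    result = subprocess.run(chassis_status_cmd(bmc_host, user, password),
                            capture_output=True, text=True, timeout=timeout)
    if result.returncode != 0:
        return None  # BMC unreachable or authentication failed
    return parse_power_state(result.stdout)
```

A power state of "on" therefore only says that the chassis has power; it says nothing about whether the operating system or the CloudStack agent on that host is still healthy.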
> > >
> > >
> > > I am obviously doing something wrong but cannot work it out.
> > >
> > >
> > > The management server has one NIC - 172.16.7.4
> > >
> > >
> > > Each compute node has 3 NICs -
> > >
> > >
> > >                                         cnode1
> > cnode2
> > >
> > >
> > > mangement NIC        172.16.7.5                   172.16.7.6
> > >
> > > vm NIC                      172.16.6.130                 172.16.6.131
> > >
> > > storage -                     172.16.250.4               172.16.250.5
> > >
> > >
> > > Dell LOM (for Idrac)   172.16.7.29                172.16.7.30
> > >
> > >
> > > the dell LOM IPs are the ones used to configure OOBM  in the UI
> > >
> > >
> > >
> > > > If I pull the storage NIC presumably nothing will happen as the
> > > > ipmitool check is running across the management NIC so I need to
> > > > pull both?
> > > >
> > > > My understanding of host HA was the management server monitored the
> > > > compute nodes using ipmitool and if it did not get a response because
> > > > the host was down it would fence off that host and move the VMs to an
> > > > active compute node.
> > > >
> > > > This is obviously too simplistic so could someone explain how it is
> > > > meant to work and what it is protecting against?
> > >
> > > ________________________________
> > > From: Paul Angus <paul.an...@shapeblue.com>
> > > Sent: 13 March 2018 07:01
> > > To: users@cloudstack.apache.org
> > > Subject: RE: KVM HostHA
> > >
> > > Hi all,
> > >
> > > > One small note: unplugging the management NIC will only cause an HA
> > > > event if the storage is running over that NIC also.
> > > >
> > > > If the storage is over a separate NIC, then the guest VMs will
> > > > continue to run when the mgmt. NIC is unplugged. Host HA will detect
> > > > the disk activity and conclude that there is nothing it can do, as
> > > > the VMs are still running, other than mark the host as degraded.
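
The behaviour described in this thread can be summarised as a small decision function. The Python sketch below is a simplified model and not CloudStack's implementation: the enum, the function name, and its boolean inputs are all hypothetical, and the final "fence and recover" branch is an assumption inferred from the fencing discussion rather than something stated outright.

```python
from enum import Enum


class Outcome(Enum):
    HEALTHY = "agent responding, nothing to do"
    NO_HA_EVENT = "clean shutdown, not treated as an HA event"
    DEGRADED = "VMs still writing to storage, host marked degraded"
    CANNOT_FENCE = "no IPMI/OOBM, host cannot be fenced"
    FENCE_AND_RECOVER = "fence host, then restart HA VMs elsewhere"


def host_ha_outcome(agent_responding, clean_shutdown,
                    disk_activity, oobm_available):
    """Simplified model of the Host HA decisions described in the thread."""
    if agent_responding:
        return Outcome.HEALTHY
    if clean_shutdown:
        # A controlled shutdown stops the agent cleanly.
        return Outcome.NO_HA_EVENT
    if disk_activity:
        # e.g. only the management NIC was unplugged; VMs keep running.
        return Outcome.DEGRADED
    if not oobm_available:
        # Without IPMI there is no way to power the host off safely.
        return Outcome.CANNOT_FENCE
    # Assumption: with no disk activity and working OOBM, the host is fenced.
    return Outcome.FENCE_AND_RECOVER
```

For the case above (management NIC unplugged, storage on a separate NIC), `host_ha_outcome(False, False, True, True)` gives `Outcome.DEGRADED`.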
> > >
> > >
> > > Kind regards,
> > >
> > > Paul Angus
> > >
> > > paul.an...@shapeblue.com
> > > > www.shapeblue.com
> > > > 53 Chandos Place, Covent Garden, London  WC2N 4HS UK
> > > > @shapeblue
> > >
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: Parth Patel <parthpatel2...@gmail.com>
> > > Sent: 12 March 2018 17:35
> > > To: users@cloudstack.apache.org
> > > Subject: Re: KVM HostHA
> > >
> > >> Hi Jon,
> > >>
> > > >> As I said, in my case, making the host HA didn't work, but by just
> > > >> having an HA VM running on the host and executing (WARNING) "echo c >
> > > >> /proc/sysrq-trigger" to simulate a kernel crash on the host, the
> > > >> management server registered it as down and started the VM on another
> > > >> host. I know I've suggested this before, but I'd urge you to give it a
> > > >> try. Also, you don't need to completely power off the machine manually;
> > > >> just unplugging the network cable works fine. The cloudstack
> > > >> agent, after losing connection to the management server, auto reboots
> > > >> because of the KVM heartbeat check shell script mentioned by Rohit Yadav
> > > >> in reply to one of my earlier queries in another thread.
> > >>
> > > >> On Mon 12 Mar, 2018, 21:23 Jon Marshall, <jms....@hotmail.co.uk> wrote:
> > >> Hi Paul
> > >>
> > >>
> > >> Thanks for the response.
> > >>
> > >>
> > >> I think I am not understanding how it was meant to work then. My
> > >> understanding was that the manager used ipmitool to just keep querying
> > >> the compute nodes as to their status so I assumed it didn't matter how
> > >> you shut the node down, once it was down the manager would get no
> > >> response and mark it as down (which it does).
> > >>
> > >>
> > >> I am in testing mode so I think I will just go and pull the power and
> > >> see what happens :)
> > >>
> > >>
> > >> Thanks
> > >>
> > >>
> > >> Jon
> > >>
> > >>
> > >> ________________________________
> > >> From: Paul Angus <paul.an...@shapeblue.com>
> > >> Sent: 12 March 2018 15:31
> > >> To: users@cloudstack.apache.org
> > >> Subject: RE: KVM HostHA
> > >> Hi Jon,
> > >>
> > > >> I think that what you guys are finding is that a controlled host
> > > >> shutdown, which will cause the agent to shut down cleanly, is not
> > > >> considered an HA event. I wouldn't expect CloudStack to take any
> > > >> action if you shut down a host, only if the host (agent) stops
> > > >> responding.
> > >>
> > >>
> > >>
> > >>
> > >> Kind regards,
> > >>
> > >> Paul Angus
> > >>
> > >> paul.an...@shapeblue.com
> > > >> www.shapeblue.com
> > > >> 53 Chandos Place, Covent Garden, London WC2N 4HS UK
> > > >> @shapeblue
> > >>
> > >>
> > >>
> > >>
> > >> -----Original Message-----
> > >> From: Jon Marshall <jms....@hotmail.co.uk>
> > >> Sent: 12 March 2018 15:15
> > >> To: users@cloudstack.apache.org
> > >> Subject: Re: KVM HostHA
> > >>
> > >> I have the same issue here and am not entirely sure what the behaviour
> > >> should be.
> > >>
> > >>
> > > >> I have one manager node and 2 compute nodes running 4.11 with ipmi
> > > >> working correctly.
> > >>
> > >>
> > >>  From the UI under HA -
> > >>
> > >>
> > >> HA Enabled Yes
> > >> HA State Available
> > >> HA Provider kvmhaprovider
> > >>
> > >>
> > >> although interestingly from the "Details" tab it shows -
> > >>
> > >>
> > >> HA enabled No
> > >>
> > >>
> > >> which I assume is a cosmetic issue ?
> > >>
> > >>
> > > >> On each compute node I have one HA enabled VM and one non HA enabled
> > > >> VM.
> > >>
> > >>
> > > >> I power off a compute node and the UI updates the host status and the
> > > >> VMs on that node stop responding but they never fail over to the other
> > > >> node.
> > >>
> > >>
> > >> Couple of things I noticed -
> > >>
> > >>
> > > >> 1) as soon as I power off the compute node the HA state on the other
> > > >> node shows "Ineligible"
> > >>
> > >>
> > > >> 2) In the UI the instances all still show as green even though two of
> > > >> them are not available
> > >>
> > >>
> > >> Any help much appreciated
> > >>
> > >>
> > >>
> > >>
> > >> ________________________________
> > >> From: victor <vic...@ihnetworks.com>
> > >> Sent: 07 March 2018 17:01
> > >> To: users@cloudstack.apache.org
> > >> Subject: KVM HostHA
> > >>
> > >> Hello Guys,
> > >>
> > > >> I have installed cloudstack 4.11. I have enabled HA for each host I
> > > >> have added. I have also added ipmi successfully (using ipmi driver).
> > > >> The hosts are showing like the following.
> > >>
> > >> =======
> > >>
> > >> HA Enabled Yes
> > >> HA State Available
> > >> HA Provider kvmhaprovider
> > >>
> > >> ======
> > >>
> > >> Also the host is showing the following correctly
> > >>
> > >> Resource state --> Enabled
> > >> State --> UP
> > >> Power state --> On
> > >>
> > > >> So I have shut down one of the hosts to see how the KVM host HA is
> > > >> working. I have waited for half an hour, but nothing has happened.
> > > >> What will happen to the VMs on that host if the host fails to come
> > > >> back up? There isn't much in the logs.
> > >>
> > >> Regards
> > >> Victor
> > >>
> >
> >
>
