Re: Edit Public Range

2018-03-14 Thread Swastik Mittal
Hey Matheus,

You can edit your range of IPs by going to Infrastructure -> Pod -> Edit.
There, change the end IP to whatever you want. It will not allow the change
if your guest IP range overlaps with your management IP range, since you
are on a basic network, so make sure you take care of that.
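The overlap condition described above can be sketched in a few lines of Python (a minimal illustration only; the management range below is a hypothetical example, not taken from this thread):

```python
# Sketch of the check the UI enforces: the guest/public range must not
# overlap the management range. Two inclusive ranges overlap exactly when
# the larger of the two start addresses is <= the smaller of the two ends.
from ipaddress import ip_address

def ranges_overlap(start_a, end_a, start_b, end_b):
    """True if [start_a, end_a] and [start_b, end_b] share any address."""
    a1, a2 = ip_address(start_a), ip_address(end_a)
    b1, b2 = ip_address(start_b), ip_address(end_b)
    return max(a1, b1) <= min(a2, b2)

# Matheus's desired guest range vs. a hypothetical management range:
assert ranges_overlap("200.0.0.1", "200.0.10.254", "200.0.5.1", "200.0.5.20")
# His original, smaller range would not overlap that management range:
assert not ranges_overlap("200.0.0.1", "200.0.0.20", "200.0.5.1", "200.0.5.20")
```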

Regards
Swastik

On Thu, Mar 15, 2018 at 9:45 AM, Matheus Fontes  wrote:

> Hi,
> Can I change/edit a Zone’s Public IP Range?
> Ex:
> my conf
> start 200.0.0.1 end 200.0.0.20 netmask 255.255.240.0
> desired conf
> start 200.0.0.1 end 200.0.10.254 netmask 255.255.240.0
>
> I need more Public IPs but on initial config I set only 0.1 to 0.20 range.
>
> Thanks
> Matheus Fontes


Re: KVM HostHA

2018-03-14 Thread Parth Patel
Hi Paul and Andrija,

I don't know the internals of the Host-HA feature, but my ACS 4.11 does
what Paul explained even without Host HA or IPMI access. As I stated
earlier multiple times, without Host HA and IPMI, my HA-enabled VMs
running on a normal host get restarted on another suitable host in the
cluster after approximately 3 minutes of ping timeout. After that, the
CloudStack agent, having no connection to the management server because of
the unplugged NIC (all my machines currently have only one NIC; the whole
zone is on a flat network), reboots itself (the reason was explained by
Rohit in another thread). The management server marks the host down, and
only HA-enabled VMs running on it get restarted on another host (without
any mention of Host HA, IPMI, or fencing in the management server logs),
while normal VMs running on it are stopped.

I don't know if this is the intended outcome, but I think my current ACS
4.11 installation provides at least some of the features of Host HA
without configuring it or IPMI.

Regards,
Parth Patel


Re: KVM HostHA

2018-03-14 Thread Boris Stoyanov
yes, KVM + NFS shared storage. 

Boris. 


boris.stoya...@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 


Re: KVM HostHA

2018-03-14 Thread Andrija Panic
Hi Boris,

OK, thanks for the explanation - that makes sense, and covers the
"exception case" that I have.

Is this at the moment only available for NFS, as I could read (KVM on NFS)?

Cheers


Re: KVM HostHA

2018-03-14 Thread Boris Stoyanov
Hi Andrija,

There are two types of checks Host-HA performs to determine whether a host
is healthy.

1. Health checks - it pings the host as soon as there are connection
issues with the agent.

If that fails,

2. Activity checks - it checks whether there are any write operations on
the disks of the VMs running on the host. This determines whether the VMs
are actually alive and executing processes. Only if no disk operations are
seen on the shared storage does it try to recover the host with an IPMI
call; if that eventually fails, it migrates the VMs to a healthy host and
fences the faulty one.

Hope that explains your case.

Boris.
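The two-stage flow described above can be sketched as follows (a rough illustration only; the Host fields and helper names are invented for the sketch and are not actual CloudStack classes or APIs):

```python
# Hedged sketch of the Host-HA decision flow: health check first, then an
# activity check on shared storage, then IPMI recovery, and only as a
# last resort migrate the HA VMs and fence the host.
from dataclasses import dataclass

@dataclass
class Host:
    pings_ok: bool        # result of the health check
    disk_activity: bool   # VM disks still being written on shared storage
    ipmi_recovers: bool   # whether an IPMI reset brings the agent back

def handle_agent_issue(host: Host) -> str:
    # 1. Health check: ping the host as soon as the agent has issues.
    if host.pings_ok:
        return "healthy"
    # 2. Activity check: if VM disks are still being written, the VMs are
    #    alive; restarting them elsewhere would risk corruption, so only
    #    alert the operator.
    if host.disk_activity:
        return "alert-operator"
    # No activity: try to recover the host via IPMI first.
    if host.ipmi_recovers:
        return "recovered"
    # Recovery failed: migrate the HA VMs and fence the faulty host.
    return "migrate-and-fence"

assert handle_agent_issue(Host(False, True, False)) == "alert-operator"
assert handle_agent_issue(Host(False, False, False)) == "migrate-and-fence"
```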


boris.stoya...@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 


Re: KVM HostHA

2018-03-14 Thread Andrija Panic
Hi Paul,

Sorry to bump into the middle of the thread, but I'm curious about the
idea behind Host-HA and why it behaves the way you explained above:

Would it make more sense (or not?) that when the management server detects
the agent or the host is unreachable (or after an unsuccessful agent
restart, etc. - to be defined), it actually uses IPMI to STONITH the node,
thus making sure no VMs are running, and then really starts all HA-enabled
VMs on other hosts?

I'm just trying to draw a parallel to corosync/pacemaker as the clustering
suite/services in Linux (RHEL and others), where, when the majority of
nodes detect that one node is down, a common approach (especially with
shared storage) is to STONITH that node, make sure it's down, and then
move the "resource" (in our case, VMs) to other cluster nodes.

I see it's actually a much broader setup per
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA but again -
the whole idea (in my head at least...) is that when a host goes down, we
make sure it's down (avoiding VM corruption by doing STONITH on that node)
and then start the HA VMs on other hosts.

I understand there might be exceptions, as I have right now (4.8) -
libvirt gets stuck (librbd exception or similar), so the agent gets
disconnected, but the VMs are still running fine... (except that the DB
gets messed up, all NICs lose their isolation_uri, VRs lose MAC addresses
and other IP addresses, etc.)


Thanks
Andrija





Re: KVM HostHA

2018-03-14 Thread Jon Marshall
That would make sense.


I have another server being used for something else at the moment, so I
will add that in and update this thread when I have tested it.


Jon




RE: KVM HostHA

2018-03-14 Thread Paul Angus
I'd need to do some testing, but I suspect that your problem is that you only 
have two hosts.  At the point that one host is deemed out of service, you only 
have one host left.  With only one host, CloudStack will show the cluster as 
ineligible.

It is extremely common for any system working as a cluster to require a minimum 
starting point of 3 nodes to be able to function.
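The three-node minimum comes from majority quorum: a surviving node must represent a strict majority of the original membership before it can safely declare a peer dead. A minimal sketch of the arithmetic:

```python
# Why clusters typically need at least 3 nodes: with 2 nodes, losing one
# leaves 1 of 2, which is not a strict majority, so the survivor cannot
# safely decide that the peer is dead (split-brain risk).

def has_quorum(live_nodes: int, total_nodes: int) -> bool:
    """True if the live nodes form a strict majority of the membership."""
    return live_nodes > total_nodes // 2

# 2-node cluster: a single failure loses quorum.
assert not has_quorum(1, 2)
# 3-node cluster: a single failure still leaves a majority.
assert has_quorum(2, 3)
```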


Kind regards,

Paul Angus

paul.an...@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 


-Original Message-
From: Jon Marshall  
Sent: 14 March 2018 08:36
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

Hi Paul


My testing does indeed end up with the failed host in maintenance mode but the 
VMs are never migrated. As I posted earlier the management server seems to be 
saying there is no other host that the VM can be migrated to.


Couple of questions if you have the time to respond -


1) This article seems to suggest that rebooting or powering off a host
will result in the VMs being migrated, and that was on CS v4.2.1 back in
2013, so does Host HA do something different?


2) Whenever one of my two nodes is taken down in testing, the active
compute node's HA status goes from Available to Ineligible. Should this
happen, i.e. is the switch to Ineligible what stops the manager from
migrating the VMs?


Apologies for all the questions, but I just can't get this to work at the
moment. If I do eventually get it working, I will write it up for others
with the same issue :)



From: Paul Angus 
Sent: 14 March 2018 07:45
To: users@cloudstack.apache.org
Subject: RE: KVM HostHA

Hi Parth,

To answer your questions: VM-HA does not restart VMs on an alternate host
if the original host goes down. The management server (without Host-HA)
cannot tell what happened to the host. It cannot tell whether there was a
failure in the agent, a loss of connectivity on the management NIC, or
whether the host is truly down. In the first two scenarios, the guest VMs
can still be running perfectly well, and restarting them elsewhere would
be very dangerous. Therefore, the correct thing to do is nothing but alert
the operator. These scenarios are what Host-HA was introduced for.

With regard to STONITH: if no disk activity is detected on the host,
Host-HA will try to restart the host (via IPMI). If, after a configurable
number of attempts, the host agent still does not check in, then Host-HA
will shut down the host (via IPMI), trigger VM-HA, and mark the host as
in maintenance.
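The recover-then-fence loop described above can be sketched like this (illustrative only; the function names and the default attempt count are assumptions for the sketch, not CloudStack's actual code or configuration keys):

```python
# Hypothetical sketch: try an IPMI restart up to a configurable number of
# times; if the agent never checks back in, fence the host and let VM-HA
# restart its HA-enabled VMs elsewhere.

def recover_or_fence(try_ipmi_restart, agent_checked_in, max_attempts=3):
    for _ in range(max_attempts):
        try_ipmi_restart()           # IPMI power-cycle attempt
        if agent_checked_in():
            return "recovered"       # host came back, nothing to fence
    return "fence"                   # power the host off, trigger VM-HA

# An agent that never checks in exhausts the attempts and gets fenced.
attempts = []
assert recover_or_fence(lambda: attempts.append(1),
                        lambda: False, max_attempts=2) == "fence"
assert len(attempts) == 2
```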



paul.an...@shapeblue.com
www.shapeblue.com



53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue




-Original Message-
From: Parth Patel 
Sent: 14 March 2018 05:05
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

Hi Paul,

Thanks for the clarification. [...]
Re: KVM HostHA

2018-03-14 Thread victor

Hello Guys,

I think it is related to this PR:

==
https://github.com/apache/cloudstack/pull/2474
===


On 03/14/2018 02:05 PM, Jon Marshall wrote:

Hi Paul

My testing does indeed end up with the failed host in maintenance mode but 
the VMs are never migrated. As I posted earlier the management server seems 
to be saying there is no other host that the VM can be migrated to.

[...]
Re: KVM HostHA

2018-03-14 Thread Jon Marshall
Hi Paul


My testing does indeed end up with the failed host in maintenance mode but the 
VMs are never migrated. As I posted earlier the management server seems to be 
saying there is no other host that the VM can be migrated to.


Couple of questions if you have the time to respond -


1) This article seems to suggest that rebooting or powering off a host will 
result in the VMs being migrated, and this was on CS v4.2.1 back in 2013, so 
does Host HA do something different?


2) Whenever one of my two nodes is taken down in testing, the active compute 
node's HA status goes from Available to Ineligible. Should this happen, i.e. is 
the move to Ineligible stopping the manager from migrating the VMs?


Apologies for all the questions, but I just can't get this to work at the 
moment. If I do eventually get it working I will do a write-up for others with 
the same issue :)



From: Paul Angus 
Sent: 14 March 2018 07:45
To: users@cloudstack.apache.org
Subject: RE: KVM HostHA

Hi Parth,

To answer your questions: VM-HA does not restart VMs on an alternate host if 
the original host goes down. [...]

RE: KVM HostHA

2018-03-14 Thread Paul Angus
Hi Parth,

To answer your questions: VM-HA does not restart VMs on an alternate host if 
the original host goes down. The management server (without Host HA) cannot 
tell what happened to the host. It cannot tell if there was a failure in the 
agent, loss of connectivity to the management NIC, or if the host is truly down. 
In the first two scenarios, the guest VMs can still be running perfectly well, 
and restarting them elsewhere would be very dangerous. Therefore, the correct 
thing to do is nothing but alert the operator. These scenarios are what 
Host HA was introduced for.

With regard to STONITH: if no disk activity is detected on the host, Host HA 
will try to restart (via IPMI) the host. If, after a configurable number of 
attempts, the host agent still does not check in, then Host HA will shut down 
the host (via IPMI), trigger VM-HA and mark the host as in-maintenance.
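For anyone unfamiliar with the fencing step, the IPMI operations involved map onto ipmitool roughly as follows. This is an illustrative sketch only: the BMC address and credentials are placeholders, and the wrapper just echoes each command instead of executing it against real hardware:

```shell
# Illustrative only: the kind of IPMI calls a Host HA-style recover/fence
# cycle relies on. Host/user/password below are placeholders; the wrapper
# echoes the command rather than contacting a real BMC.
ipmi() { echo ipmitool -I lanplus -H 172.16.7.29 -U admin -P secret "$@"; }

ipmi chassis power status   # health probe: is the chassis powered on?
ipmi chassis power reset    # recovery attempt: hard-reset the unresponsive host
ipmi chassis power off      # fencing (STONITH): guarantee the host stays down
```

With real credentials you would drop the echo; the `chassis status` form Jon ran from the management server reports the same power state.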

 

paul.an...@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 


-Original Message-
From: Parth Patel  
Sent: 14 March 2018 05:05
To: users@cloudstack.apache.org
Subject: Re: KVM HostHA

Hi Paul,

Thanks for the clarification. I currently don't have IPMI-enabled hardware 
(in my test environment), but it would be helpful if you could clear up 
some basic concepts for me:
- If HA-enabled VMs are autostarted on another host when the current host goes 
down, what is the need or purpose of Host HA? (other than the management server 
being able to remotely control the host's power interfaces)
- I understood the "Shoot-the-other-node-in-the-head" (STONITH) approach ACS 
uses to fence the host, but I couldn't find what mechanism or events trigger 
this?

Thanks and regards,
Parth Patel

On Wed, 14 Mar 2018 at 02:22 Paul Angus  wrote:

> The management server doesn't ping the host through IPMI.   However if
> IPMI is not available, you will not be able to use Host HA, as there 
> is no way for CloudStack to 'fence' the host - that is shut it down to 
> be sure that a VM cannot start again on that host.
>
> I can explain why that is necessary if you wish.
>
>
> Kind regards,
>
> Paul Angus
>
> paul.an...@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK @shapeblue
>
>
>
>
> -Original Message-
> From: Parth Patel 
> Sent: 13 March 2018 16:57
> To: users@cloudstack.apache.org
> Cc: Jon Marshall 
> Subject: Re: KVM HostHA
>
> Hi Jon and Victor,
>
> I think the management server pings your host using IPMI (I really hope this
> is not the case).
> In my case, I did not have OOBM enabled at all (my hardware didn't support
> it)
> I think you could disable OOBM and/or HA-Host and give that a try :)
>
> On Tue, 13 Mar 2018 at 20:40 victor  wrote:
>
> > Hello Guys,
> >
> > I have tried the following two cases.
> >
> > 1, "echo c > /proc/sysrq-trigger"
> >
> > 2, Pulled the network cable of one of the host
> >
> > In both cases, the following happened.
> >
> > =
> > 2018-03-13 08:22:54,978 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> > (AgentTaskPool-15:ctx-c8d9f5d2) (logid:c0a3d2da) Notifying other nodes
> > of to disconnect
> > 2018-03-13 08:22:54,983 INFO [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is
> > disconnecting with event AgentDisconnected
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Host 4 is already
> > Alert
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Deregistering link
> > for
> > 4 with state Alert
> > 2018-03-13 08:22:54,985 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentTaskPool-16:ctx-d8204625) (logid:ffe4a426) Remove Agent : 4
> > =
> >
> > But nothing happened for the VMs in that node. I have waited for one
> > hour and the VMs in that node have not been migrated to the other
> > available hosts. I think the issue is that the management server still
> > thinks that the VMs in that host are running. Please check the
> > following logs
> >
> > ===
> > 2018-03-13 11:08:25,882 DEBUG [c.c.c.CapacityManagerImpl]
> > (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 1 VMs on host 4
> > 2018-03-13 11:08:25,888 DEBUG [c.c.c.CapacityManagerImpl]
> > (CapacityChecker:ctx-1d8378af) (logid:ae906a50) Found 0 VM, not
> > running on host 4 
> >
> >
> > On 03/13/2018 04:20 PM, Jon Marshall wrote:
> > > I tried "echo c > /proc/sysrq-trigger" which stopped me getting into
> > > the
> > server but it did not stop the server responding to an ipmitool
> > request on the manager eg -
> > >
> > >
> > > "ipmitool -I lanplus -H 172.16.7.29 -U admin3 -P letmein chassis
> status"
> > >
> > >
> > > from the management server got an answer saying the chassis power
> > > was on
> > so CS never registered the compute node as down.
> > >
> > >
> > >