Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-25 Thread Géza Gémes


On 10/22/2014 10:05 PM, David Vossel wrote:


- Original Message -

On 10/21/2014 07:53 PM, David Vossel wrote:

- Original Message -

-Original Message-
From: Russell Bryant [mailto:rbry...@redhat.com]
Sent: October 21, 2014 15:07
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [Nova] Automatic evacuate

On 10/21/2014 06:44 AM, Balázs Gibizer wrote:

Hi,

Sorry for the top posting but it was hard to fit my complete view
inline.

I'm also thinking about a possible solution for automatic server
evacuation. I see two separate sub problems of this problem:
1)compute node monitoring and fencing, 2)automatic server evacuation

Compute node monitoring is currently implemented in servicegroup
module of nova. As far as I understand pacemaker is the proposed
solution in this thread to solve both monitoring and fencing but we
tried and found out that pacemaker_remote on baremetal does not work
together with fencing (yet), see [1]. So if we need fencing then
either we have to go for normal pacemaker instead of pacemaker_remote
but that solution doesn't scale or we configure and call stonith
directly when pacemaker detects the compute node failure.

I didn't get the same conclusion from the link you reference.  It says:

That is not to say however that fencing of a baremetal node works any
differently than that of a normal cluster-node. The Pacemaker policy
engine
understands how to fence baremetal remote-nodes. As long as a fencing
device exists, the cluster is capable of ensuring baremetal nodes are
fenced
in the exact same way as normal cluster-nodes are fenced.

So, it sounds like the core pacemaker cluster can fence the node to me.
   I CC'd David Vossel, a pacemaker developer, to see if he can help
   clarify.

It seems there is a contradiction between chapter 1.5 and 7.2 in [1] as
7.2
states:
 There are some complications involved with understanding a bare-metal
node's state that virtual nodes don't have. Once this logic is complete,
pacemaker will be able to integrate bare-metal nodes in the same way
virtual
remote-nodes currently are. Some special considerations for fencing will
need to be addressed. 
Let's wait for David's statement on this.

Hey, That's me!

I can definitely clear all this up.

First off, this document is out of sync with the current state upstream.
We're
already past Pacemaker v1.1.12 upstream. Section 7.2 of the document being
referenced is still talking about future v1.1.11 features.

I'll make it simple. If the document references anything that needs to be
done
in the future, it's already done.  Pacemaker remote is feature complete at
this
point. I've accomplished everything I originally set out to do. I see one
change
though. In 7.1 I talk about wanting pacemaker to be able to manage
resources in
containers. I mention something about libvirt sandbox. I scrapped whatever
I was
doing there. Pacemaker now has docker support.
https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/docker

I've known this document is out of date. It's on my giant list of things to
do.
Sorry for any confusion.

As far as pacemaker remote and fencing goes, remote-nodes are fenced the
exact
same way as cluster-nodes. The only consideration that needs to be made is
that
the cluster-nodes (nodes running the full pacemaker+corosync stack) are the
only
nodes allowed to initiate fencing. All you have to do is make sure the
fencing
devices you want to use to fence remote-nodes are accessible to the
cluster-nodes.
  From there you are good to go.

Let me know if there's anything else I can clear up. Pacemaker remote was
designed
to be the solution for the exact scenario you all are discussing here.
Compute nodes
and pacemaker remote are made for one another :D

If anyone is interested in prototyping pacemaker remote for this compute
node use
case, make sure to include me. I have done quite a bit of research into how to
maximize
pacemaker's ability to scale horizontally. As part of that research I've
made a few
changes that are directly related to all of this that are not yet in an
official
pacemaker release.  Come to me for the latest rpms and you'll have a less
painful
experience setting all this up :)

-- Vossel



Hi Vossel,

Could you send us a link to the source RPMs please, we have tested on
CentOS7. It might need a recompile.

Yes, centos 7.0 isn't going to have the rpms you need to test this.

There are a couple of things you can do.

1. I put the rhel7 related rpms I test with in this repo.
http://davidvossel.com/repo/os/el7/

*disclaimer* I only maintain this repo for myself. I'm not committed to keeping
it active or up-to-date. It just happens to be updated right now for my own use.

That will give you test rpms for the pacemaker version I'm currently using plus
the latest libqb. If you're going to do any sort of performance metrics you'll
need the latest libqb, v0.17.1

2. Build srpm from latest code on github. Right now master is relatively
stable

Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-22 Thread David Vossel


- Original Message -
 On 10/21/2014 07:53 PM, David Vossel wrote:
 
  - Original Message -
  -Original Message-
  From: Russell Bryant [mailto:rbry...@redhat.com]
  Sent: October 21, 2014 15:07
  To: openstack-dev@lists.openstack.org
  Subject: Re: [openstack-dev] [Nova] Automatic evacuate
 
  On 10/21/2014 06:44 AM, Balázs Gibizer wrote:
  Hi,
 
  Sorry for the top posting but it was hard to fit my complete view
  inline.
 
  I'm also thinking about a possible solution for automatic server
  evacuation. I see two separate sub problems of this problem:
  1)compute node monitoring and fencing, 2)automatic server evacuation
 
  Compute node monitoring is currently implemented in servicegroup
  module of nova. As far as I understand pacemaker is the proposed
  solution in this thread to solve both monitoring and fencing but we
  tried and found out that pacemaker_remote on baremetal does not work
  together with fencing (yet), see [1]. So if we need fencing then
  either we have to go for normal pacemaker instead of pacemaker_remote
  but that solution doesn't scale or we configure and call stonith
  directly when pacemaker detects the compute node failure.
  I didn't get the same conclusion from the link you reference.  It says:
 
  That is not to say however that fencing of a baremetal node works any
  differently than that of a normal cluster-node. The Pacemaker policy
  engine
  understands how to fence baremetal remote-nodes. As long as a fencing
  device exists, the cluster is capable of ensuring baremetal nodes are
  fenced
  in the exact same way as normal cluster-nodes are fenced.
 
  So, it sounds like the core pacemaker cluster can fence the node to me.
I CC'd David Vossel, a pacemaker developer, to see if he can help
clarify.
  It seems there is a contradiction between chapter 1.5 and 7.2 in [1] as
  7.2
  states:
   There are some complications involved with understanding a bare-metal
  node's state that virtual nodes don't have. Once this logic is complete,
  pacemaker will be able to integrate bare-metal nodes in the same way
  virtual
  remote-nodes currently are. Some special considerations for fencing will
  need to be addressed. 
  Let's wait for David's statement on this.
  Hey, That's me!
 
  I can definitely clear all this up.
 
  First off, this document is out of sync with the current state upstream.
  We're
  already past Pacemaker v1.1.12 upstream. Section 7.2 of the document being
  referenced is still talking about future v1.1.11 features.
 
  I'll make it simple. If the document references anything that needs to be
  done
  in the future, it's already done.  Pacemaker remote is feature complete at
  this
  point. I've accomplished everything I originally set out to do. I see one
  change
  though. In 7.1 I talk about wanting pacemaker to be able to manage
  resources in
  containers. I mention something about libvirt sandbox. I scrapped whatever
  I was
  doing there. Pacemaker now has docker support.
  https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/docker
 
  I've known this document is out of date. It's on my giant list of things to
  do.
  Sorry for any confusion.
 
  As far as pacemaker remote and fencing goes, remote-nodes are fenced the
  exact
  same way as cluster-nodes. The only consideration that needs to be made is
  that
  the cluster-nodes (nodes running the full pacemaker+corosync stack) are the
  only
  nodes allowed to initiate fencing. All you have to do is make sure the
  fencing
  devices you want to use to fence remote-nodes are accessible to the
  cluster-nodes.
   From there you are good to go.
 
  Let me know if there's anything else I can clear up. Pacemaker remote was
  designed
  to be the solution for the exact scenario you all are discussing here.
  Compute nodes
  and pacemaker remote are made for one another :D
 
  If anyone is interested in prototyping pacemaker remote for this compute
  node use
   case, make sure to include me. I have done quite a bit of research into how to
  maximize
  pacemaker's ability to scale horizontally. As part of that research I've
  made a few
  changes that are directly related to all of this that are not yet in an
  official
  pacemaker release.  Come to me for the latest rpms and you'll have a less
  painful
  experience setting all this up :)
 
  -- Vossel
 
 
 Hi Vossel,
 
 Could you send us a link to the source RPMs please, we have tested on
 CentOS7. It might need a recompile.

Yes, centos 7.0 isn't going to have the rpms you need to test this.

There are a couple of things you can do.

1. I put the rhel7 related rpms I test with in this repo.
http://davidvossel.com/repo/os/el7/

*disclaimer* I only maintain this repo for myself. I'm not committed to keeping
it active or up-to-date. It just happens to be updated right now for my own use.

That will give you test rpms for the pacemaker version I'm currently using plus
the latest libqb. If you're going to do any

Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-21 Thread Balázs Gibizer
Hi, 

Sorry for the top posting but it was hard to fit my complete view inline.

I'm also thinking about a possible solution for automatic server evacuation. I 
see two separate sub problems of this problem: 1)compute node monitoring and 
fencing, 2)automatic server evacuation

Compute node monitoring is currently implemented in the servicegroup module of 
nova. As far as I understand, pacemaker is the proposed solution in this thread 
to solve both monitoring and fencing, but we tried it and found out that 
pacemaker_remote on baremetal does not work together with fencing (yet), see 
[1]. So if we need fencing, then either we have to go for normal pacemaker 
instead of pacemaker_remote (but that solution doesn't scale), or we configure 
and call stonith directly when pacemaker detects the compute node failure. We 
can create a pacemaker driver for servicegroup, and that driver can hide this 
currently missing pacemaker functionality by calling stonith directly today and 
drop this extra functionality as soon as pacemaker itself is capable of doing 
it. However, this means that the servicegroup driver has to know the stonith 
configuration of the compute nodes.
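
To make this concrete, here is a minimal sketch of what such a pacemaker-backed
servicegroup driver could look like. It assumes the join/is_up driver interface
that nova's servicegroup module exposed at the time, leaves the actual pacemaker
query and DB check as placeholders, and also illustrates the DB-AND-pacemaker
combination mentioned in the next paragraph; all names are illustrative only.

# Hypothetical sketch of a pacemaker-backed servicegroup driver.
# _pacemaker_node_online() stands in for whatever mechanism (e.g. parsing
# crm_mon output) the driver would use to read pacemaker's view of the node.
from nova.servicegroup.drivers import base


class PacemakerDriver(base.Driver):

    def join(self, member_id, group_id, service=None):
        # Nothing to register here: pacemaker_remote on the compute node is
        # set up by the deployment tooling, not by nova.
        pass

    def is_up(self, service_ref):
        host = service_ref['host']
        if not self._pacemaker_node_online(host):
            # Until pacemaker fences baremetal remote-nodes on its own, the
            # driver could call stonith here -- which is why it would need
            # the stonith configuration of the compute nodes.
            return False
        # A healthy pacemaker node does not imply nova-compute is running,
        # so combine it with the service heartbeat recorded in the DB.
        return self._db_service_is_up(service_ref)

    def _pacemaker_node_online(self, host):
        raise NotImplementedError  # placeholder for the pacemaker query

    def _db_service_is_up(self, service_ref):
        raise NotImplementedError  # placeholder for the usual DB heartbeat check
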
Another concern of mine with pacemaker is that the up state of the resource 
representing the compute node does not automatically mean that the nova-compute 
service is also up and running on that compute node. So we would have to ask the 
deployer of the compute node to configure the nova-compute service in pacemaker 
as a pacemaker resource tied to the compute node. Without this configuration 
change, another possibility would be to calculate the up state of a compute 
service by evaluating a logical operator on a coupled set of sources (e.g. 
service state in the DB AND pacemaker state of the node).
 
For automatic server evacuation we need a piece of code that gets information 
about the state of the compute nodes periodically and calls the nova evacuation 
command if necessary. Today the information source of the compute node state is 
the servicegroup API so the evacuation engine has to be part of nova or the 
servicegroup API needs to be made available from outside of nova. For me adding 
the evacuation engine to nova looks simpler than externalizing the servicegroup 
API. 
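
For illustration, here is a rough sketch of such an evacuation engine, written
against python-novaclient as if it lived outside nova (the in-nova variant would
read the servicegroup API directly). The fencing and shared-storage pieces are
deliberate stubs, the simple polling loop is just the easiest shape to show, and
the client object is assumed to be already authenticated.

# Hypothetical external evacuation loop sketched against python-novaclient.
import time


def fence(host):
    # Placeholder: fence the failed node (e.g. through the cluster's stonith
    # devices) before evacuating, so a half-dead host cannot keep writing to
    # the instance disks.
    raise NotImplementedError


def on_shared_storage(server):
    # Placeholder: today the caller has to tell evacuate whether the instance
    # disk lives on shared storage; automating this is an open point (see below).
    raise NotImplementedError


def evacuation_loop(nova, poll_interval=10):
    while True:
        for svc in nova.services.list(binary='nova-compute'):
            if svc.state != 'down':
                continue
            fence(svc.host)
            servers = nova.servers.list(
                search_opts={'host': svc.host, 'all_tenants': 1})
            for server in servers:
                nova.servers.evacuate(
                    server, on_shared_storage=on_shared_storage(server))
        time.sleep(poll_interval)
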
Today the nova evacuate command expects to be told whether the server is on 
shared storage or not. So to be able to call evacuate automatically, we also 
need to determine automatically whether the server is on shared storage. 
Also, we can consider persisting some of the scheduler hints, for example the 
group hint used by the ServerGroupAntiAffinityFilter, as proposed in [2].
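
For reference, this is roughly how the group hint in question is passed at boot
time through python-novaclient today; the client object and the server group
UUID are placeholders.

# Illustration: the scheduler hint consumed by ServerGroupAntiAffinityFilter.
# `nova` is an authenticated novaclient Client, `group_id` the UUID of an
# existing anti-affinity server group, image_id/flavor_id existing resources.
nova.servers.create(
    name='vm1',
    image=image_id,
    flavor=flavor_id,
    scheduler_hints={'group': group_id})
# The hint is not persisted today, so an automated evacuate cannot re-apply
# the anti-affinity policy -- the gap that [2] proposes to close.
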

The new pacemaker servicegroup driver can be implemented first, then we can add 
the evacuation engine as a next step. I'm happy to help with the BP work and 
the implementation of the feature.

[1] 
http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/#_baremetal_remote_node_use_case
 
[2] 
https://blueprints.launchpad.net/nova/+spec/validate-targethost-live-migration

Cheers, 
Gibi

 -Original Message-
 From: Jastrzebski, Michal [mailto:michal.jastrzeb...@intel.com]
 Sent: October 18, 2014 09:09
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [Nova] Automatic evacuate
 
 
 
  -Original Message-
  From: Florian Haas [mailto:flor...@hastexo.com]
  Sent: Friday, October 17, 2014 1:49 PM
  To: OpenStack Development Mailing List (not for usage questions)
  Subject: Re: [openstack-dev] [Nova] Automatic evacuate
 
  On Fri, Oct 17, 2014 at 9:53 AM, Jastrzebski, Michal
  michal.jastrzeb...@intel.com wrote:
  
  
   -Original Message-
   From: Florian Haas [mailto:flor...@hastexo.com]
   Sent: Thursday, October 16, 2014 10:53 AM
   To: OpenStack Development Mailing List (not for usage questions)
   Subject: Re: [openstack-dev] [Nova] Automatic evacuate
  
   On Thu, Oct 16, 2014 at 9:25 AM, Jastrzebski, Michal
   michal.jastrzeb...@intel.com wrote:
In my opinion flavor defining is a bit hacky. Sure, it will
provide us functionality fairly quickly, but also will strip us
from flexibility Heat would give. Healing can be done in several
ways, simple destroy
- create (basic convergence workflow so far), evacuate with or
without shared storage, even rebuild vm, probably few more when
we put more thoughts to it.
  
   But then you'd also need to monitor the availability of
   *individual* guest and down you go the rabbit hole.
  
   So suppose you're monitoring a guest with a simple ping. And it
   stops responding to that ping.
  
    I was more referring to monitoring the host (not the guest), and for sure
   not by
  ping.
   I was thinking of current zookeeper service group implementation, we
   might want to use corosync and write servicegroup plugin for that.
   There are several choices for that, each requires testing really
   before we
  make any decision

Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-21 Thread Russell Bryant
On 10/21/2014 06:44 AM, Balázs Gibizer wrote:
 Hi, 
 
 Sorry for the top posting but it was hard to fit my complete view inline.
 
 I'm also thinking about a possible solution for automatic server
 evacuation. I see two separate sub problems of this problem:
 1)compute node monitoring and fencing, 2)automatic server evacuation
 
 Compute node monitoring is currently implemented in servicegroup
 module of nova. As far as I understand pacemaker is the proposed
 solution in this thread to solve both monitoring and fencing but we
 tried and found out that pacemaker_remote on baremetal does not work
 together with fencing (yet), see [1]. So if we need fencing then
 either we have to go for normal pacemaker instead of pacemaker_remote
 but that solution doesn't scale or we configure and call stonith
 directly when pacemaker detects the compute node failure.

I didn't get the same conclusion from the link you reference.  It says:

"That is not to say however that fencing of a baremetal node works any
differently than that of a normal cluster-node. The Pacemaker policy
engine understands how to fence baremetal remote-nodes. As long as a
fencing device exists, the cluster is capable of ensuring baremetal
nodes are fenced in the exact same way as normal cluster-nodes are fenced."

So, it sounds like the core pacemaker cluster can fence the node to me.
 I CC'd David Vossel, a pacemaker developer, to see if he can help clarify.

-- 
Russell Bryant


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-21 Thread Balázs Gibizer
 -Original Message-
 From: Russell Bryant [mailto:rbry...@redhat.com]
 Sent: October 21, 2014 15:07
 To: openstack-dev@lists.openstack.org
 Subject: Re: [openstack-dev] [Nova] Automatic evacuate
 
 On 10/21/2014 06:44 AM, Balázs Gibizer wrote:
  Hi,
 
  Sorry for the top posting but it was hard to fit my complete view inline.
 
  I'm also thinking about a possible solution for automatic server
  evacuation. I see two separate sub problems of this problem:
  1)compute node monitoring and fencing, 2)automatic server evacuation
 
  Compute node monitoring is currently implemented in servicegroup
  module of nova. As far as I understand pacemaker is the proposed
  solution in this thread to solve both monitoring and fencing but we
  tried and found out that pacemaker_remote on baremetal does not work
  together with fencing (yet), see [1]. So if we need fencing then
  either we have to go for normal pacemaker instead of pacemaker_remote
  but that solution doesn't scale or we configure and call stonith
  directly when pacemaker detects the compute node failure.
 
 I didn't get the same conclusion from the link you reference.  It says:
 
 That is not to say however that fencing of a baremetal node works any
 differently than that of a normal cluster-node. The Pacemaker policy engine
 understands how to fence baremetal remote-nodes. As long as a fencing
 device exists, the cluster is capable of ensuring baremetal nodes are fenced
 in the exact same way as normal cluster-nodes are fenced.
 
 So, it sounds like the core pacemaker cluster can fence the node to me.
  I CC'd David Vossel, a pacemaker developer, to see if he can help clarify.

It seems there is a contradiction between chapters 1.5 and 7.2 in [1], as 7.2 
states: "There are some complications involved with understanding a bare-metal 
node's state that virtual nodes don't have. Once this logic is complete, 
pacemaker will be able to integrate bare-metal nodes in the same way virtual 
remote-nodes currently are. Some special considerations for fencing will need 
to be addressed." 
Let's wait for David's statement on this.

Cheers,
Gibi

 
 --
 Russell Bryant
 


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-21 Thread David Vossel


- Original Message -
  -Original Message-
  From: Russell Bryant [mailto:rbry...@redhat.com]
  Sent: October 21, 2014 15:07
  To: openstack-dev@lists.openstack.org
  Subject: Re: [openstack-dev] [Nova] Automatic evacuate
  
  On 10/21/2014 06:44 AM, Balázs Gibizer wrote:
   Hi,
  
   Sorry for the top posting but it was hard to fit my complete view inline.
  
   I'm also thinking about a possible solution for automatic server
   evacuation. I see two separate sub problems of this problem:
   1)compute node monitoring and fencing, 2)automatic server evacuation
  
   Compute node monitoring is currently implemented in servicegroup
   module of nova. As far as I understand pacemaker is the proposed
   solution in this thread to solve both monitoring and fencing but we
   tried and found out that pacemaker_remote on baremetal does not work
   together with fencing (yet), see [1]. So if we need fencing then
   either we have to go for normal pacemaker instead of pacemaker_remote
   but that solution doesn't scale or we configure and call stonith
   directly when pacemaker detects the compute node failure.
  
  I didn't get the same conclusion from the link you reference.  It says:
  
  That is not to say however that fencing of a baremetal node works any
  differently than that of a normal cluster-node. The Pacemaker policy engine
  understands how to fence baremetal remote-nodes. As long as a fencing
  device exists, the cluster is capable of ensuring baremetal nodes are
  fenced
  in the exact same way as normal cluster-nodes are fenced.
  
  So, it sounds like the core pacemaker cluster can fence the node to me.
   I CC'd David Vossel, a pacemaker developer, to see if he can help clarify.
 
 It seems there is a contradiction between chapter 1.5 and 7.2 in [1] as 7.2
 states:
  There are some complications involved with understanding a bare-metal
 node's state that virtual nodes don't have. Once this logic is complete,
 pacemaker will be able to integrate bare-metal nodes in the same way virtual
 remote-nodes currently are. Some special considerations for fencing will
 need to be addressed. 
 Let's wait for David's statement on this.

Hey, That's me!

I can definitely clear all this up.

First off, this document is out of sync with the current state upstream. We're
already past Pacemaker v1.1.12 upstream. Section 7.2 of the document being
referenced is still talking about future v1.1.11 features.

I'll make it simple. If the document references anything that needs to be done
in the future, it's already done.  Pacemaker remote is feature complete at this
point. I've accomplished everything I originally set out to do. I see one change
though. In 7.1 I talk about wanting pacemaker to be able to manage resources in
containers. I mention something about libvirt sandbox. I scrapped whatever I was
doing there. Pacemaker now has docker support.
https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/docker

I've known this document is out of date. It's on my giant list of things to do.
Sorry for any confusion.

As far as pacemaker remote and fencing goes, remote-nodes are fenced the exact
same way as cluster-nodes. The only consideration that needs to be made is that
the cluster-nodes (nodes running the full pacemaker+corosync stack) are the only
nodes allowed to initiate fencing. All you have to do is make sure the fencing
devices you want to use to fence remote-nodes are accessible to the 
cluster-nodes.
From there you are good to go.
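
A rough sketch of that setup, driving pcs from Python only to keep the examples
in one language: the compute node is registered as a baremetal remote-node, and
a fencing device for it is defined where the cluster-nodes can reach it. Host
names are made up and the agent-specific access options are omitted, so check
the exact syntax against your pacemaker/pcs version.

# Illustrative only: register a compute node as a baremetal remote-node and
# give the cluster a stonith device for it. Names/addresses are placeholders.
import subprocess


def pcs(*args):
    subprocess.check_call(['pcs'] + list(args))


# The compute node runs pacemaker_remote; the cluster manages it through the
# ocf:pacemaker:remote resource.
pcs('resource', 'create', 'compute-1', 'ocf:pacemaker:remote',
    'server=compute-1.example.com')

# The fencing device only has to be reachable from the cluster-nodes, since
# they are the only ones allowed to initiate fencing of the remote-node.
pcs('stonith', 'create', 'fence-compute-1', 'fence_ipmilan',
    'pcmk_host_list=compute-1')  # plus the agent's address/credential options
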

Let me know if there's anything else I can clear up. Pacemaker remote was 
designed
to be the solution for the exact scenario you all are discussing here.  Compute 
nodes
and pacemaker remote are made for one another :D

If anyone is interested in prototyping pacemaker remote for this compute node 
use
case, make sure to include me. I have done quite a bit of research into how to 
maximize
pacemaker's ability to scale horizontally. As part of that research I've made a 
few
changes that are directly related to all of this that are not yet in an official
pacemaker release.  Come to me for the latest rpms and you'll have a less 
painful
experience setting all this up :)

-- Vossel







 
 Cheers,
 Gibi
 
  
  --
  Russell Bryant
  


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-21 Thread Géza Gémes

On 10/21/2014 07:53 PM, David Vossel wrote:


- Original Message -

-Original Message-
From: Russell Bryant [mailto:rbry...@redhat.com]
Sent: October 21, 2014 15:07
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [Nova] Automatic evacuate

On 10/21/2014 06:44 AM, Balázs Gibizer wrote:

Hi,

Sorry for the top posting but it was hard to fit my complete view inline.

I'm also thinking about a possible solution for automatic server
evacuation. I see two separate sub problems of this problem:
1)compute node monitoring and fencing, 2)automatic server evacuation

Compute node monitoring is currently implemented in servicegroup
module of nova. As far as I understand pacemaker is the proposed
solution in this thread to solve both monitoring and fencing but we
tried and found out that pacemaker_remote on baremetal does not work
together with fencing (yet), see [1]. So if we need fencing then
either we have to go for normal pacemaker instead of pacemaker_remote
but that solution doesn't scale or we configure and call stonith
directly when pacemaker detects the compute node failure.

I didn't get the same conclusion from the link you reference.  It says:

That is not to say however that fencing of a baremetal node works any
differently than that of a normal cluster-node. The Pacemaker policy engine
understands how to fence baremetal remote-nodes. As long as a fencing
device exists, the cluster is capable of ensuring baremetal nodes are
fenced
in the exact same way as normal cluster-nodes are fenced.

So, it sounds like the core pacemaker cluster can fence the node to me.
  I CC'd David Vossel, a pacemaker developer, to see if he can help clarify.

It seems there is a contradiction between chapter 1.5 and 7.2 in [1] as 7.2
states:
 There are some complications involved with understanding a bare-metal
node's state that virtual nodes don't have. Once this logic is complete,
pacemaker will be able to integrate bare-metal nodes in the same way virtual
remote-nodes currently are. Some special considerations for fencing will
need to be addressed. 
Let's wait for David's statement on this.

Hey, That's me!

I can definitely clear all this up.

First off, this document is out of sync with the current state upstream. We're
already past Pacemaker v1.1.12 upstream. Section 7.2 of the document being
referenced is still talking about future v1.1.11 features.

I'll make it simple. If the document references anything that needs to be done
in the future, it's already done.  Pacemaker remote is feature complete at this
point. I've accomplished everything I originally set out to do. I see one change
though. In 7.1 I talk about wanting pacemaker to be able to manage resources in
containers. I mention something about libvirt sandbox. I scrapped whatever I was
doing there. Pacemaker now has docker support.
https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/docker

I've known this document is out of date. It's on my giant list of things to do.
Sorry for any confusion.

As far as pacemaker remote and fencing goes, remote-nodes are fenced the exact
same way as cluster-nodes. The only consideration that needs to be made is that
the cluster-nodes (nodes running the full pacemaker+corosync stack) are the only
nodes allowed to initiate fencing. All you have to do is make sure the fencing
devices you want to use to fence remote-nodes are accessible to the 
cluster-nodes.
 From there you are good to go.

Let me know if there's anything else I can clear up. Pacemaker remote was 
designed
to be the solution for the exact scenario you all are discussing here.  Compute 
nodes
and pacemaker remote are made for one another :D

If anyone is interested in prototyping pacemaker remote for this compute node 
use
case, make sure to include me. I have done quite a bit of research into how to 
maximize
pacemaker's ability to scale horizontally. As part of that research I've made a 
few
changes that are directly related to all of this that are not yet in an official
pacemaker release.  Come to me for the latest rpms and you'll have a less 
painful
experience setting all this up :)

-- Vossel



Hi Vossel,

Could you send us a link to the source RPMs please, we have tested on 
CentOS7. It might need a recompile.


Thank you!

Geza






Cheers,
Gibi


--
Russell Bryant


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-21 Thread David Vossel


- Original Message -
 On Thu, Oct 16, 2014 at 7:48 PM, Jay Pipes jaypi...@gmail.com wrote:
  While one of us (Jay or me) speaking for the other and saying we agree
  is a distributed consensus problem that dwarfs the complexity of
  Paxos
 
 
  You've always had a way with words, Florian :)
 
 I knew you'd like that one. :)
 
 , *I* for my part do think that an external toolset (i.e. one
 
  that lives outside the Nova codebase) is the better approach versus
  duplicating the functionality of said toolset in Nova.
 
  I just believe that the toolset that should be used here is
  Corosync/Pacemaker and not Ceilometer/Heat. And I believe the former
  approach leads to *much* fewer necessary code changes *in* Nova than
  the latter.
 
 
  I agree with you that Corosync/Pacemaker is the tool of choice for
  monitoring/heartbeat functionality, and is my choice for compute-node-level
  HA monitoring. For guest-level HA monitoring, I would say use
  Heat/Ceilometer. For container-level HA monitoring, it looks like fleet or
  something like Kubernetes would be a good option.
 
 Here's why I think that's a bad idea: none of these support the
 concept of being subordinate to another cluster.
 
 Again, suppose a VM stops responding. Then
 Heat/Ceilometer/Kubernetes/fleet would need to know whether the node
 hosting the VM is down or not. Only if the node is up or recovered
 (which Pacemaker would be reponsible for) the VM HA facility would be
 able to kick in. Effectively you have two views of the cluster
 membership, and that sort of thing always gets messy. In the HA space
 we're always facing the same issues when a replication facility
 (Galera, GlusterFS, DRBD, whatever) has a different view of the
 cluster membership than the cluster manager itself — which *always*
 happens for a few seconds on any failover, recovery, or fencing event.
 
 Russell's suggestion, by having remote Pacemaker instances on the
 compute nodes tie in with a Pacemaker cluster on the control nodes,
 does away with that discrepancy.
 
  I'm curious to see how the combination of compute-node-level HA and
  container-level HA tools will work together in some of the proposed
  deployment architectures (bare metal + docker containers w/ OpenStack and
  infrastructure services run in a Kubernetes pod or CoreOS fleet).
 
 I have absolutely nothing against an OpenStack cluster using
 *exclusively* Kubernetes or fleet for HA management, once those have
 reached sufficient maturity.

It's not about reaching sufficient maturity for these two projects. They are
on the wrong path to achieving proper HA. Kubernetes and fleet (I'll throw geard
into the mix as well) do a great job at distributed management of containers.
The difference is that, instead of integrating with a proper HA stack (like Nova
is), kubernetes and fleet are attempting their own HA. In doing this, they've
unknowingly blown the scope of their respective projects way beyond what they
originally set out to do.

Here's the problem: HA is both very misunderstood and deceptively difficult to
achieve. System-wide deterministic failover behavior is not a matter of
monitoring and restarting failed containers. For kubernetes and fleet to
succeed, they will need to integrate with a proper HA stack like pacemaker.

Below are some presentation slides on how I envision pacemaker interacting with
container orchestration tools.

https://github.com/davidvossel/phd/blob/master/doc/presentations/HA_Container_Overview_David_Vossel.pdf?raw=true

-- Vossel

 But just about every significant
 OpenStack distro out there has settled on Corosync/Pacemaker for the
 time being. Let's not shove another cluster manager down their throats
 for little to no real benefit.
 
 Cheers,
 Florian
 


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-17 Thread Jastrzebski, Michal


 -Original Message-
 From: Florian Haas [mailto:flor...@hastexo.com]
 Sent: Thursday, October 16, 2014 10:53 AM
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [Nova] Automatic evacuate
 
 On Thu, Oct 16, 2014 at 9:25 AM, Jastrzebski, Michal
 michal.jastrzeb...@intel.com wrote:
  In my opinion flavor defining is a bit hacky. Sure, it will provide us
  functionality fairly quickly, but also will strip us from flexibility
  Heat would give. Healing can be done in several ways, simple destroy
  - create (basic convergence workflow so far), evacuate with or
  without shared storage, even rebuild vm, probably few more when we put
  more thoughts to it.
 
 But then you'd also need to monitor the availability of *individual* guest and
 down you go the rabbit hole.
 
 So suppose you're monitoring a guest with a simple ping. And it stops
 responding to that ping.

I was more referring to monitoring the host (not the guest), and for sure not by
ping. I was thinking of the current zookeeper servicegroup implementation; we
might want to use corosync and write a servicegroup plugin for that. There are
several choices for that, and each really requires testing before we make any
decision.

There is also the fencing case, which we agree is important, and I think nova
should be able to do that (since it does the evacuation, it should also do the
fencing). But for fencing to work we really need working host health monitoring,
so I suggest we take baby steps here and solve one issue at a time. And that
would be host monitoring.

 (1) Has it died?
 (2) Is it just too busy to respond to the ping?
 (3) Has its guest network stack died?
 (4) Has its host vif died?
 (5) Has the L2 agent on the compute host died?
 (6) Has its host network stack died?
 (7) Has the compute host died?
 
 Suppose further it's using shared storage (running off an RBD volume or
 using an iSCSI volume, or whatever). Now you have almost as many recovery
 options as possible causes for the failure, and some of those recovery
 options will potentially destroy your guest's data.
 
 No matter how you twist and turn the problem, you need strongly consistent
 distributed VM state plus fencing. In other words, you need a full blown HA
 stack.
 
  I'd rather use nova for low level task and maybe low level monitoring
  (imho nova should do that using servicegroup). But I'd use something
  more more configurable for actual task triggering like heat. That
  would give us framework rather than mechanism. Later we might want to
  apply HA on network or volume, then we'll have mechanism ready just
  monitoring hook and healing will need to be implemented.
 
  We can use scheduler hints to place resource on host HA-compatible
  (whichever health action we'd like to use), this will bit more
  complicated, but also will give us more flexibility.
 
 I apologize in advance for my bluntness, but this all sounds to me like you're
 vastly underrating the problem of reliable guest state detection and
 recovery. :)

Guest health in my opinion is just a bit out of scope here. If we have a robust
way of detecting host health, we can pretty much assume that if the host dies,
the guests follow. There are ways to detect guest health (libvirt watchdog,
ceilometer, the ping you mentioned), but that should be done somewhere else, and
for sure not by evacuation.
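
As a pointer on the libvirt watchdog mentioned here, this is roughly how it is
requested per flavor, assuming the hw:watchdog_action extra spec available at
the time; the flavor name is made up and `nova` is an authenticated novaclient
Client.

# Illustration: add a libvirt watchdog that resets the guest when its OS
# stops feeding the device.
flavor = nova.flavors.find(name='m1.small.watchdog')
flavor.set_keys({'hw:watchdog_action': 'reset'})
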

 
  I agree that we all should meet in Paris and discuss that so we can
  join our forces. This is one of bigger gaps to be filled imho.
 
 Pretty much every user I've worked with in the last 2 years agrees.
 Granted, my view may be skewed as HA is typically what customers approach
 us for in the first place, but yes, this definitely needs a globally 
 understood
 and supported solution.
 
 Cheers,
 Florian
 


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-17 Thread Florian Haas
On Fri, Oct 17, 2014 at 9:53 AM, Jastrzebski, Michal
michal.jastrzeb...@intel.com wrote:


 -Original Message-
 From: Florian Haas [mailto:flor...@hastexo.com]
 Sent: Thursday, October 16, 2014 10:53 AM
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [Nova] Automatic evacuate

 On Thu, Oct 16, 2014 at 9:25 AM, Jastrzebski, Michal
 michal.jastrzeb...@intel.com wrote:
  In my opinion flavor defining is a bit hacky. Sure, it will provide us
  functionality fairly quickly, but also will strip us from flexibility
  Heat would give. Healing can be done in several ways, simple destroy
  - create (basic convergence workflow so far), evacuate with or
  without shared storage, even rebuild vm, probably few more when we put
  more thoughts to it.

 But then you'd also need to monitor the availability of *individual* guest 
 and
 down you go the rabbit hole.

 So suppose you're monitoring a guest with a simple ping. And it stops
 responding to that ping.

 I was more referring to monitoring the host (not the guest), and for sure not by ping.
 I was thinking of current zookeeper service group implementation, we might 
 want
 to use corosync and write servicegroup plugin for that. There are several 
 choices
 for that, each really requires testing before we make any decision.

 There is also fencing case, which we agree is important, and I think nova 
 should
 be able to do that (since it does evacuate, it also should do a fencing). But
 for working fencing we really need working host health monitoring, so I 
 suggest
 we take baby steps here and solve one issue at a time. And that would be 
 host
 monitoring.

You're describing all of the cases for which Pacemaker is the perfect
fit. Sorry, I see absolutely no point in teaching Nova to do that.

 (1) Has it died?
 (2) Is it just too busy to respond to the ping?
 (3) Has its guest network stack died?
 (4) Has its host vif died?
 (5) Has the L2 agent on the compute host died?
 (6) Has its host network stack died?
 (7) Has the compute host died?

 Suppose further it's using shared storage (running off an RBD volume or
 using an iSCSI volume, or whatever). Now you have almost as many recovery
 options as possible causes for the failure, and some of those recovery
 options will potentially destroy your guest's data.

 No matter how you twist and turn the problem, you need strongly consistent
 distributed VM state plus fencing. In other words, you need a full blown HA
 stack.

  I'd rather use nova for low level task and maybe low level monitoring
  (imho nova should do that using servicegroup). But I'd use something
  more more configurable for actual task triggering like heat. That
  would give us framework rather than mechanism. Later we might want to
  apply HA on network or volume, then we'll have mechanism ready just
  monitoring hook and healing will need to be implemented.
 
  We can use scheduler hints to place resource on host HA-compatible
  (whichever health action we'd like to use), this will bit more
  complicated, but also will give us more flexibility.

 I apologize in advance for my bluntness, but this all sounds to me like 
 you're
 vastly underrating the problem of reliable guest state detection and
 recovery. :)

 Guest health in my opinion is just a bit out of scope here. If we'll have 
 robust
 way of detecting host health, we can pretty much assume that if the host dies, 
 guests follow.
 There are ways to detect guest health (libvirt watchdog, ceilometer, ping you 
 mentioned),
 but that should be done somewhere else. And for sure not by evacuation.

You're making an important point here; you're asking for a robust way
of detecting host health. I can guarantee you that the way of
detecting host health that you suggest (i.e. from within Nova) will
not be robust by HA standards for at least two years, if your patch
lands tomorrow.

Cheers,
Florian



Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Jastrzebski, Michal


 -Original Message-
 From: Russell Bryant [mailto:rbry...@redhat.com]
 Sent: Thursday, October 16, 2014 5:04 AM
 To: openstack-dev@lists.openstack.org
 Subject: Re: [openstack-dev] [Nova] Automatic evacuate
 
 On 10/15/2014 05:07 PM, Florian Haas wrote:
  On Wed, Oct 15, 2014 at 10:03 PM, Russell Bryant rbry...@redhat.com
 wrote:
  Am I making sense?
 
  Yep, the downside is just that you need to provide a new set of
  flavors for ha vs non-ha.  A benefit though is that it's a way to
  support it today without *any* changes to OpenStack.
 
  Users are already very used to defining new flavors. Nova itself
  wouldn't even need to define those; if the vendor's deployment tools
  defined them it would be just fine.
 
 Yes, I know Nova wouldn't need to define it.  I was saying I didn't like that 
 it
 was required at all.
 
  This seems like the kind of thing we should also figure out how to
  offer on a per-guest basis without needing a new set of flavors.
  That's why I also listed the server tagging functionality as another 
  possible
 solution.
 
  This still doesn't do away with the requirement to reliably detect
  node failure, and to fence misbehaving nodes. Detecting that a node
  has failed, and fencing it if unsure, is a prerequisite for any
  recovery action. So you need Corosync/Pacemaker anyway.
 
 Obviously, yes.  My post covered all of that directly ... the tagging bit was 
 just
 additional input into the recovery operation.
 
  Note also that when using an approach where you have physically
  clustered nodes, but you are also running non-HA VMs on those, then
  the user must understand that the following applies:
 
  (1) If your guest is marked HA, then it will automatically recover on
  node failure, but
  (2) if your guest is *not* marked HA, then it will go down with the
  node not only if it fails, but also if it is fenced.
 
  So a non-HA guest on an HA node group actually has a slightly
  *greater* chance of going down than a non-HA guest on a non-HA host.
  (And let's not get into don't use fencing then; we all know why
  that's a bad idea.)
 
  Which is why I think it makes sense to just distinguish between
  HA-capable and non-HA-capable hosts, and have the user decide whether
  they want HA or non-HA guests simply by assigning them to the
  appropriate host aggregates.
 
 Very good point.  I hadn't considered that.
 
 --
 Russell Bryant
 

In my opinion flavor defining is a bit hacky. Sure, it will provide us
functionality fairly quickly, but it will also strip us of the flexibility Heat
would give. Healing can be done in several ways: simple destroy -> create
(the basic convergence workflow so far), evacuate with or without
shared storage, even rebuilding the VM, and probably a few more when we put
more thought into it.

I'd rather use nova for the low-level tasks and maybe low-level monitoring (imho
nova should do that using servicegroup). But I'd use something more
configurable, like Heat, for the actual task triggering. That would give us a
framework rather than a mechanism. Later we might want to apply HA to networks
or volumes; then we'll have the mechanism ready, and only the monitoring hook
and healing will need to be implemented.
We can use scheduler hints to place resources on an HA-compatible host
(whichever health action we'd like to use); this will be a bit more complicated,
but it will also give us more flexibility.

I agree that we all should meet in Paris and discuss this so we can join
forces. This is one of the bigger gaps to be filled imho.




Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Sylvain Bauza


On 16/10/2014 05:04, Russell Bryant wrote:

On 10/15/2014 05:07 PM, Florian Haas wrote:

On Wed, Oct 15, 2014 at 10:03 PM, Russell Bryant rbry...@redhat.com wrote:

Am I making sense?

Yep, the downside is just that you need to provide a new set of flavors
for ha vs non-ha.  A benefit though is that it's a way to support it
today without *any* changes to OpenStack.

Users are already very used to defining new flavors. Nova itself
wouldn't even need to define those; if the vendor's deployment tools
defined them it would be just fine.

Yes, I know Nova wouldn't need to define it.  I was saying I didn't like
that it was required at all.


This seems like the kind of thing we should also figure out how to offer
on a per-guest basis without needing a new set of flavors.  That's why I
also listed the server tagging functionality as another possible solution.

This still doesn't do away with the requirement to reliably detect
node failure, and to fence misbehaving nodes. Detecting that a node
has failed, and fencing it if unsure, is a prerequisite for any
recovery action. So you need Corosync/Pacemaker anyway.

Obviously, yes.  My post covered all of that directly ... the tagging
bit was just additional input into the recovery operation.


Note also that when using an approach where you have physically
clustered nodes, but you are also running non-HA VMs on those, then
the user must understand that the following applies:

(1) If your guest is marked HA, then it will automatically recover on
node failure, but
(2) if your guest is *not* marked HA, then it will go down with the
node not only if it fails, but also if it is fenced.

So a non-HA guest on an HA node group actually has a slightly
*greater* chance of going down than a non-HA guest on a non-HA host.
(And let's not get into don't use fencing then; we all know why
that's a bad idea.)

Which is why I think it makes sense to just distinguish between
HA-capable and non-HA-capable hosts, and have the user decide whether
they want HA or non-HA guests simply by assigning them to the
appropriate host aggregates.

Very good point.  I hadn't considered that.



There are various possibilities for handling that use case, and tagging
VMs on a case-by-case basis sounds really good to me.
What is missing IMHO is a smart filter able to manage the spreading across
computes of VMs asking for HA. We can actually do this thanks to
Instance Groups and affinity filters, but in a certain sense, it would
be cool if a user could just boot an instance and ask for a policy
(given by flavor metadata or whatever) without needing knowledge
of the underlying infra.


-Sylvain
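
For reference, the host-aggregate variant discussed above looks roughly like
this with python-novaclient, assuming AggregateInstanceExtraSpecsFilter is
enabled in the scheduler; the aggregate name and the 'ha' metadata key are made
up, and `nova` is an authenticated client.

# Sketch: mark a group of hosts as HA-capable and give users a flavor that
# lands only on them.
agg = nova.aggregates.create('ha-hosts', None)      # no availability zone
nova.aggregates.set_metadata(agg, {'ha': 'true'})
nova.aggregates.add_host(agg, 'compute-1')

flavor = nova.flavors.create('m1.small.ha', ram=2048, vcpus=1, disk=20)
flavor.set_keys({'aggregate_instance_extra_specs:ha': 'true'})
# Regular flavors keep landing on the non-HA hosts; HA flavors are confined
# to the aggregate above.
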



Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Florian Haas
 (5) Let monitoring and orchestration services deal with these use
 cases and
 have Nova simply provide the primitive API calls that it already does
 (i.e.
 host evacuate).

 That would arguably lead to an incredible amount of wheel reinvention
 for node failure detection, service failure detection, etc. etc.

 How so? (5) would use existing wheels for monitoring and orchestration
 instead of writing all new code paths inside Nova to do the same thing.

 Right, there may be some confusion here ... I thought you were both
 agreeing that the use of an external toolset was a good approach for the
 problem, but Florian's last message makes that not so clear ...

While one of us (Jay or me) speaking for the other and saying we agree
is a distributed consensus problem that dwarfs the complexity of
Paxos, *I* for my part do think that an external toolset (i.e. one
that lives outside the Nova codebase) is the better approach versus
duplicating the functionality of said toolset in Nova.

I just believe that the toolset that should be used here is
Corosync/Pacemaker and not Ceilometer/Heat. And I believe the former
approach leads to *much* fewer necessary code changes *in* Nova than
the latter.

Cheers,
Florian



Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Florian Haas
On Thu, Oct 16, 2014 at 5:04 AM, Russell Bryant rbry...@redhat.com wrote:
 On 10/15/2014 05:07 PM, Florian Haas wrote:
 On Wed, Oct 15, 2014 at 10:03 PM, Russell Bryant rbry...@redhat.com wrote:
 Am I making sense?

 Yep, the downside is just that you need to provide a new set of flavors
 for ha vs non-ha.  A benefit though is that it's a way to support it
 today without *any* changes to OpenStack.

 Users are already very used to defining new flavors. Nova itself
 wouldn't even need to define those; if the vendor's deployment tools
 defined them it would be just fine.

 Yes, I know Nova wouldn't need to define it.  I was saying I didn't like
 that it was required at all.

Fair enough, but do consider that, for example, Trove already
routinely defines flavors of its own.

So I don't think that's quite as painful (to users) as you think.

 This seems like the kind of thing we should also figure out how to offer
 on a per-guest basis without needing a new set of flavors.  That's why I
 also listed the server tagging functionality as another possible solution.

 This still doesn't do away with the requirement to reliably detect
 node failure, and to fence misbehaving nodes. Detecting that a node
 has failed, and fencing it if unsure, is a prerequisite for any
 recovery action. So you need Corosync/Pacemaker anyway.

 Obviously, yes.  My post covered all of that directly ... the tagging
 bit was just additional input into the recovery operation.

This is essentially why I am saying using the Pacemaker stack is the
smarter approach than hacking something into Ceilometer and Heat. You
already need Pacemaker for service availability (and all major vendors
have adopted it for that purpose), so a highly available cloud that
does *not* use Pacemaker at all won't be a vendor supported option for
some time. So people will already be running Pacemaker — then why not
use it for what it's good at?

(Yes, I am aware of things like etcd and fleet. I think that's headed
in the right direction, but hasn't nearly achieved the degree of
maturity that Pacemaker has. All of HA is about performing correctly
in weird corner cases, and you're only able to do that if you've run
into them and got your nose bloody.)

And just so my position is clear, what Pacemaker is good at is node
and service monitoring, recovery, and fencing. It's *not* particularly
good at usability. Which is why it makes perfect sense to not have
your Pacemaker configurations managed directly by a human, but have an
automated deployment facility do it. Which the vendors are already
doing.

 Note also that when using an approach where you have physically
 clustered nodes, but you are also running non-HA VMs on those, then
 the user must understand that the following applies:

 (1) If your guest is marked HA, then it will automatically recover on
 node failure, but
 (2) if your guest is *not* marked HA, then it will go down with the
 node not only if it fails, but also if it is fenced.

 So a non-HA guest on an HA node group actually has a slightly
 *greater* chance of going down than a non-HA guest on a non-HA host.
 (And let's not get into don't use fencing then; we all know why
 that's a bad idea.)

 Which is why I think it makes sense to just distinguish between
 HA-capable and non-HA-capable hosts, and have the user decide whether
 they want HA or non-HA guests simply by assigning them to the
 appropriate host aggregates.

 Very good point.  I hadn't considered that.

Yay, I've contributed something useful to this discussion then. :)

Cheers,
Florian



Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Florian Haas
On Thu, Oct 16, 2014 at 9:25 AM, Jastrzebski, Michal
michal.jastrzeb...@intel.com wrote:
 In my opinion flavor defining is a bit hacky. Sure, it will provide us
 functionality fairly quickly, but also will strip us from flexibility Heat
 would give. Healing can be done in several ways, simple destroy - create
 (basic convergence workflow so far), evacuate with or without
 shared storage, even rebuild vm, probably few more when we put more thoughts
 to it.

But then you'd also need to monitor the availability of *individual*
guest and down you go the rabbit hole.

So suppose you're monitoring a guest with a simple ping. And it stops
responding to that ping.

(1) Has it died?
(2) Is it just too busy to respond to the ping?
(3) Has its guest network stack died?
(4) Has its host vif died?
(5) Has the L2 agent on the compute host died?
(6) Has its host network stack died?
(7) Has the compute host died?

Suppose further it's using shared storage (running off an RBD volume
or using an iSCSI volume, or whatever). Now you have almost as many
recovery options as possible causes for the failure, and some of those
recovery options will potentially destroy your guest's data.

No matter how you twist and turn the problem, you need strongly
consistent distributed VM state plus fencing. In other words, you need
a full blown HA stack.

 I'd rather use nova for low level task and maybe low level monitoring (imho
 nova should do that using servicegroup). But I'd use something more more
 configurable for actual task triggering like heat. That would give us
 framework rather than mechanism. Later we might want to apply HA on network or
 volume, then we'll have mechanism ready just monitoring hook and healing
 will need to be implemented.

 We can use scheduler hints to place resource on host HA-compatible
 (whichever health action we'd like to use), this will bit more complicated, 
 but
 also will give us more flexibility.

I apologize in advance for my bluntness, but this all sounds to me
like you're vastly underrating the problem of reliable guest state
detection and recovery. :)

 I agree that we all should meet in Paris and discuss that so we can join our
 forces. This is one of bigger gaps to be filled imho.

Pretty much every user I've worked with in the last 2 years agrees.
Granted, my view may be skewed as HA is typically what customers
approach us for in the first place, but yes, this definitely needs a
globally understood and supported solution.

Cheers,
Florian



Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Thomas Herve

  This still doesn't do away with the requirement to reliably detect
  node failure, and to fence misbehaving nodes. Detecting that a node
  has failed, and fencing it if unsure, is a prerequisite for any
  recovery action. So you need Corosync/Pacemaker anyway.
 
  Obviously, yes.  My post covered all of that directly ... the tagging
  bit was just additional input into the recovery operation.
 
 This is essentially why I am saying using the Pacemaker stack is the
 smarter approach than hacking something into Ceilometer and Heat. You
 already need Pacemaker for service availability (and all major vendors
 have adopted it for that purpose), so a highly available cloud that
 does *not* use Pacemaker at all won't be a vendor supported option for
 some time. So people will already be running Pacemaker — then why not
 use it for what it's good at?

I may be missing something, but Pacemaker will only provide monitoring of your
compute node, right? I think the advantage you would get by using something
like Heat is having an instance agent and providing monitoring of your client
service, instead of just knowing the status of your hypervisor. Hosts can fail,
but there is a whole other array of failures that you can't handle with global
deployment monitoring.

-- 
Thomas

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Florian Haas
On Thu, Oct 16, 2014 at 11:01 AM, Thomas Herve
thomas.he...@enovance.com wrote:

  This still doesn't do away with the requirement to reliably detect
  node failure, and to fence misbehaving nodes. Detecting that a node
  has failed, and fencing it if unsure, is a prerequisite for any
  recovery action. So you need Corosync/Pacemaker anyway.
 
  Obviously, yes.  My post covered all of that directly ... the tagging
  bit was just additional input into the recovery operation.

 This is essentially why I am saying using the Pacemaker stack is the
 smarter approach than hacking something into Ceilometer and Heat. You
 already need Pacemaker for service availability (and all major vendors
 have adopted it for that purpose), so a highly available cloud that
 does *not* use Pacemaker at all won't be a vendor supported option for
 some time. So people will already be running Pacemaker — then why not
 use it for what it's good at?

 I may be missing something, but Pacemaker will only provide monitoring of 
 your compute node, right? I think the advantage you would get by using 
 something like Heat is having an instance agent and provide monitoring of 
 your client service, instead of just knowing the status of your hypervisor. 
 Hosts can fail, but there is another array of failures that you can't handle 
 with the global deployment monitoring.

You *are* missing something, indeed. :) Pacemaker would be a perfectly
fine tool for also monitoring the status of your guests on the hosts.
So arguably, nova-compute could in fact hook in with pcsd
(https://github.com/feist/pcs/tree/master/pcs -- all in Python) down
the road to inject VM monitoring into the Pacemaker configuration.
This would, of course, need to be specific to the hypervisor so it
would be a job for the nova driver, rather than being implemented at
the nova-compute level.
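
Purely as an illustration of what such a callout might look like (the
resource naming, the XML path, and the very idea of Nova shelling out to pcs
are assumptions made for this sketch, not an existing hook), a driver-level
hook could register the guest's libvirt definition as a VirtualDomain
resource:

    # Hypothetical driver-level callout: register a guest's libvirt domain
    # as a Pacemaker VirtualDomain resource via pcs. Resource name, XML path
    # and the hook itself are assumptions for illustration only.
    import subprocess

    def register_guest_with_pacemaker(instance_uuid, domain_xml_path):
        resource_id = "vm-%s" % instance_uuid
        subprocess.check_call([
            "pcs", "resource", "create", resource_id,
            "ocf:heartbeat:VirtualDomain",
            "config=%s" % domain_xml_path,   # path to the libvirt domain XML
            "hypervisor=qemu:///system",     # assumes libvirt/KVM
        ])
        # From here on Pacemaker monitors (and restarts) the guest like any
        # other cluster resource.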

But my hunch is that that sort of thing would be for the L release;
for Kilo the low-hanging fruit would be to defend against host failure
(meaning, compute node failure, unrecoverable nova-compute service
failure, etc.).

Cheers,
Florian

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Russell Bryant
On 10/16/2014 04:29 AM, Florian Haas wrote:
 (5) Let monitoring and orchestration services deal with these use
 cases and
 have Nova simply provide the primitive API calls that it already does
 (i.e.
 host evacuate).

 That would arguably lead to an incredible amount of wheel reinvention
 for node failure detection, service failure detection, etc. etc.

 How so? (5) would use existing wheels for monitoring and orchestration
 instead of writing all new code paths inside Nova to do the same thing.

 Right, there may be some confusion here ... I thought you were both
 agreeing that the use of an external toolset was a good approach for the
 problem, but Florian's last message makes that not so clear ...
 
 While one of us (Jay or me) speaking for the other and saying we agree
 is a distributed consensus problem that dwarfs the complexity of
 Paxos, *I* for my part do think that an external toolset (i.e. one
 that lives outside the Nova codebase) is the better approach versus
 duplicating the functionality of said toolset in Nova.
 
 I just believe that the toolset that should be used here is
 Corosync/Pacemaker and not Ceilometer/Heat. And I believe the former
 approach leads to *much* fewer necessary code changes *in* Nova than
 the latter.

Have you tried pacemaker_remote yet?  It seems like a better choice for
this particular case, as opposed to using corosync, due to the potential
number of compute nodes.

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Russell Bryant
On 10/16/2014 05:01 AM, Thomas Herve wrote:
 
 This still doesn't do away with the requirement to reliably detect
 node failure, and to fence misbehaving nodes. Detecting that a node
 has failed, and fencing it if unsure, is a prerequisite for any
 recovery action. So you need Corosync/Pacemaker anyway.

 Obviously, yes.  My post covered all of that directly ... the tagging
 bit was just additional input into the recovery operation.

 This is essentially why I am saying using the Pacemaker stack is the
 smarter approach than hacking something into Ceilometer and Heat. You
 already need Pacemaker for service availability (and all major vendors
 have adopted it for that purpose), so a highly available cloud that
 does *not* use Pacemaker at all won't be a vendor supported option for
 some time. So people will already be running Pacemaker — then why not
 use it for what it's good at?
 
 I may be missing something, but Pacemaker will only provide
 monitoring of your compute node, right? I think the advantage you
 would get by using something like Heat is having an instance agent
 and provide monitoring of your client service, instead of just
 knowing the status of your hypervisor. Hosts can fail, but there is
 another array of failures that you can't handle with the global
 deployment monitoring.

I think that's an important problem, too.

The thread was started talking about evacuate, which is used in the case
of a host failure.  I wrote up a more detailed proposal of using an
external tool (Pacemaker) to handle automatic evacuation of failed hosts.

For a guest OS failure, we have some basic watchdog support.  From my
blog post:

It’s worth noting that the libvirt/KVM driver in OpenStack does contain
one feature related to guest operating system failure.  The
libvirt-watchdog blueprint was implemented in the Icehouse release of
Nova.  This feature allows you to set the hw_watchdog_action property on
either the image or flavor.  Valid values include poweroff, reset,
pause, and none.  When this is enabled, libvirt will enable the i6300esb
watchdog device for the guest and will perform the requested action if
the watchdog is triggered.  This may be a helpful component of your
strategy for recovery from guest failures.
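
As a rough sketch of how an operator might enable this, assuming an
Icehouse/Juno-era python-novaclient and the hw:watchdog_action flavor extra
spec key (credentials and endpoint below are placeholders; verify the exact
key spelling against your release):

    # Hedged sketch: request the i6300esb watchdog with the "reset" action
    # for all guests booted from a given flavor.
    from novaclient import client

    nova = client.Client("2", "admin", "secret", "admin",
                         "http://keystone.example.com:5000/v2.0")

    flavor = nova.flavors.find(name="m1.small")
    flavor.set_keys({"hw:watchdog_action": "reset"})  # poweroff/pause/none too
    # The per-image equivalent would be the hw_watchdog_action image property.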

HA in the case of application failures can be handled in several ways,
depending on the application.  It's really a separate problem space,
though, IMO.

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Florian Haas
On Thu, Oct 16, 2014 at 1:59 PM, Russell Bryant rbry...@redhat.com wrote:
 On 10/16/2014 04:29 AM, Florian Haas wrote:
 (5) Let monitoring and orchestration services deal with these use
 cases and
 have Nova simply provide the primitive API calls that it already does
 (i.e.
 host evacuate).

 That would arguably lead to an incredible amount of wheel reinvention
 for node failure detection, service failure detection, etc. etc.

 How so? (5) would use existing wheels for monitoring and orchestration
 instead of writing all new code paths inside Nova to do the same thing.

 Right, there may be some confusion here ... I thought you were both
 agreeing that the use of an external toolset was a good approach for the
 problem, but Florian's last message makes that not so clear ...

 While one of us (Jay or me) speaking for the other and saying we agree
 is a distributed consensus problem that dwarfs the complexity of
 Paxos, *I* for my part do think that an external toolset (i.e. one
 that lives outside the Nova codebase) is the better approach versus
 duplicating the functionality of said toolset in Nova.

 I just believe that the toolset that should be used here is
 Corosync/Pacemaker and not Ceilometer/Heat. And I believe the former
 approach leads to *much* fewer necessary code changes *in* Nova than
 the latter.

 Have you tried pacemaker_remote yet?  It seems like a better choice for
 this particular case, as opposed to using corosync, due to the potential
 number of compute nodes.

I'll assume that you are *not* referring to running Corosync/Pacemaker
on the compute nodes plus pacemaker_remote in the VMs, because doing
so would blow up the separation between the cloud operator and tenant
space.

Running compute nodes as baremetal extensions of a different
Corosync/Pacemaker cluster (presumably the one that manages the other
Nova services)  would potentially be an option, although vendors would
need to buy into this. Ubuntu, for example, currently only ships
pacemaker-remote in universe.

*If* you're running pacemaker_remote on the compute node, though, that
then also opens up the possibility for a compute driver to just dump
the libvirt definition into a VirtualDomain Pacemaker resource,
meaning with a small callout added to Nova, you could also get the
virtual machine monitoring functionality. Bonus: this could eventually
be extended to allow live migration of guests to other compute nodes
in the same cluster, in case you want to shut down a compute node for
maintenance without interrupting your HA guests.

Cheers,
Florian

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Steve Gordon


- Original Message -
 From: Florian Haas flor...@hastexo.com
 To: OpenStack Development Mailing List (not for usage questions) 
 openstack-dev@lists.openstack.org
 
 On Thu, Oct 16, 2014 at 1:59 PM, Russell Bryant rbry...@redhat.com wrote:
  On 10/16/2014 04:29 AM, Florian Haas wrote:
  (5) Let monitoring and orchestration services deal with these use
  cases and
  have Nova simply provide the primitive API calls that it already does
  (i.e.
  host evacuate).
 
  That would arguably lead to an incredible amount of wheel reinvention
  for node failure detection, service failure detection, etc. etc.
 
  How so? (5) would use existing wheels for monitoring and orchestration
  instead of writing all new code paths inside Nova to do the same thing.
 
  Right, there may be some confusion here ... I thought you were both
  agreeing that the use of an external toolset was a good approach for the
  problem, but Florian's last message makes that not so clear ...
 
  While one of us (Jay or me) speaking for the other and saying we agree
  is a distributed consensus problem that dwarfs the complexity of
  Paxos, *I* for my part do think that an external toolset (i.e. one
  that lives outside the Nova codebase) is the better approach versus
  duplicating the functionality of said toolset in Nova.
 
  I just believe that the toolset that should be used here is
  Corosync/Pacemaker and not Ceilometer/Heat. And I believe the former
  approach leads to *much* fewer necessary code changes *in* Nova than
  the latter.
 
  Have you tried pacemaker_remote yet?  It seems like a better choice for
  this particular case, as opposed to using corosync, due to the potential
  number of compute nodes.
 
 I'll assume that you are *not* referring to running Corosync/Pacemaker
 on the compute nodes plus pacemaker_remote in the VMs, because doing
 so would blow up the separation between the cloud operator and tenant
 space.
 
 Running compute nodes as baremetal extensions of a different
 Corosync/Pacemaker cluster (presumably the one that manages the other
 Nova services)  would potentially be an option, although vendors would
 need to buy into this. Ubuntu, for example, currently only ships
 pacemaker-remote in universe.

This is something we'd be doing *to* OpenStack rather than *in* the OpenStack
projects (at least those that deliver code); in fact, that's a large part of
the appeal. As such, I don't know that there necessarily has to be one true
solution to rule them all; a distribution could deviate as needed, but we would
have some - ideally very small - number of known good configurations which
achieve the stated goal and are well documented.

Thanks,

Steve

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Florian Haas
On Thu, Oct 16, 2014 at 4:31 PM, Steve Gordon sgor...@redhat.com wrote:
 Running compute nodes as baremetal extensions of a different
 Corosync/Pacemaker cluster (presumably the one that manages the other
 Nova services)  would potentially be an option, although vendors would
 need to buy into this. Ubuntu, for example, currently only ships
 pacemaker-remote in universe.

 This is something we'd be doing *to* OpenStack rather than *in* the
 OpenStack projects (at least those that deliver code), in fact that's a large 
 part of the appeal. As such I don't know that there necessarily has to be one 
 true solution to rule them all, a distribution could deviate as needed, but 
 we would have some - ideally very small - number of known good 
 configurations which achieve the stated goal and are well documented.

Correct. In the infrastructure/service HA field, we already have that,
as vendors (with very few exceptions) have settled on
Corosync/Pacemaker for service availability, HAproxy for load
balancing, and MySQL/Galera for database replication, for example. It
would be great if we could see this kind of convergent evolution for
guest HA as well.

Cheers,
Florian

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Russell Bryant
On 10/16/2014 09:00 AM, Florian Haas wrote:
 On Thu, Oct 16, 2014 at 1:59 PM, Russell Bryant rbry...@redhat.com wrote:
 On 10/16/2014 04:29 AM, Florian Haas wrote:
 (5) Let monitoring and orchestration services deal with these use
 cases and
 have Nova simply provide the primitive API calls that it already does
 (i.e.
 host evacuate).

 That would arguably lead to an incredible amount of wheel reinvention
 for node failure detection, service failure detection, etc. etc.

 How so? (5) would use existing wheels for monitoring and orchestration
 instead of writing all new code paths inside Nova to do the same thing.

 Right, there may be some confusion here ... I thought you were both
 agreeing that the use of an external toolset was a good approach for the
 problem, but Florian's last message makes that not so clear ...

 While one of us (Jay or me) speaking for the other and saying we agree
 is a distributed consensus problem that dwarfs the complexity of
 Paxos, *I* for my part do think that an external toolset (i.e. one
 that lives outside the Nova codebase) is the better approach versus
 duplicating the functionality of said toolset in Nova.

 I just believe that the toolset that should be used here is
 Corosync/Pacemaker and not Ceilometer/Heat. And I believe the former
 approach leads to *much* fewer necessary code changes *in* Nova than
 the latter.

 Have you tried pacemaker_remote yet?  It seems like a better choice for
 this particular case, as opposed to using corosync, due to the potential
 number of compute nodes.
 
 I'll assume that you are *not* referring to running Corosync/Pacemaker
 on the compute nodes plus pacemaker_remote in the VMs, because doing
 so would blow up the separation between the cloud operator and tenant
 space.

Correct.

 Running compute nodes as baremetal extensions of a different
 Corosync/Pacemaker cluster (presumably the one that manages the other
 Nova services)  would potentially be an option, although vendors would
 need to buy into this. Ubuntu, for example, currently only ships
 pacemaker-remote in universe.

Yes, this is what I had in mind.

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Adam Lawson
Be forewarned; here's my two cents before I've had my morning coffee.

It would seem to me that if we were seeking some level of resiliency
against host failures (if a host fails, evacuate the instances that were
hosted on it to a host that isn't broken), then host HA is a good
approach. The ultimate goal of course is instance HA, but the task of
monitoring individual instances and determining what constitutes "down"
seems like a much more complex task than detecting when a compute node is
down. I know that requiring the presence of agents probably needs some
more brain-cycles, since we can't expect additional bytes consuming
memory on each individual VM.

Additionally, I'm not really hung up on the 'how', as we all realize there
are several ways to skin that cat, so long as that 'how' is leveraged via
tools over which we have control and direct influence. The reason being, we
may not want to base a feature as important as this on tools that change
outside our control and subsequently shift the foundation of the feature we
implemented, which was based on how the product USED to work. Basically, if
Pacemaker does what we need then cool, but it seems that implementing a
feature should be built upon a bedrock of programs over which we have
direct influence. This is why Nagios may be able to do it, but it's a hack
at best. I'm not saying Nagios isn't good or that the hack doesn't work,
but in the context of an OpenStack solution, we can't require a single
external tool for a feature like host or VM HA. Are we suggesting that we
tell people who want HA to go use Nagios? Call me a purist, but if we're
going to implement a feature, it should be our community implementing it,
because we have some of the best minds on staff. ; )



*Adam Lawson*

AQORN, Inc.
427 North Tatnall Street
Ste. 58461
Wilmington, Delaware 19801-2230
Toll-free: (844) 4-AQORN-NOW ext. 101
International: +1 302-387-4660
Direct: +1 916-246-2072


On Thu, Oct 16, 2014 at 7:53 AM, Russell Bryant rbry...@redhat.com wrote:

 On 10/16/2014 09:00 AM, Florian Haas wrote:
  On Thu, Oct 16, 2014 at 1:59 PM, Russell Bryant rbry...@redhat.com
 wrote:
  On 10/16/2014 04:29 AM, Florian Haas wrote:
  (5) Let monitoring and orchestration services deal with these use
  cases and
  have Nova simply provide the primitive API calls that it already
 does
  (i.e.
  host evacuate).
 
  That would arguably lead to an incredible amount of wheel
 reinvention
  for node failure detection, service failure detection, etc. etc.
 
  How so? (5) would use existing wheels for monitoring and
 orchestration
  instead of writing all new code paths inside Nova to do the same
 thing.
 
  Right, there may be some confusion here ... I thought you were both
  agreeing that the use of an external toolset was a good approach for
 the
  problem, but Florian's last message makes that not so clear ...
 
  While one of us (Jay or me) speaking for the other and saying we agree
  is a distributed consensus problem that dwarfs the complexity of
  Paxos, *I* for my part do think that an external toolset (i.e. one
  that lives outside the Nova codebase) is the better approach versus
  duplicating the functionality of said toolset in Nova.
 
  I just believe that the toolset that should be used here is
  Corosync/Pacemaker and not Ceilometer/Heat. And I believe the former
  approach leads to *much* fewer necessary code changes *in* Nova than
  the latter.
 
  Have you tried pacemaker_remote yet?  It seems like a better choice for
  this particular case, as opposed to using corosync, due to the potential
  number of compute nodes.
 
  I'll assume that you are *not* referring to running Corosync/Pacemaker
  on the compute nodes plus pacemaker_remote in the VMs, because doing
  so would blow up the separation between the cloud operator and tenant
  space.

 Correct.

  Running compute nodes as baremetal extensions of a different
  Corosync/Pacemaker cluster (presumably the one that manages the other
  Nova services)  would potentially be an option, although vendors would
  need to buy into this. Ubuntu, for example, currently only ships
  pacemaker-remote in universe.

 Yes, this is what I had in mind.

 --
 Russell Bryant

 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Russell Bryant
On 10/16/2014 01:03 PM, Adam Lawson wrote:
 Be forewarned; here's my two cents before I've had my morning coffee. 
 
 It would seem to me that if we were seeking some level of resiliency
 against host failures (if a host fails, evacuate the instances that were
 hosted on it to a host that isn't broken), it would seem that host HA is
 a good approach. The ultimate goal of course is instance HA but the task
 of monitoring individual instances and determining what constitutes
 down seems like a much more complex task than detecting when a compute
 node is down. I know that requiring the presence of agents should
 probably need some more brain-cycles since we can't expect additional
 bytes consuming memory on each individual VM.
 
 Additionally, I'm not really hung up on the 'how' as we all realize
 there are several ways to skin that cat, so long as that 'how' is leveraged
 via tools over which we have control and direct influence. Reason being,
 we may not want to leverage features as important as this on tools that
 change outside our control and subsequently shift the foundation of the
 feature we implemented that was based on how the product USED to work.
 Basically if Pacemaker does what we need then cool but it seems that
 implementing a feature should be built upon a bedrock of programs over
 which we have a direct influence. This is why Nagios may be able to do
 it but it's a hack at best. I'm not saying Nagios isn't good or the
 hack doesn't work but in the context of an Openstack solution, we can't
 require a single external tool for a feature like host or VM HA. Are we
 suggesting that we tell people who want HA - go use Nagios? Call me a
 purist but if we're going to implement a feature, it should be our
 community implementing it because we have some of the best minds on
 staff. ; )

I think you just gave a great example of NIH.  :-)

I was saying "use Pacemaker", not "use Nagios".  I'm not aware of
fencing integration with Nagios, but it's feasible.  The key point I've
been making is this is very achievable today as a function of the
infrastructure supporting an OpenStack deployment.  I'd also like to
work on some more detailed examples of doing so.

FWIW, there are existing very good relationships between OpenStack
community members and the Pacemaker team.  I'm really not concerned
about that at all.

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Florian Haas
On Thu, Oct 16, 2014 at 7:03 PM, Adam Lawson alaw...@aqorn.com wrote:

 Be forewarned; here's my two cents before I've had my morning coffee.

 It would seem to me that if we were seeking some level of resiliency against 
 host failures (if a host fails, evacuate the instances that were hosted on it 
 to a host that isn't broken), it would seem that host HA is a good approach. 
 The ultimate goal of course is instance HA but the task of monitoring 
 individual instances and determining what constitutes down seems like a 
 much more complex task than detecting when a compute node is down. I know 
 that requiring the presence of agents should probably need some more 
 brain-cycles since we can't expect additional bytes consuming memory on each 
 individual VM.

What Russell is suggesting, though, is actually a very feasible
approach for compute node HA today and per-instance HA tomorrow.

 Additionally, I'm not really hung up on the 'how' as we all realize there are
 several ways to skin that cat, so long as that 'how' is leveraged via tools 
 over which we have control and direct influence. Reason being, we may not 
 want to leverage features as important as this on tools that change outside 
 our control and subsequently shift the foundation of the feature we
 implemented that was based on how the product USED to work. Basically if 
 Pacemaker does what we need then cool but it seems that implementing a 
 feature should be built upon a bedrock of programs over which we have a 
 direct influence.

That almost sounds a bit like "let's always build a better wheel,
because control". I'm not sure if that's indeed the intention, but if
it is then that seems like a bad idea to me.

 This is why Nagios may be able to do it but it's a hack at best. I'm not 
 saying Nagios isn't good or the hack doesn't work but in the context of an
 Openstack solution, we can't require a single external tool for a feature 
 like host or VM HA. Are we suggesting that we tell people who want HA - go 
 use Nagios? Call me a purist but if we're going to implement a feature, it 
 should be our community implementing it because we have some of the best 
 minds on staff. ; )

Anyone who thinks that having a monitoring solution to page people and
then waking up a human to restart the service constitutes HA needs to
be doused in a bucket of ice water. :)

Cheers,
Florian

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Florian Haas
On Thu, Oct 16, 2014 at 7:48 PM, Jay Pipes jaypi...@gmail.com wrote:
 While one of us (Jay or me) speaking for the other and saying we agree
 is a distributed consensus problem that dwarfs the complexity of
 Paxos


 You've always had a way with words, Florian :)

I knew you'd like that one. :)

, *I* for my part do think that an external toolset (i.e. one

 that lives outside the Nova codebase) is the better approach versus
 duplicating the functionality of said toolset in Nova.

 I just believe that the toolset that should be used here is
 Corosync/Pacemaker and not Ceilometer/Heat. And I believe the former
 approach leads to *much* fewer necessary code changes *in* Nova than
 the latter.


 I agree with you that Corosync/Pacemaker is the tool of choice for
 monitoring/heartbeat functionality, and is my choice for compute-node-level
 HA monitoring. For guest-level HA monitoring, I would say use
 Heat/Ceilometer. For container-level HA monitoring, it looks like fleet or
 something like Kubernetes would be a good option.

Here's why I think that's a bad idea: none of these support the
concept of being subordinate to another cluster.

Again, suppose a VM stops responding. Then
Heat/Ceilometer/Kubernetes/fleet would need to know whether the node
hosting the VM is down or not. Only if the node is up or recovered
(which Pacemaker would be responsible for) would the VM HA facility be
able to kick in. Effectively you have two views of the cluster
membership, and that sort of thing always gets messy. In the HA space
we're always facing the same issues when a replication facility
(Galera, GlusterFS, DRBD, whatever) has a different view of the
cluster membership than the cluster manager itself — which *always*
happens for a few seconds on any failover, recovery, or fencing event.

Russell's suggestion, by having remote Pacemaker instances on the
compute nodes tie in with a Pacemaker cluster on the control nodes,
does away with that discrepancy.

 I'm curious to see how the combination of compute-node-level HA and
 container-level HA tools will work together in some of the proposed
 deployment architectures (bare metal + docker containers w/ OpenStack and
 infrastructure services run in a Kubernetes pod or CoreOS fleet).

I have absolutely nothing against an OpenStack cluster using
*exclusively* Kubernetes or fleet for HA management, once those have
reached sufficient maturity. But just about every significant
OpenStack distro out there has settled on Corosync/Pacemaker for the
time being. Let's not shove another cluster manager down their throats
for little to no real benefit.

Cheers,
Florian

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Adam Lawson
Okay the coffee kicked in.

I can see how my comment could be interpreted that way so let's take a step
backward so I can explain my perspective here.

Amazon was the first to implement a commercial-grade cloud IaaS; OpenStack
was developed as an alternative. If we avoided wheel re-invention as a
rule, OpenStack would never have been written. That's how I see it.
Automatic fail-over is already done by VMware. If we were looking to avoid
re-invention as our guide to implementing new features, we'd set up a
product referral partnership with VMware, tell our users that HA requires
VMware, dust off our hands and say "job well done". No one here is saying
that, though, but that's the mindset I think I'm hearing. I champion the
in-house approach not as an effort to develop something that doesn't exist
elsewhere, or for the sake of control, but because we don't want to be tied
to a single external product for a core feature of OpenStack.

When ProductA + ProductB = XYZ, it creates a one-way dependency that I
historically try to avoid, because if ProductA = OpenStack, then ProductB
is no longer optional.

Personally, I'm speaking more to our approach to how we scope features for
OpenStack than to whether we use Pacemaker, Nagios, Nova, Heat or something
else.

Question: is host HA not achievable using the programs we have in place now
(with modification, of course)? If not, I'm still a champion for seeing it
done within our four walls.

Just my 10c or so. ; )



*Adam Lawson*

AQORN, Inc.
427 North Tatnall Street
Ste. 58461
Wilmington, Delaware 19801-2230
Toll-free: (844) 4-AQORN-NOW ext. 101
International: +1 302-387-4660
Direct: +1 916-246-2072


On Thu, Oct 16, 2014 at 10:53 AM, Florian Haas flor...@hastexo.com wrote:

 On Thu, Oct 16, 2014 at 7:03 PM, Adam Lawson alaw...@aqorn.com wrote:
 
  Be forewarned; here's my two cents before I've had my morning coffee.
 
  It would seem to me that if we were seeking some level of resiliency
 against host failures (if a host fails, evacuate the instances that were
 hosted on it to a host that isn't broken), it would seem that host HA is a
 good approach. The ultimate goal of course is instance HA but the task of
 monitoring individual instances and determining what constitutes down
 seems like a much more complex task than detecting when a compute node is
 down. I know that requiring the presence of agents should probably need
 some more brain-cycles since we can't expect additional bytes consuming
 memory on each individual VM.

 What Russell is suggesting, though, is actually a very feasible
 approach for compute node HA today and per-instance HA tomorrow.

  Additionally, I'm not really hung up on the 'how' as we all realize
 there are several ways to skin that cat, so long as that 'how' is leveraged via
 tools over which we have control and direct influence. Reason being, we may
 not want to leverage features as important as this on tools that change
 outside our control and subsequently shift the foundation of the feature
 we implemented that was based on how the product USED to work. Basically if
 Pacemaker does what we need then cool but it seems that implementing a
 feature should be built upon a bedrock of programs over which we have a
 direct influence.

 That almost sounds a bit like let's always build a better wheel,
 because control. I'm not sure if that's indeed the intention, but if
 it is then that seems like a bad idea to me.

  This is why Nagios may be able to do it but it's a hack at best. I'm not
 saying Nagios isn't good or the hack doesn't work but in the context of an
 Openstack solution, we can't require a single external tool for a feature
 like host or VM HA. Are we suggesting that we tell people who want HA - go
 use Nagios? Call me a purist but if we're going to implement a feature, it
 should be our community implementing it because we have some of the best
 minds on staff. ; )

 Anyone who thinks that having a monitoring solution to page people and
 then waking up a human to restart the service constitutes HA needs to
 be doused in a bucket of ice water. :)

 Cheers,
 Florian

 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Russell Bryant
On 10/16/2014 02:40 PM, Adam Lawson wrote:
 Question: is host HA not achievable using the programs we have in place
 now (with modification of course)? If not, I'm still a champion to see
 it done within our four walls.

Yes, it is achievable (without modification, even).

That was the primary point of:

  http://blog.russellbryant.net/2014/10/15/openstack-instance-ha-proposal/

I think there's work to do to build up a reference configuration, test
it out, and document it.  I believe all the required software exists and
is already in use in many OpenStack deployments for other reasons.

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-16 Thread Florian Haas
On Thu, Oct 16, 2014 at 9:40 PM, Russell Bryant rbry...@redhat.com wrote:
 On 10/16/2014 02:40 PM, Adam Lawson wrote:
 Question: is host HA not achievable using the programs we have in place
 now (with modification of course)? If not, I'm still a champion to see
 it done within our four walls.

 Yes, it is achievable (without modification, even).

 That was the primary point of:

   http://blog.russellbryant.net/2014/10/15/openstack-instance-ha-proposal/

 I think there's work to do to build up a reference configuration, test
 it out, and document it.  I believe all the required software exists and
 is already in use in many OpenStack deployments for other reasons.

+1.

Florian

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-15 Thread Jastrzebski, Michal
I tend to agree that this shouldn't be placed in Nova. As it happens, I'm
working on the very same thing (hello Russell :)). My current candidate is
Heat; convergence will, in my opinion, be a great place to do it
(https://review.openstack.org/#/c/95907/). It's still in the planning stage,
but we'll talk about that more in Paris. I even have a working demo of
automatic evacuation :) (come to the Intel booth in Paris if you'd like to
see it).

The thing is, Nova currently isn't ready for that. For example:
https://bugs.launchpad.net/nova/+bug/1379292
We are working on a blueprint to enable Nova to check actual host health, not
only Nova service health (the blueprint is coming soon, but in short it
enables the ZooKeeper servicegroup API to monitor, for example, libvirt, or
something else which, if down, means the VMs are dead).
That won't replace actual fencing, but it's something, and even if we would
like to have fencing in Nova, it's a requirement.

Maybe it's worth a design session? I've seen this or a similar idea in several
places already, and demand for it is strong.

Regards,
Michał

 -Original Message-
 From: Russell Bryant [mailto:rbry...@redhat.com]
 Sent: Tuesday, October 14, 2014 8:55 PM
 To: openstack-dev@lists.openstack.org
 Subject: Re: [openstack-dev] [Nova] Automatic evacuate
 
 On 10/14/2014 01:01 PM, Jay Pipes wrote:
  2) Looking forward, there is a lot of demand for doing this on a per
  instance basis.  We should decide on a best practice for allowing end
  users to indicate whether they would like their VMs automatically
  rescued by the infrastructure, or just left down in the case of a
  failure.  It could be as simple as a special tag set on an instance [2].
 
  Please note that server instance tagging (thanks for the shout-out,
  BTW) is intended for only user-defined tags, not system-defined
  metadata which is what this sounds like...
 
 I was envisioning the tag being set by the end user to say "please keep my
 VM running until I say otherwise", or something like "auto-recover"
 for short.
 
 So, it's specified by the end user, but potentially acted upon by the system
 (as you say below).
 
  Of course, one might implement some external polling/monitoring system
  using server instance tags, which might do a nova list --tag $TAG
  --host $FAILING_HOST, and initiate a migrate for each returned server
 instance...
 
 Yeah, that's what I was thinking.  Whatever system you use to react to a
 failing host could use the tag as part of the criteria to figure out which
 instances to evacuate and which to leave as dead.
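
 As a purely hypothetical sketch of that polling/monitoring loop (server tags
 were only a proposal at this point, so the tag filter below is illustrative;
 the novaclient calls, evacuate() signature and credentials are assumptions
 to verify against your client version):

    # Hypothetical external recovery loop: find instances tagged for
    # auto-recovery on a failed (and already fenced!) host, then evacuate
    # them via the existing evacuate primitive.
    from novaclient import client

    nova = client.Client("2", "admin", "secret", "admin",
                         "http://keystone.example.com:5000/v2.0")

    def evacuate_tagged_instances(failing_host, tag="auto-recover"):
        servers = nova.servers.list(
            search_opts={"host": failing_host, "all_tenants": 1, "tag": tag})
        for server in servers:
            # Let the scheduler pick a new host; assumes shared storage.
            nova.servers.evacuate(server, on_shared_storage=True)

    # evacuate_tagged_instances("compute-03")  # call only after fencing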
 
 --
 Russell Bryant
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-15 Thread Russell Bryant
On 10/13/2014 05:59 PM, Russell Bryant wrote:
 Nice timing.  I was working on a blog post on this topic.

which is now here:

http://blog.russellbryant.net/2014/10/15/openstack-instance-ha-proposal/

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-15 Thread Florian Haas
On Wed, Oct 15, 2014 at 7:20 PM, Russell Bryant rbry...@redhat.com wrote:
 On 10/13/2014 05:59 PM, Russell Bryant wrote:
 Nice timing.  I was working on a blog post on this topic.

 which is now here:

 http://blog.russellbryant.net/2014/10/15/openstack-instance-ha-proposal/

I am absolutely loving the fact that we are finally having a
discussion in earnest about this. I think this deserves a Design
Summit session.

If I may weigh in here, let me share what I've seen users do and what
can currently be done, and what may be supported in the future.

Problem: automatically ensure that a Nova guest continues to run, even
if its host fails.

(That's the general problem description and I don't need to go into
further details explaining the problem, because Russell has done that
beautifully in his blog post.)

Now, what are the options?

(1) Punt and leave it to the hypervisor.

This essentially means that you must use a hypervisor that already has
HA built in, such as VMware with the VCenter driver. In that scenario,
Nova itself neither deals with HA, nor exposes any HA switches to the
user. Obvious downside: not generic, doesn't work with all
hypervisors, most importantly doesn't work with the most popular one
(libvirt/KVM).

(2) Deploy Nova nodes in pairs/groups, and pretend that they are one node.

You can already do that by overriding host in nova-compute.conf,
setting resume_guests_state_on_host_boot, and using VIPs with
Corosync/Pacemaker. You can then group these hosts in host aggregates,
and the user's scheduler hint to point a newly scheduled guest to such
a host aggregate becomes, effectively, the "keep this guest running at
all times" flag. Upside: no changes to Nova at all, monitoring,
fencing and recovery for free from Corosync/Pacemaker. Downsides:
requires vendors to automate Pacemaker configuration in deployment
tools (because you really don't want to do those things manually).
Additional downside: you either have some idle hardware, or you might
be overcommitting resources in case of failover.

(3) Automatic host evacuation.

Not supported in Nova right now, as Adam pointed out at the top of the
thread, and repeatedly shot down. If someone were to implement this,
it would *still* require that Corosync/Pacemaker be used for
monitoring and fencing of nodes, because re-implementing this from
scratch would be the reinvention of a wheel while painting a bikeshed.

(4) Per-guest HA.

This is the idea of just doing nova boot --keep-this-running, i.e.
setting a per-guest flag that still means the machine is to be kept up
at all times. Again, not supported in Nova right now, and probably
even more complex to implement generically than (3), at the same or
greater cost.

I have a suggestion to tackle this that I *think* is reasonably
user-friendly while still bearable in terms of Nova development
effort:

(a) Define a well-known metadata key for a host aggregate, say ha.
Define that any host aggregate that represents a highly available
group of compute nodes should have this metadata key set.

(b) Then define a flavor that sets extra_specs ha=true.
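
As a hedged sketch of (a) and (b) using python-novaclient (host names,
credentials and the exact extra-spec key are illustrative; the existing
AggregateInstanceExtraSpecsFilter is one way the scheduler could match the
flavor key against the aggregate metadata):

    # Sketch only: an "ha" host aggregate plus a flavor whose extra spec
    # requests it, so such guests land only on Pacemaker-managed hosts.
    from novaclient import client

    nova = client.Client("2", "admin", "secret", "admin",
                         "http://keystone.example.com:5000/v2.0")

    # (a) the aggregate of HA-capable compute nodes
    agg = nova.aggregates.create("ha-hosts", None)
    nova.aggregates.set_metadata(agg, {"ha": "true"})
    nova.aggregates.add_host(agg, "compute-01")

    # (b) a flavor that requests placement on those hosts
    flavor = nova.flavors.create("m1.small.ha", ram=2048, vcpus=1, disk=20)
    flavor.set_keys({"ha": "true"})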

Granted, this places an additional burden on distro vendors to
integrate highly-available compute nodes into their deployment
infrastructure. But since practically all of them already include
Pacemaker, the additional scaffolding required is actually rather
limited.

Am I making sense?

Cheers,
Florian

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-15 Thread Jay Pipes

On 10/15/2014 03:16 PM, Florian Haas wrote:

On Wed, Oct 15, 2014 at 7:20 PM, Russell Bryant rbry...@redhat.com wrote:

On 10/13/2014 05:59 PM, Russell Bryant wrote:

Nice timing.  I was working on a blog post on this topic.


which is now here:

http://blog.russellbryant.net/2014/10/15/openstack-instance-ha-proposal/


I am absolutely loving the fact that we are finally having a
discussion in earnest about this. i think this deserves a Design
Summit session.

If I may weigh in here, let me share what I've seen users do and what
can currently be done, and what may be supported in the future.

Problem: automatically ensure that a Nova guest continues to run, even
if its host fails.

(That's the general problem description and I don't need to go into
further details explaining the problem, because Russell has done that
beautifully in his blog post.)

Now, what are the options?

(1) Punt and leave it to the hypervisor.

This essentially means that you must use a hypervisor that already has
HA built in, such as VMware with the VCenter driver. In that scenario,
Nova itself neither deals with HA, nor exposes any HA switches to the
user. Obvious downside: not generic, doesn't work with all
hypervisors, most importantly doesn't work with the most popular one
(libvirt/KVM).

(2) Deploy Nova nodes in pairs/groups, and pretend that they are one node.

You can already do that by overriding host in nova-compute.conf,
setting resume_guests_state_on_host_boot, and using VIPs with
Corosync/Pacemaker. You can then group these hosts in host aggregates,
and the user's scheduler hint to point a newly scheduled guest to such
a host aggregate becomes, effectively, the keep this guest running at
all times flag. Upside: no changes to Nova at all, monitoring,
fencing and recovery for free from Corosync/Pacemaker. Downsides:
requires vendors to automate Pacemaker configuration in deployment
tools (because you really don't want to do those things manually).
Additional downside: you either have some idle hardware, or you might
be overcommitting resources in case of failover.

(3) Automatic host evacuation.

Not supported in Nova right now, as Adam pointed out at the top of the
thread, and repeatedly shot down. If someone were to implement this,
it would *still* require that Corosync/Pacemaker be used for
monitoring and fencing of nodes, because re-implementing this from
scratch would be the reinvention of a wheel while painting a bikeshed.

(4) Per-guest HA.

This is the idea of just doing nova boot --keep-this running, i.e.
setting a per-guest flag that still means the machine is to be kept up
at all times. Again, not supported in Nova right now, and probably
even more complex to implement generically than (3), at the same or
greater cost.

I have a suggestion to tackle this that I *think* is reasonably
user-friendly while still bearable in terms of Nova development
effort:

(a) Define a well-known metadata key for a host aggregate, say ha.
Define that any host aggregate that represents a highly available
group of compute nodes should have this metadata key set.

(b) Then define a flavor that sets extra_specs ha=true.

Granted, this places an additional burden on distro vendors to
integrate highly-available compute nodes into their deployment
infrastructure. But since practically all of them already include
Pacemaker, the additional scaffolding required is actually rather
limited.


Or:

(5) Let monitoring and orchestration services deal with these use cases 
and have Nova simply provide the primitive API calls that it already 
does (i.e. host evacuate).


Best,
-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-15 Thread Florian Haas
On Wed, Oct 15, 2014 at 9:58 PM, Jay Pipes jaypi...@gmail.com wrote:
 On 10/15/2014 03:16 PM, Florian Haas wrote:

 On Wed, Oct 15, 2014 at 7:20 PM, Russell Bryant rbry...@redhat.com
 wrote:

 On 10/13/2014 05:59 PM, Russell Bryant wrote:

 Nice timing.  I was working on a blog post on this topic.


 which is now here:

 http://blog.russellbryant.net/2014/10/15/openstack-instance-ha-proposal/


 I am absolutely loving the fact that we are finally having a
 discussion in earnest about this. i think this deserves a Design
 Summit session.

 If I may weigh in here, let me share what I've seen users do and what
 can currently be done, and what may be supported in the future.

 Problem: automatically ensure that a Nova guest continues to run, even
 if its host fails.

 (That's the general problem description and I don't need to go into
 further details explaining the problem, because Russell has done that
 beautifully in his blog post.)

 Now, what are the options?

 (1) Punt and leave it to the hypervisor.

 This essentially means that you must use a hypervisor that already has
 HA built in, such as VMware with the VCenter driver. In that scenario,
 Nova itself neither deals with HA, nor exposes any HA switches to the
 user. Obvious downside: not generic, doesn't work with all
 hypervisors, most importantly doesn't work with the most popular one
 (libvirt/KVM).

 (2) Deploy Nova nodes in pairs/groups, and pretend that they are one node.

 You can already do that by overriding host in nova-compute.conf,
 setting resume_guests_state_on_host_boot, and using VIPs with
 Corosync/Pacemaker. You can then group these hosts in host aggregates,
 and the user's scheduler hint to point a newly scheduled guest to such
 a host aggregate becomes, effectively, the keep this guest running at
 all times flag. Upside: no changes to Nova at all, monitoring,
 fencing and recovery for free from Corosync/Pacemaker. Downsides:
 requires vendors to automate Pacemaker configuration in deployment
 tools (because you really don't want to do those things manually).
 Additional downside: you either have some idle hardware, or you might
 be overcommitting resources in case of failover.

 (3) Automatic host evacuation.

 Not supported in Nova right now, as Adam pointed out at the top of the
 thread, and repeatedly shot down. If someone were to implement this,
 it would *still* require that Corosync/Pacemaker be used for
 monitoring and fencing of nodes, because re-implementing this from
 scratch would be the reinvention of a wheel while painting a bikeshed.

 (4) Per-guest HA.

 This is the idea of just doing nova boot --keep-this running, i.e.
 setting a per-guest flag that still means the machine is to be kept up
 at all times. Again, not supported in Nova right now, and probably
 even more complex to implement generically than (3), at the same or
 greater cost.

 I have a suggestion to tackle this that I *think* is reasonably
 user-friendly while still bearable in terms of Nova development
 effort:

 (a) Define a well-known metadata key for a host aggregate, say ha.
 Define that any host aggregate that represents a highly available
 group of compute nodes should have this metadata key set.

 (b) Then define a flavor that sets extra_specs ha=true.

 Granted, this places an additional burden on distro vendors to
 integrate highly-available compute nodes into their deployment
 infrastructure. But since practically all of them already include
 Pacemaker, the additional scaffolding required is actually rather
 limited.


 Or:

 (5) Let monitoring and orchestration services deal with these use cases and
 have Nova simply provide the primitive API calls that it already does (i.e.
 host evacuate).

That would arguably lead to an incredible amount of wheel reinvention
for node failure detection, service failure detection, etc. etc.

Florian

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-15 Thread Florian Haas
On Wed, Oct 15, 2014 at 10:03 PM, Russell Bryant rbry...@redhat.com wrote:
 Am I making sense?

 Yep, the downside is just that you need to provide a new set of flavors
 for ha vs non-ha.  A benefit though is that it's a way to support it
 today without *any* changes to OpenStack.

Users are already very used to defining new flavors. Nova itself
wouldn't even need to define those; if the vendor's deployment tools
defined them it would be just fine.

 This seems like the kind of thing we should also figure out how to offer
 on a per-guest basis without needing a new set of flavors.  That's why I
 also listed the server tagging functionality as another possible solution.

This still doesn't do away with the requirement to reliably detect
node failure, and to fence misbehaving nodes. Detecting that a node
has failed, and fencing it if unsure, is a prerequisite for any
recovery action. So you need Corosync/Pacemaker anyway.

Note also that when using an approach where you have physically
clustered nodes, but you are also running non-HA VMs on those, then
the user must understand that the following applies:

(1) If your guest is marked HA, then it will automatically recover on
node failure, but
(2) if your guest is *not* marked HA, then it will go down with the
node not only if it fails, but also if it is fenced.

So a non-HA guest on an HA node group actually has a slightly
*greater* chance of going down than a non-HA guest on a non-HA host.
(And let's not get into "don't use fencing then"; we all know why
that's a bad idea.)

Which is why I think it makes sense to just distinguish between
HA-capable and non-HA-capable hosts, and have the user decide whether
they want HA or non-HA guests simply by assigning them to the
appropriate host aggregates.

Cheers,
Florian

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-15 Thread Adam Lawson
It would seem to me that if guest HA is highly desired, and it is,
requiring multiple flavors for multiple SLA requirements (and that's what
we're really talking about) introduces a trade-off that conceivably isn't
needed: double the flavor count for the same spec (512/1/10 and another
for HA). I'd like to explore this a little further to define other
possibilities.

I like the idea of instance HA; I like the idea of host HA way better
because it protects every instance on it. And hosts with HA logic would
obviously not be allowed to only host instances that use shared storage.

What are our options to continue discussing in Paris?


*Adam Lawson*

AQORN, Inc.
427 North Tatnall Street
Ste. 58461
Wilmington, Delaware 19801-2230
Toll-free: (844) 4-AQORN-NOW ext. 101
International: +1 302-387-4660
Direct: +1 916-246-2072


On Wed, Oct 15, 2014 at 1:50 PM, Florian Haas flor...@hastexo.com wrote:

 On Wed, Oct 15, 2014 at 9:58 PM, Jay Pipes jaypi...@gmail.com wrote:
  On 10/15/2014 03:16 PM, Florian Haas wrote:
 
  On Wed, Oct 15, 2014 at 7:20 PM, Russell Bryant rbry...@redhat.com
  wrote:
 
  On 10/13/2014 05:59 PM, Russell Bryant wrote:
 
  Nice timing.  I was working on a blog post on this topic.
 
 
  which is now here:
 
 
 http://blog.russellbryant.net/2014/10/15/openstack-instance-ha-proposal/
 
 
  I am absolutely loving the fact that we are finally having a
  discussion in earnest about this. i think this deserves a Design
  Summit session.
 
  If I may weigh in here, let me share what I've seen users do and what
  can currently be done, and what may be supported in the future.
 
  Problem: automatically ensure that a Nova guest continues to run, even
  if its host fails.
 
  (That's the general problem description and I don't need to go into
  further details explaining the problem, because Russell has done that
  beautifully in his blog post.)
 
  Now, what are the options?
 
  (1) Punt and leave it to the hypervisor.
 
  This essentially means that you must use a hypervisor that already has
  HA built in, such as VMware with the VCenter driver. In that scenario,
  Nova itself neither deals with HA, nor exposes any HA switches to the
  user. Obvious downside: not generic, doesn't work with all
  hypervisors, most importantly doesn't work with the most popular one
  (libvirt/KVM).
 
  (2) Deploy Nova nodes in pairs/groups, and pretend that they are one
 node.
 
  You can already do that by overriding host in nova-compute.conf,
  setting resume_guests_state_on_host_boot, and using VIPs with
  Corosync/Pacemaker. You can then group these hosts in host aggregates,
  and the user's scheduler hint to point a newly scheduled guest to such
  a host aggregate becomes, effectively, the keep this guest running at
  all times flag. Upside: no changes to Nova at all, monitoring,
  fencing and recovery for free from Corosync/Pacemaker. Downsides:
  requires vendors to automate Pacemaker configuration in deployment
  tools (because you really don't want to do those things manually).
  Additional downside: you either have some idle hardware, or you might
  be overcommitting resources in case of failover.
 
  (3) Automatic host evacuation.
 
  Not supported in Nova right now, as Adam pointed out at the top of the
  thread, and repeatedly shot down. If someone were to implement this,
  it would *still* require that Corosync/Pacemaker be used for
  monitoring and fencing of nodes, because re-implementing this from
  scratch would be the reinvention of a wheel while painting a bikeshed.
 
  (4) Per-guest HA.
 
  This is the idea of just doing nova boot --keep-this running, i.e.
  setting a per-guest flag that still means the machine is to be kept up
  at all times. Again, not supported in Nova right now, and probably
  even more complex to implement generically than (3), at the same or
  greater cost.
 
  I have a suggestion to tackle this that I *think* is reasonably
  user-friendly while still bearable in terms of Nova development
  effort:
 
  (a) Define a well-known metadata key for a host aggregate, say ha.
  Define that any host aggregate that represents a highly available
  group of compute nodes should have this metadata key set.
 
  (b) Then define a flavor that sets extra_specs ha=true.
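
  A rough sketch of what (a) and (b) could look like with the existing CLI;
  the aggregate, flavor and host names are made up, and the scheduler would
  need AggregateInstanceExtraSpecsFilter enabled so the flavor spec is
  matched against the aggregate metadata:

      nova aggregate-create ha-pool
      nova aggregate-set-metadata ha-pool ha=true
      nova aggregate-add-host ha-pool compute-01
      nova flavor-create m1.small.ha auto 2048 20 1
      nova flavor-key m1.small.ha set aggregate_instance_extra_specs:ha=true

  Guests booted with that flavor then land only on hosts in the ha-pool
  aggregate.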
 
  Granted, this places an additional burden on distro vendors to
  integrate highly-available compute nodes into their deployment
  infrastructure. But since practically all of them already include
  Pacemaker, the additional scaffolding required is actually rather
  limited.
 
 
  Or:
 
  (5) Let monitoring and orchestration services deal with these use cases
 and
  have Nova simply provide the primitive API calls that it already does
 (i.e.
  host evacuate).

 That would arguably lead to an incredible amount of wheel reinvention
 for node failure detection, service failure detection, etc. etc.

 Florian

 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 

Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-15 Thread Jay Pipes



On 10/15/2014 04:50 PM, Florian Haas wrote:

On Wed, Oct 15, 2014 at 9:58 PM, Jay Pipes jaypi...@gmail.com wrote:

On 10/15/2014 03:16 PM, Florian Haas wrote:


On Wed, Oct 15, 2014 at 7:20 PM, Russell Bryant rbry...@redhat.com
wrote:


On 10/13/2014 05:59 PM, Russell Bryant wrote:


Nice timing.  I was working on a blog post on this topic.



which is now here:

http://blog.russellbryant.net/2014/10/15/openstack-instance-ha-proposal/



I am absolutely loving the fact that we are finally having a
discussion in earnest about this. I think this deserves a Design
Summit session.

If I may weigh in here, let me share what I've seen users do and what
can currently be done, and what may be supported in the future.

Problem: automatically ensure that a Nova guest continues to run, even
if its host fails.

(That's the general problem description and I don't need to go into
further details explaining the problem, because Russell has done that
beautifully in his blog post.)

Now, what are the options?

(1) Punt and leave it to the hypervisor.

This essentially means that you must use a hypervisor that already has
HA built in, such as VMware with the VCenter driver. In that scenario,
Nova itself neither deals with HA, nor exposes any HA switches to the
user. Obvious downside: not generic, doesn't work with all
hypervisors, most importantly doesn't work with the most popular one
(libvirt/KVM).

(2) Deploy Nova nodes in pairs/groups, and pretend that they are one node.

You can already do that by overriding host in nova-compute.conf,
setting resume_guests_state_on_host_boot, and using VIPs with
Corosync/Pacemaker. You can then group these hosts in host aggregates,
and the user's scheduler hint to point a newly scheduled guest to such
a host aggregate becomes, effectively, the "keep this guest running at
all times" flag. Upside: no changes to Nova at all, monitoring,
fencing and recovery for free from Corosync/Pacemaker. Downsides:
requires vendors to automate Pacemaker configuration in deployment
tools (because you really don't want to do those things manually).
Additional downside: you either have some idle hardware, or you might
be overcommitting resources in case of failover.

(3) Automatic host evacuation.

Not supported in Nova right now, as Adam pointed out at the top of the
thread, and repeatedly shot down. If someone were to implement this,
it would *still* require that Corosync/Pacemaker be used for
monitoring and fencing of nodes, because re-implementing this from
scratch would be the reinvention of a wheel while painting a bikeshed.

(4) Per-guest HA.

This is the idea of just doing nova boot --keep-this running, i.e.
setting a per-guest flag that still means the machine is to be kept up
at all times. Again, not supported in Nova right now, and probably
even more complex to implement generically than (3), at the same or
greater cost.

I have a suggestion to tackle this that I *think* is reasonably
user-friendly while still bearable in terms of Nova development
effort:

(a) Define a well-known metadata key for a host aggregate, say ha.
Define that any host aggregate that represents a highly available
group of compute nodes should have this metadata key set.

(b) Then define a flavor that sets extra_specs ha=true.

Granted, this places an additional burden on distro vendors to
integrate highly-available compute nodes into their deployment
infrastructure. But since practically all of them already include
Pacemaker, the additional scaffolding required is actually rather
limited.



Or:

(5) Let monitoring and orchestration services deal with these use cases and
have Nova simply provide the primitive API calls that it already does (i.e.
host evacuate).


That would arguably lead to an incredible amount of wheel reinvention
for node failure detection, service failure detection, etc. etc.


How so? (5) would use existing wheels for monitoring and orchestration 
instead of writing all new code paths inside Nova to do the same thing.


Best,
-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-15 Thread Russell Bryant
On 10/15/2014 06:30 PM, Jay Pipes wrote:
 
 
 On 10/15/2014 04:50 PM, Florian Haas wrote:
 On Wed, Oct 15, 2014 at 9:58 PM, Jay Pipes jaypi...@gmail.com wrote:
 On 10/15/2014 03:16 PM, Florian Haas wrote:

 On Wed, Oct 15, 2014 at 7:20 PM, Russell Bryant rbry...@redhat.com
 wrote:

 On 10/13/2014 05:59 PM, Russell Bryant wrote:

 Nice timing.  I was working on a blog post on this topic.


 which is now here:

 http://blog.russellbryant.net/2014/10/15/openstack-instance-ha-proposal/



 I am absolutely loving the fact that we are finally having a
 discussion in earnest about this. I think this deserves a Design
 Summit session.

 If I may weigh in here, let me share what I've seen users do and what
 can currently be done, and what may be supported in the future.

 Problem: automatically ensure that a Nova guest continues to run, even
 if its host fails.

 (That's the general problem description and I don't need to go into
 further details explaining the problem, because Russell has done that
 beautifully in his blog post.)

 Now, what are the options?

 (1) Punt and leave it to the hypervisor.

 This essentially means that you must use a hypervisor that already has
 HA built in, such as VMware with the VCenter driver. In that scenario,
 Nova itself neither deals with HA, nor exposes any HA switches to the
 user. Obvious downside: not generic, doesn't work with all
 hypervisors, most importantly doesn't work with the most popular one
 (libvirt/KVM).

 (2) Deploy Nova nodes in pairs/groups, and pretend that they are one
 node.

 You can already do that by overriding host in nova-compute.conf,
 setting resume_guests_state_on_host_boot, and using VIPs with
 Corosync/Pacemaker. You can then group these hosts in host aggregates,
 and the user's scheduler hint to point a newly scheduled guest to such
 a host aggregate becomes, effectively, the "keep this guest running at
 all times" flag. Upside: no changes to Nova at all, monitoring,
 fencing and recovery for free from Corosync/Pacemaker. Downsides:
 requires vendors to automate Pacemaker configuration in deployment
 tools (because you really don't want to do those things manually).
 Additional downside: you either have some idle hardware, or you might
 be overcommitting resources in case of failover.

 (3) Automatic host evacuation.

 Not supported in Nova right now, as Adam pointed out at the top of the
 thread, and repeatedly shot down. If someone were to implement this,
 it would *still* require that Corosync/Pacemaker be used for
 monitoring and fencing of nodes, because re-implementing this from
 scratch would be the reinvention of a wheel while painting a bikeshed.

 (4) Per-guest HA.

 This is the idea of just doing nova boot --keep-this running, i.e.
 setting a per-guest flag that still means the machine is to be kept up
 at all times. Again, not supported in Nova right now, and probably
 even more complex to implement generically than (3), at the same or
 greater cost.

 I have a suggestion to tackle this that I *think* is reasonably
 user-friendly while still bearable in terms of Nova development
 effort:

 (a) Define a well-known metadata key for a host aggregate, say ha.
 Define that any host aggregate that represents a highly available
 group of compute nodes should have this metadata key set.

 (b) Then define a flavor that sets extra_specs ha=true.

 Granted, this places an additional burden on distro vendors to
 integrate highly-available compute nodes into their deployment
 infrastructure. But since practically all of them already include
 Pacemaker, the additional scaffolding required is actually rather
 limited.


 Or:

 (5) Let monitoring and orchestration services deal with these use
 cases and
 have Nova simply provide the primitive API calls that it already does
 (i.e.
 host evacuate).

 That would arguably lead to an incredible amount of wheel reinvention
 for node failure detection, service failure detection, etc. etc.
 
 How so? (5) would use existing wheels for monitoring and orchestration
 instead of writing all new code paths inside Nova to do the same thing.

Right, there may be some confusion here ... I thought you were both
agreeing that the use of an external toolset was a good approach for the
problem, but Florian's last message makes that not so clear ...

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-15 Thread Russell Bryant
On 10/15/2014 05:07 PM, Florian Haas wrote:
 On Wed, Oct 15, 2014 at 10:03 PM, Russell Bryant rbry...@redhat.com wrote:
 Am I making sense?

 Yep, the downside is just that you need to provide a new set of flavors
 for ha vs non-ha.  A benefit though is that it's a way to support it
 today without *any* changes to OpenStack.
 
 Users are already very used to defining new flavors. Nova itself
 wouldn't even need to define those; if the vendor's deployment tools
 defined them it would be just fine.

Yes, I know Nova wouldn't need to define it.  I was saying I didn't like
that it was required at all.

 This seems like the kind of thing we should also figure out how to offer
 on a per-guest basis without needing a new set of flavors.  That's why I
 also listed the server tagging functionality as another possible solution.
 
 This still doesn't do away with the requirement to reliably detect
 node failure, and to fence misbehaving nodes. Detecting that a node
 has failed, and fencing it if unsure, is a prerequisite for any
 recovery action. So you need Corosync/Pacemaker anyway.

Obviously, yes.  My post covered all of that directly ... the tagging
bit was just additional input into the recovery operation.

 Note also that when using an approach where you have physically
 clustered nodes, but you are also running non-HA VMs on those, then
 the user must understand that the following applies:
 
 (1) If your guest is marked HA, then it will automatically recover on
 node failure, but
 (2) if your guest is *not* marked HA, then it will go down with the
 node not only if it fails, but also if it is fenced.
 
 So a non-HA guest on an HA node group actually has a slightly
 *greater* chance of going down than a non-HA guest on a non-HA host.
 (And let's not get into "don't use fencing then"; we all know why
 that's a bad idea.)
 
 Which is why I think it makes sense to just distinguish between
 HA-capable and non-HA-capable hosts, and have the user decide whether
 they want HA or non-HA guests simply by assigning them to the
 appropriate host aggregates.

Very good point.  I hadn't considered that.

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-14 Thread Sylvain Bauza


On 14/10/2014 01:46, Adam Lawson wrote:


I think Adam is talking about this bp:
https://blueprints.launchpad.net/nova/+spec/evacuate-instance-automatically


Correct - yes. Sorry about that. ; )

So it would seem the question is not whether to support auto-evac but 
how it should be handled. If not handled by Nova, it gets complicated. 
Asking a user to configure a custom Nagios trigger/action... not sure 
if we'd recommend that as our definition of ideal.


  * I can foresee Congress being used to control whether auto-evac is
required and what other policies come into play by virtue of an
unplanned host removal from service. But that seems like a bit of
overkill.
  * I can foresee Nova/scheduler being used to perform the evac
itself. Are they still pushing back?
  * I can foresee Ceilometer being used to capture service state and
define how long a node should be inaccessible before it's
considered offline. But that seems a bit out of scope for what
Ceilometer was meant to do.



Well, IMHO Gantt should just enforce policies (possibly defined by 
Congress or whatever else), so if a condition is not met (here, HA on a 
VM) it should issue a reschedule. That said, Gantt is not responsible 
for polling all events and updating its internal view; that is the job 
of another project, which should send those metrics to it.


I don't have a preference between Heat, Ceilometer or anything else 
for notifying Gantt. Whatever the solution is (even a Nagios handler), 
in the end it is Gantt that would trigger the evacuation by calling 
Nova to fence that compute node and move the VM to another host (as 
rescheduling already does, but in a manual way).



-Sylvain


I'm all for making a simple task like this super easy though, at 
least so the settings are all defined in one place. Nova seems logical, 
but I'm wondering if there is still resistance.


So curious; how are these higher-level discussions 
initiated/facilitated? TC?


I proposed a cross-project session for the Paris Summit about scheduling 
and Gantt (yet to be accepted); that use case could be discussed there.


-Sylvain



*Adam Lawson*

AQORN, Inc.
427 North Tatnall Street
Ste. 58461
Wilmington, Delaware 19801-2230
Toll-free: (844) 4-AQORN-NOW ext. 101
International: +1 302-387-4660
Direct: +1 916-246-2072


On Mon, Oct 13, 2014 at 3:21 PM, Russell Bryant rbry...@redhat.com wrote:


On 10/13/2014 06:18 PM, Jay Lau wrote:
 This is also a use case for Congress, please check use case 3 in the
 following link.



https://docs.google.com/document/d/1ExDmT06vDZjzOPePYBqojMRfXodvsk0R8nRkX-zrkSw/edit#

Wow, really?  That honestly makes me very worried about the scope of
Congress being far too big (so early, and maybe period).

--
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-14 Thread Jay Pipes

On 10/13/2014 05:59 PM, Russell Bryant wrote:

Nice timing.  I was working on a blog post on this topic.

On 10/13/2014 05:40 PM, Fei Long Wang wrote:

I think Adam is talking about this bp:
https://blueprints.launchpad.net/nova/+spec/evacuate-instance-automatically

For now, we're using Nagios probe/event to trigger the Nova evacuate
command, but I think it's possible to do that in Nova if we can find a
good way to define the trigger policy.


I actually think that's the right way to do it.


+1. Not everything needs to be built-in to Nova. This very much sounds 
like something that should be handled by PaaS-layer things that can 
react to a Nagios notification (or any other event) and take some sort 
of action, possibly using administrative commands like nova evacuate.


 There are a couple of

other things to consider:

1) An ideal solution also includes fencing.  When you evacuate, you want
to make sure you've fenced the original compute node.  You need to make
absolutely sure that the same VM can't be running more than once,
especially when the disks are backed by shared storage.

Because of the fencing requirement, another option would be to use
Pacemaker to orchestrate this whole thing.  Historically Pacemaker
hasn't been suitable to scale to the number of compute nodes an
OpenStack deployment might have, but Pacemaker has a new feature called
pacemaker_remote [1] that may be suitable.

2) Looking forward, there is a lot of demand for doing this on a per
instance basis.  We should decide on a best practice for allowing end
users to indicate whether they would like their VMs automatically
rescued by the infrastructure, or just left down in the case of a
failure.  It could be as simple as a special tag set on an instance [2].


Please note that server instance tagging (thanks for the shout-out, BTW) 
is intended for only user-defined tags, not system-defined metadata 
which is what this sounds like...


Of course, one might implement some external polling/monitoring system 
using server instance tags, which might do a nova list --tag $TAG --host 
$FAILING_HOST, and initiate a migrate for each returned server instance...
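
An external handler along those lines might look roughly like the sketch
below. Note that the --tag filter only exists as a proposal (the server
instance tagging review referenced as [2] below), so treat it as
hypothetical; the host names and tag are placeholders, and evacuate is used
rather than migrate since the source host is down:

    FAILING_HOST=compute-03
    TARGET_HOST=compute-04
    for id in $(nova list --all-tenants --host "$FAILING_HOST" --tag auto-recover \
                | awk '/ACTIVE|SHUTOFF/ {print $2}'); do
        nova evacuate --on-shared-storage "$id" "$TARGET_HOST"
    done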


Best,
-jay


[1]
http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/
[2] https://review.openstack.org/#/c/127281/



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-14 Thread Tim Bell
 -Original Message-
 From: Jay Pipes [mailto:jaypi...@gmail.com]
 Sent: 14 October 2014 19:01
 To: openstack-dev@lists.openstack.org
 Subject: Re: [openstack-dev] [Nova] Automatic evacuate
 
 On 10/13/2014 05:59 PM, Russell Bryant wrote:
  Nice timing.  I was working on a blog post on this topic.
 
  On 10/13/2014 05:40 PM, Fei Long Wang wrote:
  I think Adam is talking about this bp:
  https://blueprints.launchpad.net/nova/+spec/evacuate-instance-automat
  ically
 
  For now, we're using Nagios probe/event to trigger the Nova evacuate
  command, but I think it's possible to do that in Nova if we can find
  a good way to define the trigger policy.
 
  I actually think that's the right way to do it.
 
 +1. Not everything needs to be built-in to Nova. This very much sounds
 like something that should be handled by PaaS-layer things that can react to a
 Nagios notification (or any other event) and take some sort of action, 
 possibly
 using administrative commands like nova evacuate.
 

Nova is also not the right place to do the generic solution as many other parts 
could be involved... neutron and cinder come to mind. Nova needs to provide the 
basic functions but it needs something outside to make it all happen 
transparently.

I would really like a shared solution rather than each deployment doing their 
own and facing identical problems. What is needed is a best-of-breed solution 
which can be incrementally improved as we find problems: getting the hypervisor 
down event, force-detaching boot volumes, restarting instances elsewhere and 
reconfiguring floating IPs without hitting race conditions.

Some standards for tagging are good, but we also need some code :-)

Tim

   There are a couple of
  other things to consider:
 
  1) An ideal solution also includes fencing.  When you evacuate, you
  want to make sure you've fenced the original compute node.  You need
  to make absolutely sure that the same VM can't be running more than
  once, especially when the disks are backed by shared storage.
 
  Because of the fencing requirement, another option would be to use
  Pacemaker to orchestrate this whole thing.  Historically Pacemaker
  hasn't been suitable to scale to the number of compute nodes an
  OpenStack deployment might have, but Pacemaker has a new feature
  called pacemaker_remote [1] that may be suitable.
 
  2) Looking forward, there is a lot of demand for doing this on a per
  instance basis.  We should decide on a best practice for allowing end
  users to indicate whether they would like their VMs automatically
  rescued by the infrastructure, or just left down in the case of a
  failure.  It could be as simple as a special tag set on an instance [2].
 
 Please note that server instance tagging (thanks for the shout-out, BTW) is
 intended for only user-defined tags, not system-defined metadata which is what
 this sounds like...
 
 Of course, one might implement some external polling/monitoring system using
 server instance tags, which might do a nova list --tag $TAG --host
 $FAILING_HOST, and initiate a migrate for each returned server instance...
 
 Best,
 -jay
 
  [1]
  http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_R
  emote/ [2] https://review.openstack.org/#/c/127281/
 
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-14 Thread Russell Bryant
On 10/14/2014 01:01 PM, Jay Pipes wrote:
 2) Looking forward, there is a lot of demand for doing this on a per
 instance basis.  We should decide on a best practice for allowing end
 users to indicate whether they would like their VMs automatically
 rescued by the infrastructure, or just left down in the case of a
 failure.  It could be as simple as a special tag set on an instance [2].
 
 Please note that server instance tagging (thanks for the shout-out, BTW)
 is intended for only user-defined tags, not system-defined metadata
 which is what this sounds like...

I was envisioning the tag being set by the end user to say "please keep
my VM running until I say otherwise", or something like "auto-recover"
for short.

So, it's specified by the end user, but potentially acted upon by the
system (as you say below).

 Of course, one might implement some external polling/monitoring system
 using server instance tags, which might do a nova list --tag $TAG --host
 $FAILING_HOST, and initiate a migrate for each returned server instance...

Yeah, that's what I was thinking.  Whatever system you use to react to a
failing host could use the tag as part of the criteria to figure out
which instances to evacuate and which to leave as dead.

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-14 Thread Mathieu Gagné

On 2014-10-14 2:49 PM, Tim Bell wrote:


Nova is also not the right place to do the generic solution as many other parts 
could be involved... neutron and cinder come to mind. Nova needs to provide the 
basic functions but it needs something outside to make it all happen 
transparently.

I would really like a shared solution rather than each deployment doing their 
own and facing identical problems. A best of breed solution which can be 
incrementally improved as we find problems to get the hypervisor down event, 
to force detach of boot volumes, restart elsewhere and reconfigure floating ips 
with race conditions is needed.

Some standards for tagging is good but we also need some code :-)



I agree with Tim. Nova does not have all the information required to 
make a proper decision, which could involve other OpenStack (and 
non-OpenStack) services. Furthermore, evacuating a node might require 
fencing, which Nova might not be able to do properly or have the proper 
tooling for. What about non-shared storage backends in Nova? You can't 
evacuate those without data loss.


--
Mathieu

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-14 Thread Adam Lawson

 Nova is also not the right place to do the generic solution as many other
 parts could be involved... neutron and cinder come to mind. Nova needs to
 provide the basic functions but it needs something outside to make it all
 happen transparently.
 I would really like a shared solution rather than each deployment doing
 their own and facing identical problems. A best of breed solution which can
 be incrementally improved as we find problems to get the hypervisor down
 event, to force detach of boot volumes, restart elsewhere and reconfigure
 floating ips with race conditions is needed.
 Some standards for tagging is good but we also need some code :-)


I think this would actually be a worthwhile cross-project effort but
getting it done would require some higher-level guidance to keep it on
track.

I also do not believe Nova *contains* all of the data needed to perform
auto-evac, but it has *access* to the data, right? Or could, anyway. I think
Cinder would definitely play a role, and Neutron for sure.

And as far as scope is concerned, I personally think something like this
should only support VMs with shared storage. Otherwise, phase 1 gets
overly complex and turns into something akin to VMware's DRS, which I DO
think could be another step, but the first step needs to be clean to ensure
it gets done.


*Adam Lawson*

AQORN, Inc.
427 North Tatnall Street
Ste. 58461
Wilmington, Delaware 19801-2230
Toll-free: (844) 4-AQORN-NOW ext. 101
International: +1 302-387-4660
Direct: +1 916-246-2072


On Tue, Oct 14, 2014 at 12:01 PM, Mathieu Gagné mga...@iweb.com wrote:

 On 2014-10-14 2:49 PM, Tim Bell wrote:


 Nova is also not the right place to do the generic solution as many other
 parts could be involved... neutron and cinder come to mind. Nova needs to
 provide the basic functions but it needs something outside to make it all
 happen transparently.

 I would really like a shared solution rather than each deployment doing
 their own and facing identical problems. A best of breed solution which can
 be incrementally improved as we find problems to get the hypervisor down
 event, to force detach of boot volumes, restart elsewhere and reconfigure
 floating ips with race conditions is needed.

 Some standards for tagging is good but we also need some code :-)


 I agree with Tim. Nova does not have all the required information to make
 a proper decision which could imply other OpenStack (and non-OpenStack)
 services. Furthermore, evacuating a node might imply fencing which Nova
 might not be able to do properly or have the proper tooling. What about
 non-shared storage backend in Nova? You can't evacuate those without data
 loss.

 --
 Mathieu


 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Nova] Automatic evacuate

2014-10-13 Thread Adam Lawson
[switching to openstack-dev]

Has anyone automated nova evacuate so that VMs on a failed compute host
using shared storage are automatically moved onto a new host, or is manually
entering *nova compute instance host* required in all cases?
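
(For reference, the manual form with the current client is roughly the
following; the instance ID and target host are placeholders:)

    nova evacuate --on-shared-storage <instance-id> <target-host>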

If it's manual only or requires custom Heat/Ceilometer templates, how hard
would it be to enable automatic evacuation within Nova?

i.e. (within /etc/nova/nova.conf)
auto_evac = true

Or is this possible now and I've simply not run across it?


*Adam Lawson*

AQORN, Inc.
427 North Tatnall Street
Ste. 58461
Wilmington, Delaware 19801-2230
Toll-free: (844) 4-AQORN-NOW ext. 101
International: +1 302-387-4660
Direct: +1 916-246-2072


On Sat, Sep 27, 2014 at 12:32 AM, Clint Byrum cl...@fewbar.com wrote:

 So, what you're looking for is basically the same old IT, but with an
 API. I get that. For me, the point of this cloud thing is so that server
 operators can make _reasonable_ guarantees, and application operators
 can make use of them in an automated fashion.

 If you start guaranteeing 4 and 5 nines for single VM's, you're right
 back in the boat of spending a lot on server infrastructure even if your
 users could live without it sometimes.

 Compute hosts are going to go down. Networks are going to partition. It
 is not actually expensive to deal with that at the application layer. In
 fact when you know your business rules, you'll do a better job at doing
 this efficiently than some blanket "replicate all the things" layer might.

 I know, some clouds are just new ways to chop up these fancy 40 core
 megaservers that everyone is shipping. I'm sure OpenStack can do it, but
 I'm saying, I don't think OpenStack _should_ do it.

 Excerpts from Adam Lawson's message of 2014-09-26 20:30:29 -0700:
  Generally speaking that's true when you have full control over how you
  deploy applications as a consumer. As a provider however, cloud
 resiliency
  is king and it's generally frowned upon to associate instances directly
 to
  the underlying physical hardware for any reason. It's good when instances
  can come and go as needed, but in a production context, a failed compute
  host shouldn't take down every instance hosted on it. Otherwise there is
 no
  real abstraction going on and the cloud loses immense value.
  On Sep 26, 2014 4:15 PM, Clint Byrum cl...@fewbar.com wrote:
 
   Excerpts from Adam Lawson's message of 2014-09-26 14:43:40 -0700:
Hello fellow stackers.
   
I'm looking for discussions/plans re VM continuity.
   
I.e. Protection for instances using ephemeral storage against host
   failures
or auto-failover capability for instances on hosts where the host
 suffers
from an attitude problem?
   
I know fail-overs are supported and I'm quite certain
 auto-fail-overs are
possible in the event of a host failure (hosting instances not using
   shared
storage). I just can't find where this has been addressed/discussed.
   
Someone help a brother out? ; )
  
   I'm sure some of that is possible, but it's a cloud, so why not do
 things
   the cloud way?
  
   Spin up redundant bits in disparate availability zones. Replicate only
   what must be replicated. Use volumes for DR only when replication would
   be too expensive.
  
   Instances are cattle, not pets. Keep them alive just long enough to
 make
   your profit.
  
   ___
   Mailing list:
   http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
   Post to : openst...@lists.openstack.org
   Unsubscribe :
   http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
  

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-13 Thread Adam Lawson
Looks like this was proposed for Nova and rejected for some reason last
year. Any thoughts on why, and is the reasoning (whatever it was) still
applicable?


*Adam Lawson*

AQORN, Inc.
427 North Tatnall Street
Ste. 58461
Wilmington, Delaware 19801-2230
Toll-free: (844) 4-AQORN-NOW ext. 101
International: +1 302-387-4660
Direct: +1 916-246-2072


On Mon, Oct 13, 2014 at 1:26 PM, Adam Lawson alaw...@aqorn.com wrote:

 [switching to openstack-dev]

 Has anyone automated nova evacuate so that VM's on a failed compute host
 using shared storage are automatically moved onto a new host or is manually
 entering *nova compute instance host* required in all cases?

 If it's manual only or require custom Heat/Ceilometer templates, how hard
 would it be to enable automatic evacuation within Novs?

 i.e. (within /etc/nova/nova.conf)
 auto_evac = true

 Or is this possible now and I've simply not run across it?


 *Adam Lawson*

 AQORN, Inc.
 427 North Tatnall Street
 Ste. 58461
 Wilmington, Delaware 19801-2230
 Toll-free: (844) 4-AQORN-NOW ext. 101
 International: +1 302-387-4660
 Direct: +1 916-246-2072


 On Sat, Sep 27, 2014 at 12:32 AM, Clint Byrum cl...@fewbar.com wrote:

 So, what you're looking for is basically the same old IT, but with an
 API. I get that. For me, the point of this cloud thing is so that server
 operators can make _reasonable_ guarantees, and application operators
 can make use of them in an automated fashion.

 If you start guaranteeing 4 and 5 nines for single VM's, you're right
 back in the boat of spending a lot on server infrastructure even if your
 users could live without it sometimes.

 Compute hosts are going to go down. Networks are going to partition. It
 is not actually expensive to deal with that at the application layer. In
 fact when you know your business rules, you'll do a better job at doing
 this efficiently than some blanket replicate all the things layer might.

 I know, some clouds are just new ways to chop up these fancy 40 core
 megaservers that everyone is shipping. I'm sure OpenStack can do it, but
 I'm saying, I don't think OpenStack _should_ do it.

 Excerpts from Adam Lawson's message of 2014-09-26 20:30:29 -0700:
  Generally speaking that's true when you have full control over how you
  deploy applications as a consumer. As a provider however, cloud
 resiliency
  is king and it's generally frowned upon to associate instances directly
 to
  the underlying physical hardware for any reason. It's good when
 instances
  can come and go as needed, but in a production context, a failed compute
  host shouldn't take down every instance hosted on it. Otherwise there
 is no
  real abstraction going on and the cloud loses immense value.
  On Sep 26, 2014 4:15 PM, Clint Byrum cl...@fewbar.com wrote:
 
   Excerpts from Adam Lawson's message of 2014-09-26 14:43:40 -0700:
Hello fellow stackers.
   
I'm looking for discussions/plans re VM continuity.
   
I.e. Protection for instances using ephemeral storage against host
   failures
or auto-failover capability for instances on hosts where the host
 suffers
from an attitude problem?
   
I know fail-overs are supported and I'm quite certain
 auto-fail-overs are
possible in the event of a host failure (hosting instances not using
   shared
storage). I just can't find where this has been addressed/discussed.
   
Someone help a brother out? ; )
  
   I'm sure some of that is possible, but it's a cloud, so why not do
 things
   the cloud way?
  
   Spin up redundant bits in disparate availability zones. Replicate only
   what must be replicated. Use volumes for DR only when replication
 would
   be too expensive.
  
   Instances are cattle, not pets. Keep them alive just long enough to
 make
   your profit.
  
   ___
   Mailing list:
   http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
   Post to : openst...@lists.openstack.org
   Unsubscribe :
   http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
  



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-13 Thread Joe Gordon
On Mon, Oct 13, 2014 at 1:32 PM, Adam Lawson alaw...@aqorn.com wrote:

 Looks like this was proposed and denied to be part of Nova for some reason
 last year. Thoughts on why and is the reasoning (whatever it was) still
 applicable?


Link?





 *Adam Lawson*

 AQORN, Inc.
 427 North Tatnall Street
 Ste. 58461
 Wilmington, Delaware 19801-2230
 Toll-free: (844) 4-AQORN-NOW ext. 101
 International: +1 302-387-4660
 Direct: +1 916-246-2072


 On Mon, Oct 13, 2014 at 1:26 PM, Adam Lawson alaw...@aqorn.com wrote:

 [switching to openstack-dev]

 Has anyone automated nova evacuate so that VM's on a failed compute host
 using shared storage are automatically moved onto a new host or is manually
 entering *nova compute instance host* required in all cases?

 If it's manual only or require custom Heat/Ceilometer templates, how hard
 would it be to enable automatic evacuation within Novs?

 i.e. (within /etc/nova/nova.conf)
 auto_evac = true

 Or is this possible now and I've simply not run across it?


 *Adam Lawson*

 AQORN, Inc.
 427 North Tatnall Street
 Ste. 58461
 Wilmington, Delaware 19801-2230
 Toll-free: (844) 4-AQORN-NOW ext. 101
 International: +1 302-387-4660
 Direct: +1 916-246-2072


 On Sat, Sep 27, 2014 at 12:32 AM, Clint Byrum cl...@fewbar.com wrote:

 So, what you're looking for is basically the same old IT, but with an
 API. I get that. For me, the point of this cloud thing is so that server
 operators can make _reasonable_ guarantees, and application operators
 can make use of them in an automated fashion.

 If you start guaranteeing 4 and 5 nines for single VM's, you're right
 back in the boat of spending a lot on server infrastructure even if your
 users could live without it sometimes.

 Compute hosts are going to go down. Networks are going to partition. It
 is not actually expensive to deal with that at the application layer. In
 fact when you know your business rules, you'll do a better job at doing
 this efficiently than some blanket replicate all the things layer
 might.

 I know, some clouds are just new ways to chop up these fancy 40 core
 megaservers that everyone is shipping. I'm sure OpenStack can do it, but
 I'm saying, I don't think OpenStack _should_ do it.

 Excerpts from Adam Lawson's message of 2014-09-26 20:30:29 -0700:
  Generally speaking that's true when you have full control over how you
  deploy applications as a consumer. As a provider however, cloud
 resiliency
  is king and it's generally frowned upon to associate instances
 directly to
  the underlying physical hardware for any reason. It's good when
 instances
  can come and go as needed, but in a production context, a failed
 compute
  host shouldn't take down every instance hosted on it. Otherwise there
 is no
  real abstraction going on and the cloud loses immense value.
  On Sep 26, 2014 4:15 PM, Clint Byrum cl...@fewbar.com wrote:
 
   Excerpts from Adam Lawson's message of 2014-09-26 14:43:40 -0700:
Hello fellow stackers.
   
I'm looking for discussions/plans re VM continuity.
   
I.e. Protection for instances using ephemeral storage against host
   failures
or auto-failover capability for instances on hosts where the host
 suffers
from an attitude problem?
   
I know fail-overs are supported and I'm quite certain
 auto-fail-overs are
possible in the event of a host failure (hosting instances not
 using
   shared
storage). I just can't find where this has been
 addressed/discussed.
   
Someone help a brother out? ; )
  
   I'm sure some of that is possible, but it's a cloud, so why not do
 things
   the cloud way?
  
   Spin up redundant bits in disparate availability zones. Replicate
 only
   what must be replicated. Use volumes for DR only when replication
 would
   be too expensive.
  
   Instances are cattle, not pets. Keep them alive just long enough to
 make
   your profit.
  
   ___
   Mailing list:
   http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
   Post to : openst...@lists.openstack.org
   Unsubscribe :
   http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
  




 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-13 Thread Fei Long Wang
I think Adam is talking about this bp:
https://blueprints.launchpad.net/nova/+spec/evacuate-instance-automatically

For now, we're using Nagios probe/event to trigger the Nova evacuate 
command, but I think it's possible to do that in Nova if we can find a
good way to define the trigger policy.
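
As a rough illustration of that wiring (the command name, script path,
thresholds and output parsing are all made up and would need adapting):

    # Nagios host event handler definition (sketch)
    define command {
        command_name  evacuate-failed-compute
        command_line  /usr/local/bin/evacuate-host.sh $HOSTNAME$ $HOSTSTATE$ $HOSTSTATETYPE$
    }

    #!/bin/sh
    # /usr/local/bin/evacuate-host.sh (sketch): $1=host, $2=state, $3=state type
    # assumes admin credentials are already sourced in the environment;
    # only act on a confirmed (HARD) DOWN state
    [ "$2" = "DOWN" ] && [ "$3" = "HARD" ] || exit 0
    TARGET=compute-spare-01   # placeholder; pick a healthy target host
    for id in $(nova list --all-tenants --host "$1" | awk '/ACTIVE|SHUTOFF/ {print $2}'); do
        nova evacuate --on-shared-storage "$id" "$TARGET"
    done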


On 14/10/14 10:15, Joe Gordon wrote:


 On Mon, Oct 13, 2014 at 1:32 PM, Adam Lawson alaw...@aqorn.com wrote:

 Looks like this was proposed and denied to be part of Nova for
 some reason last year. Thoughts on why and is the reasoning
 (whatever it was) still applicable?


 Link?

  


 *Adam Lawson*

 AQORN, Inc.
 427 North Tatnall Street
 Ste. 58461
 Wilmington, Delaware 19801-2230
 Toll-free: (844) 4-AQORN-NOW ext. 101
 International: +1 302-387-4660
 Direct: +1 916-246-2072


 On Mon, Oct 13, 2014 at 1:26 PM, Adam Lawson alaw...@aqorn.com wrote:

 [switching to openstack-dev]

 Has anyone automated nova evacuate so that VM's on a failed
 compute host using shared storage are automatically moved onto
 a new host or is manually entering /nova compute instance
 host/ required in all cases?

 If it's manual only or require custom Heat/Ceilometer
 templates, how hard would it be to enable automatic evacuation
 within Novs?

 i.e. (within /etc/nova/nova.conf)
 auto_evac = true

 Or is this possible now and I've simply not run across it?

 *Adam Lawson*

 AQORN, Inc.
 427 North Tatnall Street
 Ste. 58461
 Wilmington, Delaware 19801-2230
 Toll-free: (844) 4-AQORN-NOW ext. 101
 International: +1 302-387-4660
 Direct: +1 916-246-2072


 On Sat, Sep 27, 2014 at 12:32 AM, Clint Byrum cl...@fewbar.com wrote:

 So, what you're looking for is basically the same old IT,
 but with an
 API. I get that. For me, the point of this cloud thing is
 so that server
 operators can make _reasonable_ guarantees, and
 application operators
 can make use of them in an automated fashion.

 If you start guaranteeing 4 and 5 nines for single VM's,
 you're right
 back in the boat of spending a lot on server
 infrastructure even if your
 users could live without it sometimes.

 Compute hosts are going to go down. Networks are going to
 partition. It
 is not actually expensive to deal with that at the
 application layer. In
 fact when you know your business rules, you'll do a better
 job at doing
 this efficiently than some blanket replicate all the
 things layer might.

 I know, some clouds are just new ways to chop up these
 fancy 40 core
 megaservers that everyone is shipping. I'm sure OpenStack
 can do it, but
 I'm saying, I don't think OpenStack _should_ do it.

 Excerpts from Adam Lawson's message of 2014-09-26 20:30:29
 -0700:
  Generally speaking that's true when you have full
 control over how you
  deploy applications as a consumer. As a provider
 however, cloud resiliency
  is king and it's generally frowned upon to associate
 instances directly to
  the underlying physical hardware for any reason. It's
 good when instances
  can come and go as needed, but in a production context,
 a failed compute
  host shouldn't take down every instance hosted on it.
 Otherwise there is no
  real abstraction going on and the cloud loses immense value.
  On Sep 26, 2014 4:15 PM, Clint Byrum cl...@fewbar.com wrote:
 
   Excerpts from Adam Lawson's message of 2014-09-26
 14:43:40 -0700:
Hello fellow stackers.
   
I'm looking for discussions/plans re VM continuity.
   
I.e. Protection for instances using ephemeral
 storage against host
   failures
or auto-failover capability for instances on hosts
 where the host suffers
from an attitude problem?
   
I know fail-overs are supported and I'm quite
 certain auto-fail-overs are
possible in the event of a host failure (hosting
 instances not using
   shared
storage). I just can't find where this has been
   

Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-13 Thread Russell Bryant
Nice timing.  I was working on a blog post on this topic.

On 10/13/2014 05:40 PM, Fei Long Wang wrote:
 I think Adam is talking about this bp:
 https://blueprints.launchpad.net/nova/+spec/evacuate-instance-automatically
 
 For now, we're using Nagios probe/event to trigger the Nova evacuate 
 command, but I think it's possible to do that in Nova if we can find a
 good way to define the trigger policy.

I actually think that's the right way to do it.  There are a couple of
other things to consider:

1) An ideal solution also includes fencing.  When you evacuate, you want
to make sure you've fenced the original compute node.  You need to make
absolutely sure that the same VM can't be running more than once,
especially when the disks are backed by shared storage.

Because of the fencing requirement, another option would be to use
Pacemaker to orchestrate this whole thing.  Historically Pacemaker
hasn't been suitable to scale to the number of compute nodes an
OpenStack deployment might have, but Pacemaker has a new feature called
pacemaker_remote [1] that may be suitable.
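
To make that concrete, a very rough sketch with pcs; the host names,
addresses, credentials and the fence agent are all placeholders and depend
on the hardware:

    # fencing device that can power off compute-01
    pcs stonith create fence-compute-01 fence_ipmilan \
        pcmk_host_list="compute-01" ipaddr=10.0.0.101 login=admin passwd=secret
    # manage compute-01 as a pacemaker_remote (baremetal remote) node;
    # the resource name becomes the remote node's name in the cluster
    pcs resource create compute-01 ocf:pacemaker:remote \
        server=compute-01.example.com reconnect_interval=60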

2) Looking forward, there is a lot of demand for doing this on a per
instance basis.  We should decide on a best practice for allowing end
users to indicate whether they would like their VMs automatically
rescued by the infrastructure, or just left down in the case of a
failure.  It could be as simple as a special tag set on an instance [2].

[1]
http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/
[2] https://review.openstack.org/#/c/127281/

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-13 Thread Jay Lau
This is also a use case for Congress, please check use case 3 in the
following link.

https://docs.google.com/document/d/1ExDmT06vDZjzOPePYBqojMRfXodvsk0R8nRkX-zrkSw/edit#

2014-10-14 5:59 GMT+08:00 Russell Bryant rbry...@redhat.com:

 Nice timing.  I was working on a blog post on this topic.

 On 10/13/2014 05:40 PM, Fei Long Wang wrote:
  I think Adam is talking about this bp:
 
 https://blueprints.launchpad.net/nova/+spec/evacuate-instance-automatically
 
  For now, we're using Nagios probe/event to trigger the Nova evacuate
  command, but I think it's possible to do that in Nova if we can find a
  good way to define the trigger policy.

 I actually think that's the right way to do it.  There are a couple of
 other things to consider:

 1) An ideal solution also includes fencing.  When you evacuate, you want
 to make sure you've fenced the original compute node.  You need to make
 absolutely sure that the same VM can't be running more than once,
 especially when the disks are backed by shared storage.

 Because of the fencing requirement, another option would be to use
 Pacemaker to orchestrate this whole thing.  Historically Pacemaker
 hasn't been suitable to scale to the number of compute nodes an
 OpenStack deployment might have, but Pacemaker has a new feature called
 pacemaker_remote [1] that may be suitable.

 2) Looking forward, there is a lot of demand for doing this on a per
 instance basis.  We should decide on a best practice for allowing end
 users to indicate whether they would like their VMs automatically
 rescued by the infrastructure, or just left down in the case of a
 failure.  It could be as simple as a special tag set on an instance [2].

 [1]

 http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/
 [2] https://review.openstack.org/#/c/127281/

 --
 Russell Bryant

 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




-- 
Thanks,

Jay
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-13 Thread Russell Bryant
On 10/13/2014 06:18 PM, Jay Lau wrote:
 This is also a use case for Congress, please check use case 3 in the
 following link.
 
 https://docs.google.com/document/d/1ExDmT06vDZjzOPePYBqojMRfXodvsk0R8nRkX-zrkSw/edit#

Wow, really?  That honestly makes me very worried about the scope of
Congress being far too big (so early, and maybe period).

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-13 Thread Adam Lawson

  I think Adam is talking about this bp:
  https://blueprints.launchpad.net/nova/+spec/evacuate-instance-automatically


Correct - yes. Sorry about that. ; )

So it would seem the question is not whether to support auto-evac but how
it should be handled. If not handled by Nova, it gets complicated. Asking a
user to configure a custom Nagios trigger/action... not sure if we'd
recommend that as our definition of ideal.

   - I can foresee Congress being used to control whether auto-evac is
   required and what other policies come into play by virtue of an unplanned
   host removal from service. But that seems like a bit of overkill.
   - I can foresee Nova/scheduler being used to perform the evac itself.
   Are they still pushing back?
   - I can foresee Ceilometer being used to capture service state and
   define how long a node should be inaccessible before it's considered
   offline. But that seems a bit out of scope for what Ceilometer was meant to do.

I'm all for making a simple task like this super easy though, at least
so the settings are all defined in one place. Nova seems logical, but I'm
wondering if there is still resistance.

So curious; how are these higher-level discussions initiated/facilitated?
TC?



*Adam Lawson*

AQORN, Inc.
427 North Tatnall Street
Ste. 58461
Wilmington, Delaware 19801-2230
Toll-free: (844) 4-AQORN-NOW ext. 101
International: +1 302-387-4660
Direct: +1 916-246-2072


On Mon, Oct 13, 2014 at 3:21 PM, Russell Bryant rbry...@redhat.com wrote:

 On 10/13/2014 06:18 PM, Jay Lau wrote:
  This is also a use case for Congress, please check use case 3 in the
  following link.
 
 
 https://docs.google.com/document/d/1ExDmT06vDZjzOPePYBqojMRfXodvsk0R8nRkX-zrkSw/edit#

 Wow, really?  That honestly makes me very worried about the scope of
 Congress being far too big (so early, and maybe period).

 --
 Russell Bryant

 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev