Re: [openstack-dev] [Nova] Automatic evacuate

Géza Gémes Tue, 21 Oct 2014 11:33:13 -0700

On 10/21/2014 07:53 PM, David Vossel wrote:


----- Original Message -----

-----Original Message-----
From: Russell Bryant [mailto:[email protected]]
Sent: October 21, 2014 15:07
To: [email protected]
Subject: Re: [openstack-dev] [Nova] Automatic evacuate

On 10/21/2014 06:44 AM, Balázs Gibizer wrote:

Hi,

Sorry for the top posting but it was hard to fit my complete view inline.

I'm also thinking about a possible solution for automatic server
evacuation. I see two separate sub-problems here:
1) compute node monitoring and fencing, 2) automatic server evacuation
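
As a rough illustration of how the two sub-problems fit together, here is a minimal Python sketch. It is not Nova or Pacemaker code: handle_failed_node and the fence, list_servers and evacuate callbacks are hypothetical names invented for this example. The only point it makes is the ordering, i.e. evacuation is attempted only after the failed node has been confirmed fenced.

```python
from typing import Callable, Iterable, List


def handle_failed_node(
    node: str,
    fence: Callable[[str], bool],
    list_servers: Callable[[str], Iterable[str]],
    evacuate: Callable[[str], None],
) -> List[str]:
    """Evacuate the servers of `node`, but only if fencing succeeded.

    All four parameters are placeholders (assumptions for this sketch),
    standing in for whatever monitoring/fencing and evacuation
    mechanisms are actually chosen.
    """
    if not fence(node):
        # Without a confirmed fence the node may still be running its
        # servers, so nothing is evacuated.
        return []

    evacuated = []
    for server in list_servers(node):
        evacuate(server)
        evacuated.append(server)
    return evacuated
```

Keeping fencing and evacuation behind separate callbacks mirrors the split above: either half can be swapped out without touching the other.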

Compute node monitoring is currently implemented in the servicegroup
module of nova. As far as I understand, pacemaker is the solution
proposed in this thread for both monitoring and fencing, but we
tried it and found that pacemaker_remote on baremetal does not work
together with fencing (yet), see [1]. So if we need fencing, then
either we have to use normal pacemaker instead of pacemaker_remote,
which doesn't scale, or we configure and call stonith directly
when pacemaker detects the compute node failure.

I didn't get the same conclusion from the link you reference.  It says:

"That is not to say however that fencing of a baremetal node works any
differently than that of a normal cluster-node. The Pacemaker policy engine
understands how to fence baremetal remote-nodes. As long as a fencing
device exists, the cluster is capable of ensuring baremetal nodes are
fenced in the exact same way as normal cluster-nodes are fenced."

So, it sounds to me like the core pacemaker cluster can fence the node.
I CC'd David Vossel, a pacemaker developer, to see if he can help clarify.

It seems there is a contradiction between chapters 1.5 and 7.2 in [1], as 7.2
states:
"There are some complications involved with understanding a bare-metal
node's state that virtual nodes don't have. Once this logic is complete,
pacemaker will be able to integrate bare-metal nodes in the same way virtual
remote-nodes currently are. Some special considerations for fencing will
need to be addressed."
Let's wait for David's statement on this.

Hey, That's me!

I can definitely clear all this up.

First off, this document is out of sync with the current state upstream. We're
already past Pacemaker v1.1.12 upstream. Section 7.2 of the document being
referenced is still talking about future v1.1.11 features.

I'll make it simple. If the document references anything that needs to be done
in the future, it's already done.  Pacemaker remote is feature complete at this
point. I've accomplished everything I originally set out to do. I see one change
though. In 7.1 I talk about wanting pacemaker to be able to manage resources in
containers. I mention something about libvirt sandbox. I scrapped whatever I was
doing there. Pacemaker now has docker support.
https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/docker

I've known this document is out of date. It's on my giant list of things to do.
Sorry for any confusion.

As far as pacemaker remote and fencing go, remote-nodes are fenced the exact
same way as cluster-nodes. The only consideration that needs to be made is that
the cluster-nodes (nodes running the full pacemaker+corosync stack) are the only
nodes allowed to initiate fencing. All you have to do is make sure the fencing
devices you want to use to fence remote-nodes are accessible to the
cluster-nodes. From there you are good to go.
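
The rule above can be written down as a small toy model. This is a Python sketch for illustration only: it does not use any Pacemaker API, and Node, FenceDevice and can_fence are hypothetical names made up for this example. It only encodes the two conditions stated above: the initiator must be a cluster-node, and the fencing device must be accessible to it.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Node:
    name: str
    kind: str  # "cluster" (full pacemaker+corosync stack) or "remote"


@dataclass(frozen=True)
class FenceDevice:
    name: str
    targets: FrozenSet[str]         # node names this device can fence
    reachable_from: FrozenSet[str]  # node names that can access the device


def can_fence(initiator: Node, target: Node, device: FenceDevice) -> bool:
    """Toy check (an assumption-laden model, not Pacemaker logic)."""
    return (
        initiator.kind == "cluster"                  # only cluster-nodes initiate
        and target.name in device.targets            # device covers the target
        and initiator.name in device.reachable_from  # device is accessible
    )
```

For example, with one cluster-node and two remote compute nodes (all names invented):

```python
ctl = Node("ctl1", "cluster")
compute1 = Node("compute1", "remote")
compute2 = Node("compute2", "remote")
ipmi = FenceDevice("ipmi-compute1", frozenset({"compute1"}), frozenset({"ctl1"}))

can_fence(ctl, compute1, ipmi)       # True: cluster-node with access to the device
can_fence(compute2, compute1, ipmi)  # False: remote-nodes never initiate fencing
```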

Let me know if there's anything else I can clear up. Pacemaker remote was
designed to be the solution for the exact scenario you all are discussing here.
Compute nodes and pacemaker remote are made for one another :D

If anyone is interested in prototyping pacemaker remote for this compute node
use case, make sure to include me. I have done quite a bit of research into how
to maximize pacemaker's ability to scale horizontally. As part of that research
I've made a few changes that are directly related to all of this that are not
yet in an official pacemaker release. Come to me for the latest rpms and you'll
have a less painful experience setting all this up :)

-- Vossel

Hi Vossel,

Could you please send us a link to the source RPMs? We have tested on CentOS 7, so they might need a recompile.


Thank you!

Geza

Cheers,
Gibi

--
Russell Bryant

_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
