Re: [openstack-dev] [nova] Distributed locking

2014-06-25 Thread John Garbutt
So just to keep the ML up with some of the discussion we had in IRC
the other day...

Most resources in Nova are owned by a particular nova-compute. So the
locks on the resources are effectively held by the nova-compute that
owns the resource.

We already effectively have a cross nova-compute lock holding in the
capacity reservations during migrate/resize.

But to cut a long story short: if the image cache case is really just a
copy from one of the nova-compute nodes that already has that image
into the local (shared) folder of another nova-compute, then we can
get away without a global lock, and instead have two local locks, one
at either end, plus some conducting to co-ordinate things.
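
To make that concrete, here is a minimal sketch of the idea, assuming
oslo-style lockutils for the node-local locks (the exact module path has
moved around over releases) and shared-datastore paths for the cache
directories; every function name and path below is illustrative, not
existing Nova code:

import os
import shutil

from oslo_concurrency import lockutils


def push_image_to_peer(local_cache_dir, peer_cache_dir, image_id):
    # Runs on the source compute. Its node-local lock stops this node's
    # cache manager from ageing the image out while the copy is in flight.
    with lockutils.lock('image-cache-%s' % image_id,
                        external=True, lock_path='/var/lock/nova'):
        src = os.path.join(local_cache_dir, image_id)
        dst = os.path.join(peer_cache_dir, image_id + '.part')
        shutil.copyfile(src, dst)
    return dst


def commit_incoming_image(peer_cache_dir, image_id):
    # Runs on the destination compute once the conductor reports the copy
    # finished. Its own node-local lock guards its view of the cache entry.
    with lockutils.lock('image-cache-%s' % image_id,
                        external=True, lock_path='/var/lock/nova'):
        part = os.path.join(peer_cache_dir, image_id + '.part')
        os.rename(part, os.path.join(peer_cache_dir, image_id))

The conductor's only job in this sketch is to call the first function on
the node that has the image and the second on the node that wants it, in
that order; neither call needs a global lock.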

It's not perfect, but it's an option.

Thanks,
John


On 17 June 2014 18:18, Clint Byrum cl...@fewbar.com wrote:
 Excerpts from Matthew Booth's message of 2014-06-17 01:36:11 -0700:
 On 17/06/14 00:28, Joshua Harlow wrote:
  So this is a reader/writer lock then?

  I have seen https://github.com/python-zk/kazoo/pull/141 come up in the
  kazoo (zookeeper python library) but there was a lack of a maintainer for
  that 'recipe'; perhaps if we really find this is needed we can help get
  that pull request 'sponsored' so that it can be used for this purpose?


  As far as resiliency, the thing I was thinking about was how correct you
  want this lock to be.

  If you go with, say, memcached and a locking mechanism built on it, this
  will not be correct, but it might work well enough under normal usage. So
  that's why I was wondering what level of correctness you want and what you
  want to happen if a server that is maintaining the lock record dies. In
  memcached's case this will literally be one server, even if sharding is
  being used, since a key hashes to one server. So if that one server goes
  down (or a network split happens) then it is possible for two entities to
  believe they own the same lock (and if the network split recovers, this
  gets even weirder); so that's what I was wondering about when mentioning
  resiliency and how much incorrectness you are willing to tolerate.

 From my POV, the most important things are:

 * 2 nodes must never believe they hold the same lock
 * A node must eventually get the lock


 If these are musts, then memcache is a no-go for locking. memcached is
 likely to delete anything it is storing in its RAM, at any time. Also
 if you have several memcache servers, a momentary network blip could
 lead to acquiring the lock erroneously.

 The only thing it is useful for is coalescing, where a broken lock just
 means wasted resources, spurious errors, etc. If consistency is needed,
 then you need a consistent backend.




Re: [openstack-dev] [nova] Distributed locking

2014-06-25 Thread Joshua Harlow
Could you expand on this and how it would work?

I'm pretty skeptical of new ad-hoc locking implementations, so I just want
to ensure it's fleshed out in detail.

What would the two local locks be, where would they be, and what would the
'conducting' be doing to coordinate?

-Original Message-
From: John Garbutt j...@johngarbutt.com
Reply-To: OpenStack Development Mailing List (not for usage questions)
openstack-dev@lists.openstack.org
Date: Wednesday, June 25, 2014 at 1:08 AM
To: OpenStack Development Mailing List (not for usage questions)
openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [nova] Distributed locking

So just to keep the ML up with some of the discussion we had in IRC
the other day...

Most resources in Nova are owned by a particular nova-compute. So the
locks on the resources are effectively held by the nova-compute that
owns the resource.

We already effectively have a cross nova-compute lock holding in the
capacity reservations during migrate/resize.

But to cut a long story short: if the image cache case is really just a
copy from one of the nova-compute nodes that already has that image
into the local (shared) folder of another nova-compute, then we can
get away without a global lock, and instead have two local locks, one
at either end, plus some conducting to co-ordinate things.

It's not perfect, but it's an option.

Thanks,
John


On 17 June 2014 18:18, Clint Byrum cl...@fewbar.com wrote:
 Excerpts from Matthew Booth's message of 2014-06-17 01:36:11 -0700:
 On 17/06/14 00:28, Joshua Harlow wrote:
  So this is a reader/writer lock then?

  I have seen https://github.com/python-zk/kazoo/pull/141 come up in the
  kazoo (zookeeper python library) but there was a lack of a maintainer for
  that 'recipe'; perhaps if we really find this is needed we can help get
  that pull request 'sponsored' so that it can be used for this purpose?


  As far as resiliency, the thing I was thinking about was how correct you
  want this lock to be.

  If you go with, say, memcached and a locking mechanism built on it, this
  will not be correct, but it might work well enough under normal usage. So
  that's why I was wondering what level of correctness you want and what you
  want to happen if a server that is maintaining the lock record dies. In
  memcached's case this will literally be one server, even if sharding is
  being used, since a key hashes to one server. So if that one server goes
  down (or a network split happens) then it is possible for two entities to
  believe they own the same lock (and if the network split recovers, this
  gets even weirder); so that's what I was wondering about when mentioning
  resiliency and how much incorrectness you are willing to tolerate.

 From my POV, the most important things are:

 * 2 nodes must never believe they hold the same lock
 * A node must eventually get the lock


 If these are musts, then memcache is a no-go for locking. memcached is
 likely to delete anything it is storing in its RAM, at any time. Also
 if you have several memcache servers, a momentary network blip could
 lead to acquiring the lock erroneously.

 The only thing it is useful for is coalescing, where a broken lock just
 means wasted resources, spurious errors, etc. If consistency is needed,
 then you need a consistent backend.





Re: [openstack-dev] [nova] Distributed locking

2014-06-17 Thread Matthew Booth
On 17/06/14 00:28, Joshua Harlow wrote:
 So this is a reader/writer lock then?

 I have seen https://github.com/python-zk/kazoo/pull/141 come up in the
 kazoo (zookeeper python library) but there was a lack of a maintainer for
 that 'recipe'; perhaps if we really find this is needed we can help get
 that pull request 'sponsored' so that it can be used for this purpose?


 As far as resiliency, the thing I was thinking about was how correct you
 want this lock to be.

 If you go with, say, memcached and a locking mechanism built on it, this
 will not be correct, but it might work well enough under normal usage. So
 that's why I was wondering what level of correctness you want and what you
 want to happen if a server that is maintaining the lock record dies. In
 memcached's case this will literally be one server, even if sharding is
 being used, since a key hashes to one server. So if that one server goes
 down (or a network split happens) then it is possible for two entities to
 believe they own the same lock (and if the network split recovers, this
 gets even weirder); so that's what I was wondering about when mentioning
 resiliency and how much incorrectness you are willing to tolerate.

From my POV, the most important things are:

* 2 nodes must never believe they hold the same lock
* A node must eventually get the lock

I was expecting to implement locking on all three backends as long as
they support it. I haven't looked closely at memcached, but if it can
detect a split it should be able to have a fencing race with the
possible lock holder before continuing. This is obviously undesirable,
as you will probably be fencing an otherwise correctly functioning node,
but it will be correct.

Matt

 
 -Original Message-
 From: Matthew Booth mbo...@redhat.com
 Organization: Red Hat
 Date: Friday, June 13, 2014 at 1:40 AM
 To: Joshua Harlow harlo...@yahoo-inc.com, OpenStack Development Mailing
 List (not for usage questions) openstack-dev@lists.openstack.org
 Subject: Re: [openstack-dev] [nova] Distributed locking
 
 On 12/06/14 21:38, Joshua Harlow wrote:
 So just a few thoughts before going too far down this path,

 Can we make sure we really, really understand the use-case where we think
 this is needed. I think it's fine that this use-case exists, but I just
 want to make it very clear to others why it's needed and why distributed
 locking is the only *correct* way.

 An example use of this would be side-loading an image from another
 node's image cache rather than fetching it from glance, which would have
 very significant performance benefits in the VMware driver, and possibly
 other places. The copier must take a read lock on the image to prevent
 the owner from ageing it during the copy. Holding a read lock would also
 assure the copier that the image it is copying is complete.

 This helps set a good precedent for others that may follow down this path
 that they also clearly explain the situation, how distributed locking
 fixes it and all the corner cases that now pop up with distributed locking.

 Some of the questions that I can think of at the current moment:

 * What happens when a node goes down that owns the lock, how does the
 software react to this?

 This can be well defined according to the behaviour of the backend. For
 example, it is well defined in zookeeper when a node's session expires.
 If the lock holder is no longer a valid node, it would be fenced before
 deleting its lock, allowing other nodes to continue.

 Without fencing it would not be possible to safely continue in this case.

 * What resources are being locked; what is the lock target, what is its
 lifetime?

 These are not questions for a locking implementation. A lock would be
 held on a name, and it would be up to the api user to ensure that the
 protected resource is only used while correctly locked, and that the
 lock is not held longer than necessary.

 * What resiliency do you want this lock to provide (this becomes a
 critical question when considering memcached, since memcached is not
 really the best choice for a resilient distributing locking backend)?

 What does resiliency mean in this context? We really just need the lock
 to be correct.

 * What do entities that try to acquire a lock do when they can't acquire
 it?

 Typically block, but if a use case emerged for trylock() it would be
 simple to implement. For example, in the image side-loading case we may
 decide that if it isn't possible to immediately acquire the lock it
 isn't worth waiting, and we just fetch it from glance anyway.

 A useful thing I wrote up a while ago, might still be useful:

 https://wiki.openstack.org/wiki/StructuredWorkflowLocks

 Feel free to move that wiki if you find it useful (it's sort of a high-level
 doc on the different strategies and such).

 Nice list of implementation pros/cons.

 Matt


 -Josh

 -Original Message-
 From: Matthew Booth mbo...@redhat.com
 Organization: Red Hat
 Reply-To: OpenStack Development

Re: [openstack-dev] [nova] Distributed locking

2014-06-17 Thread Doug Hellmann
On Tue, Jun 17, 2014 at 4:36 AM, Matthew Booth mbo...@redhat.com wrote:
 On 17/06/14 00:28, Joshua Harlow wrote:
 So this is a reader/writer lock then?

 I have seen https://github.com/python-zk/kazoo/pull/141 come up in the
 kazoo (zookeeper python library) but there was a lack of a maintainer for
 that 'recipe'; perhaps if we really find this is needed we can help get
 that pull request 'sponsored' so that it can be used for this purpose?


 As far as resiliency, the thing I was thinking about was how correct you
 want this lock to be.

 If you go with, say, memcached and a locking mechanism built on it, this
 will not be correct, but it might work well enough under normal usage. So
 that's why I was wondering what level of correctness you want and what you
 want to happen if a server that is maintaining the lock record dies. In
 memcached's case this will literally be one server, even if sharding is
 being used, since a key hashes to one server. So if that one server goes
 down (or a network split happens) then it is possible for two entities to
 believe they own the same lock (and if the network split recovers, this
 gets even weirder); so that's what I was wondering about when mentioning
 resiliency and how much incorrectness you are willing to tolerate.

 From my POV, the most important things are:

 * 2 nodes must never believe they hold the same lock
 * A node must eventually get the lock

 I was expecting to implement locking on all three backends as long as
 they support it. I haven't looked closely at memcached, but if it can
 detect a split it should be able to have a fencing race with the
 possible lock holder before continuing. This is obviously undesirable,
 as you will probably be fencing an otherwise correctly functioning node,
 but it will be correct.

There's a team working on a pluggable library for distributed
coordination: http://git.openstack.org/cgit/stackforge/tooz

Doug


 Matt


 -Original Message-
 From: Matthew Booth mbo...@redhat.com
 Organization: Red Hat
 Date: Friday, June 13, 2014 at 1:40 AM
 To: Joshua Harlow harlo...@yahoo-inc.com, OpenStack Development Mailing
 List (not for usage questions) openstack-dev@lists.openstack.org
 Subject: Re: [openstack-dev] [nova] Distributed locking

 On 12/06/14 21:38, Joshua Harlow wrote:
 So just a few thoughts before going too far down this path,

 Can we make sure we really, really understand the use-case where we think
 this is needed. I think it's fine that this use-case exists, but I just
 want to make it very clear to others why it's needed and why distributed
 locking is the only *correct* way.

 An example use of this would be side-loading an image from another
 node's image cache rather than fetching it from glance, which would have
 very significant performance benefits in the VMware driver, and possibly
 other places. The copier must take a read lock on the image to prevent
 the owner from ageing it during the copy. Holding a read lock would also
 assure the copier that the image it is copying is complete.

 This helps set a good precedent for others that may follow down this path
 that they also clearly explain the situation, how distributed locking
 fixes it and all the corner cases that now pop up with distributed locking.

 Some of the questions that I can think of at the current moment:

 * What happens when a node goes down that owns the lock, how does the
 software react to this?

 This can be well defined according to the behaviour of the backend. For
 example, it is well defined in zookeeper when a node's session expires.
 If the lock holder is no longer a valid node, it would be fenced before
 deleting its lock, allowing other nodes to continue.

 Without fencing it would not be possible to safely continue in this case.

 * What resources are being locked; what is the lock target, what is its
 lifetime?

 These are not questions for a locking implementation. A lock would be
 held on a name, and it would be up to the api user to ensure that the
 protected resource is only used while correctly locked, and that the
 lock is not held longer than necessary.

 * What resiliency do you want this lock to provide (this becomes a
 critical question when considering memcached, since memcached is not
 really the best choice for a resilient distributing locking backend)?

 What does resiliency mean in this context? We really just need the lock
 to be correct.

 * What do entities that try to acquire a lock do when they can't acquire
 it?

 Typically block, but if a use case emerged for trylock() it would be
 simple to implement. For example, in the image side-loading case we may
 decide that if it isn't possible to immediately acquire the lock it
 isn't worth waiting, and we just fetch it from glance anyway.

 A useful thing I wrote up a while ago, might still be useful:

 https://wiki.openstack.org/wiki/StructuredWorkflowLocks

 Feel free to move that wiki if you find it useful (it's sort of a high-level
 doc on the different

Re: [openstack-dev] [nova] Distributed locking

2014-06-17 Thread Clint Byrum
Excerpts from Matthew Booth's message of 2014-06-17 01:36:11 -0700:
 On 17/06/14 00:28, Joshua Harlow wrote:
  So this is a reader/writer lock then?

  I have seen https://github.com/python-zk/kazoo/pull/141 come up in the
  kazoo (zookeeper python library) but there was a lack of a maintainer for
  that 'recipe'; perhaps if we really find this is needed we can help get
  that pull request 'sponsored' so that it can be used for this purpose?


  As far as resiliency, the thing I was thinking about was how correct you
  want this lock to be.

  If you go with, say, memcached and a locking mechanism built on it, this
  will not be correct, but it might work well enough under normal usage. So
  that's why I was wondering what level of correctness you want and what you
  want to happen if a server that is maintaining the lock record dies. In
  memcached's case this will literally be one server, even if sharding is
  being used, since a key hashes to one server. So if that one server goes
  down (or a network split happens) then it is possible for two entities to
  believe they own the same lock (and if the network split recovers, this
  gets even weirder); so that's what I was wondering about when mentioning
  resiliency and how much incorrectness you are willing to tolerate.
 
 From my POV, the most important things are:
 
 * 2 nodes must never believe they hold the same lock
 * A node must eventually get the lock
 

If these are musts, then memcache is a no-go for locking. memcached is
likely to delete anything it is storing in its RAM, at any time. Also
if you have several memcache servers, a momentary network blip could
lead to acquiring the lock erroneously.

The only thing it is useful for is coalescing, where a broken lock just
means wasted resources, spurious errors, etc. If consistency is needed,
then you need a consistent backend.



Re: [openstack-dev] [nova] Distributed locking

2014-06-16 Thread Jay Pipes

On 06/13/2014 05:01 AM, Julien Danjou wrote:

On Thu, Jun 12 2014, Jay Pipes wrote:


This is news to me. When was this decided and where can I read about
it?


https://wiki.openstack.org/wiki/Oslo/blueprints/service-sync was originally
proposed, presented and accepted back at the Icehouse summit in HKG.
That's what led to tooz's creation and development since then.


Thanks, Julien, that's a helpful link. Appreciated!

Best,
-jay




Re: [openstack-dev] [nova] Distributed locking

2014-06-16 Thread Joshua Harlow
So this is a reader/writer lock then?

I have seen https://github.com/python-zk/kazoo/pull/141 come up in the
kazoo (zookeeper python library) but there was a lack of a maintainer for
that 'recipe'; perhaps if we really find this is needed we can help get
that pull request 'sponsored' so that it can be used for this purpose?


As far as resiliency, the thing I was thinking about was how correct you
want this lock to be.

If you go with, say, memcached and a locking mechanism built on it, this
will not be correct, but it might work well enough under normal usage. So
that's why I was wondering what level of correctness you want and what you
want to happen if a server that is maintaining the lock record dies. In
memcached's case this will literally be one server, even if sharding is
being used, since a key hashes to one server. So if that one server goes
down (or a network split happens) then it is possible for two entities to
believe they own the same lock (and if the network split recovers, this
gets even weirder); so that's what I was wondering about when mentioning
resiliency and how much incorrectness you are willing to tolerate.

-Original Message-
From: Matthew Booth mbo...@redhat.com
Organization: Red Hat
Date: Friday, June 13, 2014 at 1:40 AM
To: Joshua Harlow harlo...@yahoo-inc.com, OpenStack Development Mailing
List (not for usage questions) openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [nova] Distributed locking

On 12/06/14 21:38, Joshua Harlow wrote:
 So just a few thoughts before going too far down this path,

 Can we make sure we really, really understand the use-case where we think
 this is needed. I think it's fine that this use-case exists, but I just
 want to make it very clear to others why it's needed and why distributed
 locking is the only *correct* way.

An example use of this would be side-loading an image from another
node's image cache rather than fetching it from glance, which would have
very significant performance benefits in the VMware driver, and possibly
other places. The copier must take a read lock on the image to prevent
the owner from ageing it during the copy. Holding a read lock would also
assure the copier that the image it is copying is complete.

 This helps set a good precedent for others that may follow down this path
 that they also clearly explain the situation, how distributed locking
 fixes it and all the corner cases that now pop up with distributed locking.
 
 Some of the questions that I can think of at the current moment:
 
 * What happens when a node goes down that owns the lock, how does the
 software react to this?

This can be well defined according to the behaviour of the backend. For
example, it is well defined in zookeeper when a node's session expires.
If the lock holder is no longer a valid node, it would be fenced before
deleting its lock, allowing other nodes to continue.

Without fencing it would not be possible to safely continue in this case.

 * What resources are being locked; what is the lock target, what is its
 lifetime?

These are not questions for a locking implementation. A lock would be
held on a name, and it would be up to the api user to ensure that the
protected resource is only used while correctly locked, and that the
lock is not held longer than necessary.

 * What resiliency do you want this lock to provide (this becomes a
 critical question when considering memcached, since memcached is not
 really the best choice for a resilient distributing locking backend)?

What does resiliency mean in this context? We really just need the lock
to be correct.

 * What do entities that try to acquire a lock do when they can't acquire
 it?

Typically block, but if a use case emerged for trylock() it would be
simple to implement. For example, in the image side-loading case we may
decide that if it isn't possible to immediately acquire the lock it
isn't worth waiting, and we just fetch it from glance anyway.

 A useful thing I wrote up a while ago, might still be useful:
 
 https://wiki.openstack.org/wiki/StructuredWorkflowLocks
 
 Feel free to move that wiki if you find it useful (it's sort of a high-level
 doc on the different strategies and such).

Nice list of implementation pros/cons.

Matt

 
 -Josh
 
 -Original Message-
 From: Matthew Booth mbo...@redhat.com
 Organization: Red Hat
 Reply-To: OpenStack Development Mailing List (not for usage questions)
 openstack-dev@lists.openstack.org
 Date: Thursday, June 12, 2014 at 7:30 AM
 To: OpenStack Development Mailing List (not for usage questions)
 openstack-dev@lists.openstack.org
 Subject: [openstack-dev] [nova] Distributed locking
 
 We have a need for a distributed lock in the VMware driver, which I
 suspect isn't unique. Specifically it is possible for a VMware datastore
 to be accessed via multiple nova nodes if it is shared between
 clusters[1]. Unfortunately the vSphere API doesn't provide us with the
 primitives to implement robust locking using the storage

Re: [openstack-dev] [nova] Distributed locking

2014-06-15 Thread Clint Byrum
Excerpts from Matthew Booth's message of 2014-06-13 01:40:30 -0700:
 On 12/06/14 21:38, Joshua Harlow wrote:
  So just a few thoughts before going too far down this path,

  Can we make sure we really, really understand the use-case where we think
  this is needed. I think it's fine that this use-case exists, but I just
  want to make it very clear to others why it's needed and why distributed
  locking is the only *correct* way.
 
 An example use of this would be side-loading an image from another
 node's image cache rather than fetching it from glance, which would have
 very significant performance benefits in the VMware driver, and possibly
 other places. The copier must take a read lock on the image to prevent
 the owner from ageing it during the copy. Holding a read lock would also
 assure the copier that the image it is copying is complete.

Really? Usually in the unix-inspired world we just open a file and it
stays around until we close it.



Re: [openstack-dev] [nova] Distributed locking

2014-06-15 Thread Angus Lees
On Fri, 13 Jun 2014 09:40:30 AM Matthew Booth wrote:
 On 12/06/14 21:38, Joshua Harlow wrote:
  So just a few thoughts before going too far down this path,

  Can we make sure we really, really understand the use-case where we think
  this is needed. I think it's fine that this use-case exists, but I just
  want to make it very clear to others why it's needed and why distributed
  locking is the only *correct* way.
 
 An example use of this would be side-loading an image from another
 node's image cache rather than fetching it from glance, which would have
 very significant performance benefits in the VMware driver, and possibly
 other places. The copier must take a read lock on the image to prevent
 the owner from ageing it during the copy. Holding a read lock would also
 assure the copier that the image it is copying is complete.

For this particular example, taking a lock every time seems expensive.  An 
alternative would be to just try to read from another node, and if the result 
wasn't complete+valid for whatever reason then fall back to reading from 
glance.
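
A sketch of that optimistic approach, with no lock at all; the peer_cache
and glance objects passed in, and the checksum source, are assumptions for
illustration rather than existing Nova interfaces:

import hashlib


def fetch_image(image_id, expected_sha256, peer_cache, glance, dest_path):
    # Try the peer's cached copy first and verify it; any failure just
    # means we pay the normal cost of downloading from glance instead.
    try:
        data = peer_cache.read(image_id)
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            with open(dest_path, 'wb') as f:
                f.write(data)
            return 'peer-cache'
    except (IOError, OSError):
        pass  # peer copy missing, partial, or unreadable
    with open(dest_path, 'wb') as f:
        f.write(glance.download(image_id))
    return 'glance'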

  * What happens when a node goes down that owns the lock, how does the
  software react to this?
 
 This can be well defined according to the behaviour of the backend. For
 example, it is well defined in zookeeper when a node's session expires.
 If the lock holder is no longer a valid node, it would be fenced before
 deleting its lock, allowing other nodes to continue.
 
 Without fencing it would not be possible to safely continue in this case.

So I'm sorry for explaining myself poorly in my earlier post.  I think you've 
just described waiting for the lock to expire before another node can take it, 
which is just a regular lock behaviour.  What additional steps do you want 
Fence() to perform at this point?

(I can see if the resource provider had some form of fencing, then it could do 
all sorts of additional things - but I gather your original use case is 
exactly where that *isn't* an option)


A rule of "if the lock was allowed to go stale and not released cleanly,
then we should forcibly reboot the stale instance before allowing the lock
to be held again" shouldn't be too hard to add.

- Is this (just rebooting the instance) sufficient for similar situations,
or would we need configurable actions?
- Which bot do we trust to issue the reboot command?

From the locking service pov, I can think of several ways to implement this, 
so we probably want to export a high-level operation and allow the details to 
vary to suit the underlying locking implementation.

-- 
 - Gus



Re: [openstack-dev] [nova] Distributed locking

2014-06-14 Thread Robert Collins
On 13 June 2014 02:30, Matthew Booth mbo...@redhat.com wrote:
 We have a need for a distributed lock in the VMware driver, which I
 suspect isn't unique. Specifically it is possible for a VMware datastore
 to be accessed via multiple nova nodes if it is shared between
 clusters[1]. Unfortunately the vSphere API doesn't provide us with the
 primitives to implement robust locking using the storage layer itself,
 so we're looking elsewhere.

Perhaps I'm missing something, but I didn't see anything in your
description about actually needing a *distributed* lock, just needing
a lock that can be held by remote systems. As Devananda says, a
centralised lock that can be held by agents has been implemented in
Ironic - such a thing is very simple and quite easy to reason about...
but it's not suitable for all problems. HA and consistency requirements
for such a thing are delivered through e.g. galera in the DB layer.

-Rob


-- 
Robert Collins rbtcoll...@hp.com
Distinguished Technologist
HP Converged Cloud



Re: [openstack-dev] [nova] Distributed locking

2014-06-14 Thread Joshua Harlow
Are the details of that implementation described on a wiki or elsewhere?
(Partially for my own curiosity.) I think I understand how it works, but
write-ups usually clear that right up.

Sent from my really tiny device...

 On Jun 14, 2014, at 12:15 AM, Robert Collins robe...@robertcollins.net 
 wrote:
 
 On 13 June 2014 02:30, Matthew Booth mbo...@redhat.com wrote:
 We have a need for a distributed lock in the VMware driver, which I
 suspect isn't unique. Specifically it is possible for a VMware datastore
 to be accessed via multiple nova nodes if it is shared between
 clusters[1]. Unfortunately the vSphere API doesn't provide us with the
 primitives to implement robust locking using the storage layer itself,
 so we're looking elsewhere.
 
 Perhaps I'm missing something, but I didn't see anything in your
 description about actually needing a *distributed* lock, just needing
 a lock that can be held by remote systems. As Devananda says, a
 centralised lock that can be held by agents has been implemented in
 Ironic - such a thing is very simple and quite easy to reason about...
 but it's not suitable for all problems. HA and consistency requirements
 for such a thing are delivered through e.g. galera in the DB layer.
 
 -Rob
 
 
 -- 
 Robert Collins rbtcoll...@hp.com
 Distinguished Technologist
 HP Converged Cloud
 



Re: [openstack-dev] [nova] Distributed locking

2014-06-13 Thread Matthew Booth
On 12/06/14 21:38, Joshua Harlow wrote:
 So just a few thoughts before going too far down this path,

 Can we make sure we really, really understand the use-case where we think
 this is needed. I think it's fine that this use-case exists, but I just
 want to make it very clear to others why it's needed and why distributed
 locking is the only *correct* way.

An example use of this would be side-loading an image from another
node's image cache rather than fetching it from glance, which would have
very significant performance benefits in the VMware driver, and possibly
other places. The copier must take a read lock on the image to prevent
the owner from ageing it during the copy. Holding a read lock would also
assure the copier that the image it is copying is complete.
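
In reader/writer terms the intent is roughly the following; ReadWriteLock
here is a hypothetical interface (for example, what the kazoo pull request
discussed elsewhere in this thread would provide), not an existing kazoo or
tooz API:

class CachedImage(object):
    # Hypothetical wrapper around one image-cache entry and its shared
    # reader/writer lock, named after the image.
    def __init__(self, rw_lock, path):
        self.rw_lock = rw_lock
        self.path = path

    def copy_to(self, dest_path, copy_fn):
        # Copier: a shared (read) lock. Many copiers may run concurrently,
        # but the owner cannot age the image out underneath them, and being
        # able to take the read lock implies the image is complete.
        with self.rw_lock.read_lock():
            copy_fn(self.path, dest_path)

    def age_out(self, delete_fn):
        # Owner: an exclusive (write) lock, only granted once no copier
        # holds a read lock, so deletion never races an in-flight copy.
        with self.rw_lock.write_lock():
            delete_fn(self.path)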

 This helps set a good precedent for others that may follow down this path
 that they also clearly explain the situation, how distributed locking
 fixes it and all the corner cases that now pop up with distributed locking.
 
 Some of the questions that I can think of at the current moment:
 
 * What happens when a node goes down that owns the lock, how does the
 software react to this?

This can be well defined according to the behaviour of the backend. For
example, it is well defined in zookeeper when a node's session expires.
If the lock holder is no longer a valid node, it would be fenced before
deleting its lock, allowing other nodes to continue.

Without fencing it would not be possible to safely continue in this case.
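
For the zookeeper case specifically, kazoo surfaces session state through a
connection listener, and the lock node is ephemeral, so it disappears when
the holder's session expires. A sketch, where fence_peer() stands in for
whatever fencing driver action (e.g. killing the peer's vSphere session)
would run before trusting the freed lock:

from kazoo.client import KazooClient, KazooState


def fence_peer(holder_id):
    # Placeholder for a fencing driver: e.g. terminate the peer's vSphere
    # session or trigger STONITH before acting on the freed lock.
    print('would fence %s here' % holder_id)


zk = KazooClient(hosts='zk1:2181,zk2:2181,zk3:2181')


def on_state_change(state):
    if state == KazooState.LOST:
        # Our own session expired: assume every lock we held is gone and
        # stop touching the protected resources immediately.
        print('zookeeper session lost; abandoning protected work')


zk.add_listener(on_state_change)
zk.start()

lock = zk.Lock('/nova/image-locks/image-123', identifier='compute-1')
with lock:
    # Protected work goes here. A full implementation would call
    # fence_peer() for a previous holder whose session expired rather
    # than being released cleanly, before continuing.
    pass
zk.stop()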

 * What resources are being locked; what is the lock target, what is its
 lifetime?

These are not questions for a locking implementation. A lock would be
held on a name, and it would be up to the api user to ensure that the
protected resource is only used while correctly locked, and that the
lock is not held longer than necessary.

 * What resiliency do you want this lock to provide (this becomes a
 critical question when considering memcached, since memcached is not
 really the best choice for a resilient distributing locking backend)?

What does resiliency mean in this context? We really just need the lock
to be correct.

 * What do entities that try to acquire a lock do when they can't acquire
 it?

Typically block, but if a use case emerged for trylock() it would be
simple to implement. For example, in the image side-loading case we may
decide that if it isn't possible to immediately acquire the lock it
isn't worth waiting, and we just fetch it from glance anyway.
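
Continuing the kazoo sketch above (reusing its zk client), the trylock
variant might look like this; copy_from_peer() and fetch_from_glance() are
stand-ins for the two code paths, not real functions:

def copy_from_peer(image_id):      # stand-in for the cache side-load path
    pass


def fetch_from_glance(image_id):   # stand-in for the normal download path
    pass


lock = zk.Lock('/nova/image-locks/image-123', identifier='compute-1')
if lock.acquire(blocking=False):
    try:
        copy_from_peer('image-123')
    finally:
        lock.release()
else:
    # Not worth waiting: glance is always available, just slower.
    fetch_from_glance('image-123')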

 A useful thing I wrote up a while ago, might still be useful:
 
 https://wiki.openstack.org/wiki/StructuredWorkflowLocks
 
 Feel free to move that wiki if you find it useful (it's sort of a high-level
 doc on the different strategies and such).

Nice list of implementation pros/cons.

Matt

 
 -Josh
 
 -Original Message-
 From: Matthew Booth mbo...@redhat.com
 Organization: Red Hat
 Reply-To: OpenStack Development Mailing List (not for usage questions)
 openstack-dev@lists.openstack.org
 Date: Thursday, June 12, 2014 at 7:30 AM
 To: OpenStack Development Mailing List (not for usage questions)
 openstack-dev@lists.openstack.org
 Subject: [openstack-dev] [nova] Distributed locking
 
 We have a need for a distributed lock in the VMware driver, which I
 suspect isn't unique. Specifically it is possible for a VMware datastore
 to be accessed via multiple nova nodes if it is shared between
 clusters[1]. Unfortunately the vSphere API doesn't provide us with the
 primitives to implement robust locking using the storage layer itself,
 so we're looking elsewhere.

 The closest thing we seem to have in Nova is service groups, which
 currently have 3 implementations: DB, Zookeeper and Memcached. The
 service group api currently provides simple membership, but for locking
 we'd be looking for something more.

 I think the api we'd be looking for would be something along the lines of:

 Foo.lock(name, fence_info)
 Foo.unlock(name)

 Bar.fence(fence_info)

 Note that fencing would be required in this case. We believe we can
 fence by terminating the other Nova's vSphere session, but other options
 might include killing a Nova process, or STONITH. These would be
 implemented as fencing drivers.

 Although I haven't worked through the detail, I believe lock and unlock
 would be implementable in all 3 of the current service group drivers.
 Fencing would be implemented separately.
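
A rough sketch of the shape of those interfaces, with fencing split out
into its own driver as described; every class and method name here is
illustrative rather than existing Nova code, and the vSphere client call is
an assumption, not a real vmwareapi binding:

import abc


class DistributedLockDriver(abc.ABC):
    @abc.abstractmethod
    def lock(self, name, fence_info):
        """Block until the named lock is held. fence_info is opaque data
        that a fencing driver can later use to isolate this holder."""

    @abc.abstractmethod
    def unlock(self, name):
        """Release the named lock."""


class FencingDriver(abc.ABC):
    @abc.abstractmethod
    def fence(self, fence_info):
        """Isolate the node described by fence_info so it can no longer
        touch the protected resource."""


class VSphereSessionFencer(FencingDriver):
    # One of the options mentioned above: terminate the other nova-compute's
    # vSphere session identified in fence_info.
    def __init__(self, vsphere_api):
        self.vsphere_api = vsphere_api

    def fence(self, fence_info):
        self.vsphere_api.terminate_session(fence_info['session_id'])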

 My questions:

 * Does this already exist, or does anybody have patches pending to do
 something like this?
 * Are there other users for this?
 * Would service groups be an appropriate place, or a new distributed
 locking class?
 * How about if we just used zookeeper directly in the driver?

 Matt

 [1] Cluster ~= hypervisor
 -- 
 Matthew Booth
 Red Hat Engineering, Virtualisation Team

 Phone: +442070094448 (UK)
 GPG ID:  D33C3490
 GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490


Re: [openstack-dev] [nova] Distributed locking

2014-06-12 Thread Julien Danjou
On Thu, Jun 12 2014, Matthew Booth wrote:

 We have a need for a distributed lock in the VMware driver, which I
 suspect isn't unique. Specifically it is possible for a VMware datastore
 to be accessed via multiple nova nodes if it is shared between
 clusters[1]. Unfortunately the vSphere API doesn't provide us with the
 primitives to implement robust locking using the storage layer itself,
 so we're looking elsewhere.

The tooz library has been created for this purpose:

  https://pypi.python.org/pypi/tooz

  https://git.openstack.org/cgit/stackforge/tooz/

 Although I haven't worked through the detail, I believe lock and unlock
 would be implementable in all 3 of the current service group drivers.
 Fencing would be implemented separately.

The plan is to leverage tooz to replace the Nova service group drivers,
as this is also usable in a lot of other OpenStack services.
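
For reference, taking a distributed lock through tooz ends up looking
roughly like the sketch below; the locking API was still in review at the
time of this thread (see the gerrit link mentioned later), and the zake://
in-memory backend is only for demonstration, with real deployments pointing
at zookeeper, memcached, etc. and accepting the corresponding correctness
trade-offs discussed elsewhere in this thread:

from tooz import coordination

coordinator = coordination.get_coordinator('zake://', b'compute-1')
coordinator.start()

lock = coordinator.get_lock(b'image-123')
with lock:
    # Protected section, e.g. copying the cached image.
    pass

coordinator.stop()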

-- 
Julien Danjou
;; Free Software hacker
;; http://julien.danjou.info




Re: [openstack-dev] [nova] Distributed locking

2014-06-12 Thread Matthew Booth

On 12/06/14 15:35, Julien Danjou wrote:
 On Thu, Jun 12 2014, Matthew Booth wrote:
 
 We have a need for a distributed lock in the VMware driver, which
 I suspect isn't unique. Specifically it is possible for a VMware
 datastore to be accessed via multiple nova nodes if it is shared
 between clusters[1]. Unfortunately the vSphere API doesn't
 provide us with the primitives to implement robust locking using
 the storage layer itself, so we're looking elsewhere.
 
 The tooz library has been created for this purpose:
 
 https://pypi.python.org/pypi/tooz
 
 https://git.openstack.org/cgit/stackforge/tooz/
 
 Although I haven't worked through the detail, I believe lock and
 unlock would be implementable in all 3 of the current service
 group drivers. Fencing would be implemented separately.
 
 The plan is to leverage tooz to replace the Nova service group
 drivers, as this is also usable in a lot of other OpenStack
 services.

This looks interesting. It doesn't have hooks for fencing, though.

What's the status of tooz? Would you be interested in adding fencing
hooks?

Matt
-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490



Re: [openstack-dev] [nova] Distributed locking

2014-06-12 Thread Julien Danjou
On Thu, Jun 12 2014, Matthew Booth wrote:

 This looks interesting. It doesn't have hooks for fencing, though.

 What's the status of tooz? Would you be interested in adding fencing
 hooks?

It's maintained and developed; we plan to use it in Ceilometer and
other projects. Joshua also wants to use it for TaskFlow.

We are blocked for now by https://review.openstack.org/#/c/93443/ and,
obviously, by the lack of resources to complete that request, so help is
appreciated. :)

As for fencing hooks, it sounds like a good idea.

-- 
Julien Danjou
/* Free Software hacker
   http://julien.danjou.info */




Re: [openstack-dev] [nova] Distributed locking

2014-06-12 Thread Devananda van der Veen
Ironic has a simple lock mechanism for nodes to ensure that, if the
hash ring rebalances while an operation is in progress, the second
conductor doesn't trample on the work that the first conductor is
doing until it's finished (and releases the lock).

Right now, it's got a simple DB backing. We've discussed making it
more pluggable. I'd be all for there being a common openstack way to
do this, preferably in oslo.
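
The DB-backed pattern described here can be sketched as an atomic
compare-and-swap UPDATE on the node row; this shows the general technique
only, with illustrative table and column names rather than Ironic's actual
schema or code:

import sqlalchemy as sa

metadata = sa.MetaData()
nodes = sa.Table(
    'nodes', metadata,
    sa.Column('uuid', sa.String(36), primary_key=True),
    sa.Column('reservation', sa.String(255), nullable=True),  # holder's host
)


def acquire_node_lock(conn, node_uuid, conductor_host):
    # Succeeds only if no one currently holds the reservation; the single
    # UPDATE is atomic, so two conductors cannot both win.
    result = conn.execute(
        nodes.update()
        .where(sa.and_(nodes.c.uuid == node_uuid,
                       nodes.c.reservation.is_(None)))
        .values(reservation=conductor_host))
    return result.rowcount == 1


def release_node_lock(conn, node_uuid, conductor_host):
    # Only the current holder may clear the reservation.
    conn.execute(
        nodes.update()
        .where(sa.and_(nodes.c.uuid == node_uuid,
                       nodes.c.reservation == conductor_host))
        .values(reservation=None))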

/me adds tooz to my list of things to read

Cheers,
-Deva

On Thu, Jun 12, 2014 at 9:46 AM, Jay Pipes jaypi...@gmail.com wrote:
 On 06/12/2014 10:35 AM, Julien Danjou wrote:

 On Thu, Jun 12 2014, Matthew Booth wrote:

 We have a need for a distributed lock in the VMware driver, which I
 suspect isn't unique. Specifically it is possible for a VMware datastore
 to be accessed via multiple nova nodes if it is shared between
 clusters[1]. Unfortunately the vSphere API doesn't provide us with the
 primitives to implement robust locking using the storage layer itself,
 so we're looking elsewhere.


 The tooz library has been created for this purpose:

https://pypi.python.org/pypi/tooz

https://git.openstack.org/cgit/stackforge/tooz/

 Although I haven't worked through the detail, I believe lock and unlock
 would be implementable in all 3 of the current service group drivers.
 Fencing would be implemented separately.


 The plan is to leverage tooz to replace the Nova service group drivers,
 as this is also usable in a lot of other OpenStack services.


 This is news to me. When was this decided and where can I read about it?

 Thanks,
 -jay






Re: [openstack-dev] [nova] Distributed locking

2014-06-12 Thread Angus Lees
On Thu, 12 Jun 2014 05:06:38 PM Julien Danjou wrote:
 On Thu, Jun 12 2014, Matthew Booth wrote:
  This looks interesting. It doesn't have hooks for fencing, though.
  
  What's the status of tooz? Would you be interested in adding fencing
  hooks?
 
 It's maintained and developed; we plan to use it in Ceilometer and
 other projects. Joshua also wants to use it for TaskFlow.

 We are blocked for now by https://review.openstack.org/#/c/93443/ and,
 obviously, by the lack of resources to complete that request, so help is
 appreciated. :)
 
 As for fencing hooks, it sounds like a good idea.

As far as I understand these things, in distributed-locking-speak fencing 
just means breaking someone else's lock.

I think your options here are (and apologies if I'm repeating things that are 
obvious):

1. Have a force unlock protocol (numerous alternatives exist).  Assume the 
lock holder implements it properly and stops accessing the shared resource 
when asked.

2. Kill the lock holder using some method unrelated to the locking service
and wait for the locking protocol to realise the ex-holder is dead through
the usual liveness tests.  Assume that not being able to hold the lock
implies no longer being able to access the shared resource.
The liveness test usually involves the holder pinging the lock service 
periodically, and everyone has to wait for some agreed timeout before assuming 
a client is dead.

(1) involves a lot of trust - and seems particularly bad if the reason you are 
breaking the lock is because the holder is misbehaving.
Assuming (2) is the only reasonable choice, I don't think the lock service 
needs explicit support for fencing, since the exact method for killing the 
holder is unrelated, and relatively uninteresting (probably always going to be 
an instance delete in OpenStack).


Perhaps more interesting is exactly what conditions you require before 
attempting to kill the lock holder - you wouldn't want just any job deciding 
it was warranted, or else a misbehaving client would cause mayhem.  Again, I 
suggest your options here are:

1. Require human judgement.
ie: provide monitoring for whatever is misbehaving and make it obvious that 
one mitigation is to nuke the apparent holder.

2. Require the lock breaker to be able to reach a majority of nodes as some
proof of "I'm working, so my opinion must be right".
In a Paxos system, reaching a majority of nodes basically becomes "hold a
lock", so we end back up with "my liveness test is better than yours"
somehow, and I'm not sure how to resolve that without human judgement (but
I'm not familiar with existing approaches).  Again, I don't think this needs
additional support from the lock service, beyond a liveness test (which
zookeeper, for example, has).

tl;dr: I'm interested in what sort of automated fencing behaviour you'd like.

-- 
 - Gus
