Just a few thoughts before going too far down this path. Can we make sure we really understand the use case where we think this is needed? I think it's fine that this use case exists, but I want to make it very clear to others why it's needed and why distributed locking is the only *correct* way.
This helps set a good precedent for others who may follow down this path: they should also clearly explain the situation, how distributed locking fixes it, and all the corner cases that now pop up with distributed locking.

Some of the questions that I can think of at the moment:

* What happens when a node that owns the lock goes down? How does the software react to this?
* What resources are being locked? What is the lock target, and what is its lifetime?
* What resiliency do you want this lock to provide? (This becomes a critical question when considering memcached, since memcached is not really the best choice for a resilient distributed locking backend.)
* What do entities that try to acquire a lock do when they can't acquire it?

Something I wrote up a while ago that might still be useful:

https://wiki.openstack.org/wiki/StructuredWorkflowLocks

Feel free to move that wiki page if you find it useful (it's sort of a high-level doc on the different strategies and such).

-Josh

-----Original Message-----
From: Matthew Booth <mbo...@redhat.com>
Organization: Red Hat
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev@lists.openstack.org>
Date: Thursday, June 12, 2014 at 7:30 AM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev@lists.openstack.org>
Subject: [openstack-dev] [nova] Distributed locking

>We have a need for a distributed lock in the VMware driver, which I
>suspect isn't unique. Specifically it is possible for a VMware datastore
>to be accessed via multiple nova nodes if it is shared between
>clusters[1]. Unfortunately the vSphere API doesn't provide us with the
>primitives to implement robust locking using the storage layer itself,
>so we're looking elsewhere.
>
>The closest we seem to have in Nova currently are service groups, which
>currently have 3 implementations: DB, Zookeeper and Memcached.
>The service group api currently provides simple membership, but for locking
>we'd be looking for something more.
>
>I think the api we'd be looking for would be something along the lines of:
>
>Foo.lock(name, fence_info)
>Foo.unlock(name)
>
>Bar.fence(fence_info)
>
>Note that fencing would be required in this case. We believe we can
>fence by terminating the other Nova's vSphere session, but other options
>might include killing a Nova process, or STONITH. These would be
>implemented as fencing drivers.
>
>Although I haven't worked through the detail, I believe lock and unlock
>would be implementable in all 3 of the current service group drivers.
>Fencing would be implemented separately.
>
>My questions:
>
>* Does this already exist, or does anybody have patches pending to do
>something like this?
>* Are there other users for this?
>* Would service groups be an appropriate place, or a new distributed
>locking class?
>* How about if we just used zookeeper directly in the driver?
>
>Matt
>
>[1] Cluster ~= hypervisor
>--
>Matthew Booth
>Red Hat Engineering, Virtualisation Team
>
>Phone: +442070094448 (UK)
>GPG ID: D33C3490
>GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
>
>_______________________________________________
>OpenStack-dev mailing list
>OpenStack-dev@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
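
For anyone trying to picture the lock/unlock/fence shape proposed in the quoted message, here is a rough single-process sketch in Python. It is only an illustration under stated assumptions, not anything that exists in Nova: the class names (`InMemoryLockManager`, `RecordingFencer`), the extra `owner` argument, the lease timeout, and the injectable clock are all hypothetical, and a plain dict stands in for a real backend such as the DB, ZooKeeper, or memcached. It tries to make two of the questions above concrete: what happens when the lock owner goes away (here, the lease expires and the old owner is fenced before takeover), and what a caller sees when it cannot acquire the lock (here, `False`).

```python
import time


class RecordingFencer:
    """Hypothetical fencing driver. A real one might terminate a vSphere session."""

    def __init__(self):
        self.fenced = []

    def fence(self, fence_info):
        # Only records the request; nothing is actually fenced in this sketch.
        self.fenced.append(fence_info)


class InMemoryLockManager:
    """Single-process stand-in for a distributed lock backend (illustrative only)."""

    def __init__(self, fencer, lease_seconds=30.0, clock=time.monotonic):
        self._fencer = fencer
        self._lease = lease_seconds
        self._clock = clock
        self._locks = {}  # name -> (owner, fence_info, expiry)

    def lock(self, name, owner, fence_info):
        """Return True if owner now holds the lock, False if another live owner does."""
        now = self._clock()
        held = self._locks.get(name)
        if held is not None:
            held_owner, held_fence_info, expiry = held
            if held_owner != owner and now < expiry:
                return False  # caller decides whether to retry, wait, or give up
            if held_owner != owner:
                # Lease expired: fence the previous owner before taking over.
                self._fencer.fence(held_fence_info)
        # New acquisition, takeover, or lease renewal by the same owner.
        self._locks[name] = (owner, fence_info, now + self._lease)
        return True

    def unlock(self, name, owner):
        """Release the lock only if owner still holds it."""
        held = self._locks.get(name)
        if held is not None and held[0] == owner:
            del self._locks[name]
            return True
        return False


# Usage with a fake clock so that lease expiry is deterministic.
now = [0.0]
fencer = RecordingFencer()
locks = InMemoryLockManager(fencer, lease_seconds=10.0, clock=lambda: now[0])

got_a = locks.lock("datastore-1", "nova-a", {"session": "a"})       # acquired
got_b_early = locks.lock("datastore-1", "nova-b", {"session": "b"})  # refused
now[0] = 11.0                                                        # nova-a's lease lapses
got_b_late = locks.lock("datastore-1", "nova-b", {"session": "b"})   # fences nova-a, acquires
```

Returning `False` instead of blocking is just one possible answer to the "what do entities do when they can't acquire it" question; a real design would also have to deal with clock skew, backend failure, and a fence operation that itself fails, none of which this sketch attempts.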