Hi, * Thomas Thrainer <[email protected]> [2013-11-29 10:40:59 +0100]:
> LGTM, thanks! > But as I was in involved quite a bit in the bug / discussion, maybe it > might make sense that somebody else takes a look too... > I think this patch does not actually fixes the issue 621 (explained in #3). The reported corner case was when LUInstanceRemove() and LUNetworkDisconnect() were running concurently. A workaround that we have tried is upon LUNetworkDisconnect() and LUNetworkConnect() try to acquire all cluster's instances in shared mode. By that _LS_ACQUIRE_ALL mode is set and not _LS_ACQUIRE_EXACT and thus the deleted lock does cause any problem. Still I don't think this is the proper way to go. Maybe exporting the locking mode up to the LU level just like oportunistic/share locking option (as chris suggested during an internal discussion) is a rational decision. What do you think? Besides that thanks for the patch, dimara > Thomas > > > On Thu, Nov 28, 2013 at 3:48 PM, Petr Pudlak <[email protected]> wrote: > > > This is required to prevent race conditions such as removing a network > > from a group and adding an instance at the same time. (See issue 621#2.) > > > > Signed-off-by: Petr Pudlak <[email protected]> > > --- > > lib/cmdlib/instance.py | 34 ++++++++++++++++++++++++++++++++++ > > 1 file changed, 34 insertions(+) > > > > diff --git a/lib/cmdlib/instance.py b/lib/cmdlib/instance.py > > index 12340e8..5d58544 100644 > > --- a/lib/cmdlib/instance.py > > +++ b/lib/cmdlib/instance.py > > @@ -559,6 +559,22 @@ class LUInstanceCreate(LogicalUnit): > > self.needed_locks[locking.LEVEL_NODE_RES] = \ > > CopyLockList(self.needed_locks[locking.LEVEL_NODE]) > > > > + # Optimistically acquire shared group locks (we're reading the > > + # configuration). We can't just call GetInstanceNodeGroups, because > > the > > + # instance doesn't exist yet. Therefore we lock all node groups of all > > + # nodes we have. > > + if self.needed_locks[locking.LEVEL_NODE] == locking.ALL_SET: > > + # In the case we lock all nodes for opportunistic allocation, we > > have no > > + # choice than to lock all groups, because they're allocated before > > nodes. > > + # This is sad, but true. At least we release all those we don't > > need in > > + # CheckPrereq later. > > + self.needed_locks[locking.LEVEL_NODEGROUP] = locking.ALL_SET > > + else: > > + self.needed_locks[locking.LEVEL_NODEGROUP] = \ > > + list(self.cfg.GetNodeGroupsFromNodes( > > + self.needed_locks[locking.LEVEL_NODE])) > > + self.share_locks[locking.LEVEL_NODEGROUP] = 1 > > + > > def DeclareLocks(self, level): > > if level == locking.LEVEL_NODE_RES and \ > > self.opportunistic_locks[locking.LEVEL_NODE]: > > @@ -846,6 +862,21 @@ class LUInstanceCreate(LogicalUnit): > > """Check prerequisites. > > > > """ > > + # Check that the optimistically acquired groups are correct wrt the > > + # acquired nodes > > + owned_groups = frozenset(self.owned_locks(locking.LEVEL_NODEGROUP)) > > + owned_nodes = frozenset(self.owned_locks(locking.LEVEL_NODE)) > > + cur_groups = list(self.cfg.GetNodeGroupsFromNodes(owned_nodes)) > > + if not owned_groups.issuperset(cur_groups): > > + raise errors.OpPrereqError("New instance %s's node groups changed > > since" > > + " locks were acquired, current groups > > are" > > + " are '%s', owning groups '%s'; retry > > the" > > + " operation" % > > + (self.op.instance_name, > > + utils.CommaJoin(cur_groups), > > + utils.CommaJoin(owned_groups)), > > + errors.ECODE_STATE) > > + > > self._CalculateFileStorageDir() > > > > if self.op.mode == constants.INSTANCE_IMPORT: > > @@ -957,6 +988,9 @@ class LUInstanceCreate(LogicalUnit): > > ReleaseLocks(self, locking.LEVEL_NODE, keep=keep_locks) > > ReleaseLocks(self, locking.LEVEL_NODE_RES, keep=keep_locks) > > ReleaseLocks(self, locking.LEVEL_NODE_ALLOC) > > + # Release all unneeded group locks > > + ReleaseLocks(self, locking.LEVEL_NODEGROUP, > > + keep=self.cfg.GetNodeGroupsFromNodes(keep_locks)) > > > > assert (self.owned_locks(locking.LEVEL_NODE) == > > self.owned_locks(locking.LEVEL_NODE_RES)), \ > > -- > > 1.8.4.1 > > > > > > > -- > Thomas Thrainer | Software Engineer | [email protected] | > > Google Germany GmbH > Dienerstr. 12 > 80331 München > > Registergericht und -nummer: Hamburg, HRB 86891 > Sitz der Gesellschaft: Hamburg > Geschäftsführer: Graham Law, Christine Elizabeth Flores
signature.asc
Description: Digital signature
