On Wed, Oct 01, 2014 at 01:21:41PM -0400, Bob Peterson wrote:
> Hi,
>
> This patch adds a new lock flag, DLM_LKF_NOLOOKUP, which instructs DLM
> to refrain from sending lookup requests in cases where the lock library
> node is not the current node. This is similar to the DLM_LKF_NOQUEUE
> flag, except it fails locks that would require a lookup, with -EAGAIN.

Can we just use NOQUEUE?  It tells you that there's a lock conflict, which
tells you to move along and try another if you don't want to contend.  If
you cache acquired locks and reuse them, then it doesn't matter if the
master node is remote or local.

If lookups are a problem in general, there is the "nodir" lockspace mode,
which replaces the resource directory lookup system with a static mapping
of resources to master nodes.

> This is not just about saving a network operation. It allows callers
> like GFS2 to master locks for which they are the directory node. Each
> node can then "prefer" local locks, especially in the case of GFS2
> selecting resource groups for block allocations (implemented with a
> separate patch). This mastering of local locks distributes the locks
> between the nodes (at least until nodes enter or leave the cluster),
> which tends to make each node "keep to itself" when doing allocations.
> Thus, dlm communications are kept to a minimum, which results in
> significantly faster block allocations.

Back in 2002 I solved what sounds like the same problem in gfs(1).  It
allowed all nodes to allocate blocks independent of each other, without
constant locking.  You can see the solution here:

https://git.fedorahosted.org/cgit/cluster.git/tree/gfs-kernel/src/gfs/rgrp.c?h=RHEL4
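
For illustration, the idea behind a static mapping of resources to master
nodes (as in "nodir" mode, mentioned above) can be sketched roughly as
follows.  This is a simplified model, not the dlm code: the function name
master_node() is hypothetical, and crc32 is only a stand-in for whatever
hash the real implementation uses.  The point is that every node computes
the same master from the resource name alone, so no lookup message is
needed.

```python
import zlib


def master_node(resource_name: bytes, node_ids: list[int]) -> int:
    """Pick a master node for a resource from its name alone.

    Hypothetical sketch: every node runs the same computation over the
    same membership list, so all nodes agree without a directory lookup.
    """
    if not node_ids:
        raise ValueError("lockspace has no member nodes")
    # Sort so the result does not depend on the order in which each
    # node happens to list the members.
    ordered = sorted(node_ids)
    # crc32 is an assumption standing in for the real hash function.
    h = zlib.crc32(resource_name)
    return ordered[h % len(ordered)]
```

Note that in a scheme like this the mapping changes when nodes enter or
leave, since the membership list is an input to the computation.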