On 10.06.2012 23:39, Jeff Janes wrote:
> The attached patch fixes that by remembering up to 10 local locks, and
> pushing them up specifically rather than digging through the entire
> lock table. If the list overflows, then it reverts to the old
> behavior of digging through the entire local lock table.
I'm a bit uncomfortable with having a fixed limit like that, after which
you fall off the cliff. But I guess we can live with that, since we've
had the quadratic behavior for years, and haven't heard complaints about
it until recently. We can devise some more complex scheme later, if
people still run into this.
One idea for a more complex scheme, should we need one in the future, is
to make the cache in the ResourceOwner expandable, like all the other
arrays in ResourceOwner. It would not overflow when new entries are
added to it. However, if you try to forget a lock, and it's not found in
the bottom 10 entries or so of the cache, mark the cache as overflowed
at that point. That way reassigning the locks would be fast, even if the
current resource owner holds a lot of locks, as long as you haven't
tried to release any of them out of LIFO order.
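
In code, that scheme might look something like this (a rough sketch
only; the expandable "locks" array and the "overflowed" flag are
hypothetical fields, not part of the attached patch):

/*
 * Rough sketch only. The expandable "locks" array and the "overflowed"
 * flag are hypothetical fields, not part of the attached patch.
 */
static void
ResourceOwnerForgetLockExpandable(ResourceOwner owner, LOCALLOCK *locallock)
{
	int			i;
	int			stop = Max(owner->nlocks - 10, 0);

	if (owner->overflowed)
		return;					/* cache no longer reliable */

	/* Only search the ~10 most recently remembered entries. */
	for (i = owner->nlocks - 1; i >= stop; i--)
	{
		if (owner->locks[i] == locallock)
		{
			owner->locks[i] = owner->locks[owner->nlocks - 1];
			owner->nlocks--;
			return;
		}
	}

	/*
	 * Out-of-order release: stop trusting the cache from here on, and
	 * fall back to scanning the local lock hash table at release time.
	 */
	owner->overflowed = true;
}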
> When it needs to forget a lock, it searches backwards in the list of
> remembered locks and then just moves the last lock into the place of the
> one to be forgotten. Other ResourceOwner Forget functions slide the
> entire list down to close the gap, rather than using the
> selection-sort-like method. I don't understand why they do that. If
> Forgets are strictly LIFO, then it would not make a difference. If
> they are mostly LIFO but occasionally not, maybe the insertion method
> would win over the selection method.
Yeah, I think that's the reason. We try to keep the arrays in LIFO
order. If you move the last entry into the removed slot, then even a
single out-of-order removal will spoil the heuristic that removals are
done in LIFO order. Imagine that you insert 5 entries in order:
1
2
3
4
5
Now imagine that you first remove 1 (out-of-order), and then 5, 4, 3,
and 2 (in order). The removal of 1 moves 5 to the beginning of the
array. Then you remove 5, and that has to traverse the whole list
because 5 is now at the beginning. Then you swap 4 into the beginning of
the list, and so forth. Every subsequent removal scans the list from
bottom to top, because of the single out-of-order removal.
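
To make the trade-off concrete, the two removal strategies look roughly
like this (a standalone sketch, not the actual resowner.c code):

/* Swap method: O(1) to close the gap, but breaks the LIFO order. */
static void
forget_swap(int *arr, int *n, int victim)
{
	int			i;

	for (i = *n - 1; i >= 0; i--)
	{
		if (arr[i] == victim)
		{
			arr[i] = arr[*n - 1];	/* move last entry into the hole */
			(*n)--;
			return;
		}
	}
}

/* Slide method: O(k) to close the gap, but preserves the LIFO order. */
static void
forget_slide(int *arr, int *n, int victim)
{
	int			i;

	for (i = *n - 1; i >= 0; i--)
	{
		if (arr[i] == victim)
		{
			for (; i < *n - 1; i++)
				arr[i] = arr[i + 1];	/* slide the rest down */
			(*n)--;
			return;
		}
	}
}

With the sequence above, forget_swap leaves [5, 2, 3, 4] after removing
1, so each of the remaining in-order removals scans the whole array;
forget_slide leaves [2, 3, 4, 5], and each remaining removal succeeds on
the first comparison.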
> From what I can tell, locks are
> mostly released in bulk anyway at transaction end, and rarely released
> explicitly.
Hmm, if they are released in bulk, then it doesn't matter either way. So
the decision on whether to do the slide or swap has to be made on the
assumption that some locks are released out-of-order. With an array of
10-15 entries, it probably doesn't make any difference either way, though.
> For degrading performance in other cases, I think the best test case
> is "pgbench -P" (implemented in another patch in this commitfest),
> which has a loop which pushes one or two locks up from a portal to the
> parent (which already owns them, due to previous rounds of the same
> loop) very frequently. There might be a performance degradation of
> 0.5% or so, but it is less than the run-to-run variation. I plan to
> run some longer tests to get a better estimate. If there is a
> degradation in that range, how important is that?
I think that would be acceptable.
I found the interface between resowner.c and lock.c a bit confusing.
resowner.c would sometimes call LockReassignCurrentOwner() to reassign
all the locks at once, and sometimes it would call it on each individual
lock, with the same net effect as calling
LockReleaseCurrentOwner(). And it doesn't seem right for
ResourceOwnerRemember/ForgetLock to have to accept a NULL owner.
I rearranged that so that there's just a single
LockReassignCurrentOwner() function, like before this patch. But it
takes as an optional argument a list of locks held by the current
resource owner. If the caller knows that list, it can pass it to make the
call faster; if it doesn't, it can just pass NULL and the function
will traverse the hash table the old-fashioned way. I think that's a
better API.
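
For example, the reassign path at the call site boils down to this
(condensed from the resowner.c hunk in the patch below):

	/* Condensed from ResourceOwnerReleaseInternal in the patch below. */
	if (owner->nlocks > MAX_RESOWNER_LOCKS)
		LockReassignCurrentOwner(NULL, 0);	/* cache overflowed, scan hash */
	else
		LockReassignCurrentOwner(owner->locks, owner->nlocks);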
Please take a look to see if I broke something.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 9717075..9dd9c33 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -345,6 +345,7 @@ static void BeginStrongLockAcquire(LOCALLOCK *locallock, uint32 fasthashcode);
static void FinishStrongLockAcquire(void);
static void WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner);
static void ReleaseLockIfHeld(LOCALLOCK *locallock, bool sessionLock);
+static void LockReassignOwner(LOCALLOCK *locallock, ResourceOwner parent);
static bool UnGrantLock(LOCK *lock, LOCKMODE lockmode,
PROCLOCK *proclock, LockMethod lockMethodTable);
static void CleanUpLock(LOCK *lock, PROCLOCK *proclock,
@@ -1098,8 +1099,16 @@ SetupLockInTable(LockMethod lockMethodTable, PGPROC *proc,
static void
RemoveLocalLock(LOCALLOCK *locallock)
{
+ int i;
+
+ for (i = locallock->numLockOwners - 1; i >= 0; i--)
+ {
+ if (locallock->lockOwners[i].owner != NULL)
+ ResourceOwnerForgetLock(locallock->lockOwners[i].owner, locallock);
+ }
pfree(locallock->lockOwners);
locallock->lockOwners = NULL;
+
if (locallock->holdsStrongLockCount)
{
uint32 fasthashcode;
@@ -1111,6 +1120,7 @@ RemoveLocalLock(LOCALLOCK *locallock)
locallock->holdsStrongLockCount = FALSE;
SpinLockRelease(&FastPathStrongRelationLocks->mutex);
}
+
if (!hash_search(LockMethodLocalHash,
(void *) &(locallock->tag),
HASH_REMOVE, NULL))
@@ -1354,6 +1364,8 @@ GrantLockLocal(LOCALLOCK *locallock, ResourceOwner owner)
lockOwners[i].owner = owner;
lockOwners[i].nLocks = 1;
locallock->numLockOwners++;
+ if (owner != NULL)
+ ResourceOwnerRememberLock(owner, locallock);
}
/*
@@ -1669,6 +1681,8 @@ LockRelease(const LOCKTAG *locktag, LOCKMODE lockmode, bool sessionLock)
Assert(lockOwners[i].nLocks > 0);
if (--lockOwners[i].nLocks == 0)
{
+ if (owner != NULL)
+ ResourceOwnerForgetLock(owner, locallock);
/* compact out unused slot */
locallock->numLockOwners--;
if (i < locallock->numLockOwners)
@@ -1861,14 +1875,13 @@ LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks)
{
LOCALLOCKOWNER *lockOwners = locallock->lockOwners;
- /* If it's above array position 0, move it down to 0 */
- for (i = locallock->numLockOwners - 1; i > 0; i--)
+ /* If session lock is above array position 0, move it down to 0 */
+ for (i = 0; i < locallock->numLockOwners; i++)
{
if (lockOwners[i].owner == NULL)
- {
lockOwners[0] = lockOwners[i];
- break;
- }
+ else
+ ResourceOwnerForgetLock(lockOwners[i].owner, locallock);
}
if (locallock->numLockOwners > 0 &&
@@ -1881,6 +1894,8 @@ LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks)
/* We aren't deleting this locallock, so done */
continue;
}
+ else
+ locallock->numLockOwners = 0;
}
/*
@@ -2066,18 +2081,31 @@ LockReleaseSession(LOCKMETHODID lockmethodid)
/*
* LockReleaseCurrentOwner
* Release all locks belonging to CurrentResourceOwner
+ *
+ * If the caller knows what those locks are, it can pass them as an array.
+ * That speeds up the call significantly, when a lot of locks are held.
+ * Otherwise, pass NULL for locallocks, and we'll traverse through our hash
+ * table to find them.
*/
void
-LockReleaseCurrentOwner(void)
+LockReleaseCurrentOwner(LOCALLOCK **locallocks, int nlocks)
{
- HASH_SEQ_STATUS status;
- LOCALLOCK *locallock;
+ if (locallocks == NULL)
+ {
+ HASH_SEQ_STATUS status;
+ LOCALLOCK *locallock;
- hash_seq_init(&status, LockMethodLocalHash);
+ hash_seq_init(&status, LockMethodLocalHash);
- while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
+ while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
+ ReleaseLockIfHeld(locallock, false);
+ }
+ else
{
- ReleaseLockIfHeld(locallock, false);
+ int i;
+
+ for (i = nlocks - 1; i >= 0; i--)
+ ReleaseLockIfHeld(locallocks[i], false);
}
}
@@ -2123,6 +2151,8 @@ ReleaseLockIfHeld(LOCALLOCK *locallock, bool sessionLock)
locallock->nLocks -= lockOwners[i].nLocks;
/* compact out unused slot */
locallock->numLockOwners--;
+ if (owner != NULL)
+ ResourceOwnerForgetLock(owner, locallock);
if (i < locallock->numLockOwners)
lockOwners[i] = lockOwners[locallock->numLockOwners];
}
@@ -2145,57 +2175,83 @@ ReleaseLockIfHeld(LOCALLOCK *locallock, bool sessionLock)
/*
* LockReassignCurrentOwner
* Reassign all locks belonging to CurrentResourceOwner to belong
- * to its parent resource owner
+ * to its parent resource owner.
+ *
+ * If the caller knows what those locks are, it can pass them as an array.
+ * That speeds up the call significantly, when a lot of locks are held
+ * (e.g. pg_dump with a large schema). Otherwise, pass NULL for locallocks,
+ * and we'll traverse through our hash table to find them.
*/
void
-LockReassignCurrentOwner(void)
+LockReassignCurrentOwner(LOCALLOCK **locallocks, int nlocks)
{
ResourceOwner parent = ResourceOwnerGetParent(CurrentResourceOwner);
- HASH_SEQ_STATUS status;
- LOCALLOCK *locallock;
- LOCALLOCKOWNER *lockOwners;
Assert(parent != NULL);
- hash_seq_init(&status, LockMethodLocalHash);
+ if (locallocks == NULL)
+ {
+ HASH_SEQ_STATUS status;
+ LOCALLOCK *locallock;
- while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
+ hash_seq_init(&status, LockMethodLocalHash);
+
+ while ((locallock = (LOCALLOCK *) hash_seq_search(&status)) != NULL)
+ LockReassignOwner(locallock, parent);
+ }
+ else
{
- int i;
- int ic = -1;
- int ip = -1;
+ int i;
- /*
- * Scan to see if there are any locks belonging to current owner or
- * its parent
- */
- lockOwners = locallock->lockOwners;
- for (i = locallock->numLockOwners - 1; i >= 0; i--)
- {
- if (lockOwners[i].owner == CurrentResourceOwner)
- ic = i;
- else if (lockOwners[i].owner == parent)
- ip = i;
- }
+ for (i = nlocks - 1; i >= 0; i--)
+ LockReassignOwner(locallocks[i], parent);
+ }
+}
- if (ic < 0)
- continue; /* no current locks */
+/*
+ * Subroutine of LockReassignCurrentOwner. Reassigns a given lock belonging to
+ * CurrentResourceOwner to its parent.
+ */
+static void
+LockReassignOwner(LOCALLOCK *locallock, ResourceOwner parent)
+{
+ LOCALLOCKOWNER *lockOwners;
+ int i;
+ int ic = -1;
+ int ip = -1;
- if (ip < 0)
- {
- /* Parent has no slot, so just give it child's slot */
- lockOwners[ic].owner = parent;
- }
- else
- {
- /* Merge child's count with parent's */
- lockOwners[ip].nLocks += lockOwners[ic].nLocks;
- /* compact out unused slot */
- locallock->numLockOwners--;
- if (ic < locallock->numLockOwners)
- lockOwners[ic] = lockOwners[locallock->numLockOwners];
- }
+ /*
+ * Scan to see if there are any locks belonging to current owner or
+ * its parent
+ */
+ lockOwners = locallock->lockOwners;
+ for (i = locallock->numLockOwners - 1; i >= 0; i--)
+ {
+ if (lockOwners[i].owner == CurrentResourceOwner)
+ ic = i;
+ else if (lockOwners[i].owner == parent)
+ ip = i;
+ }
+
+ if (ic < 0)
+ return; /* no current locks */
+
+ if (ip < 0)
+ {
+ /* Parent has no slot, so just give it the child's slot */
+ lockOwners[ic].owner = parent;
+ ResourceOwnerRememberLock(parent, locallock);
+ }
+ else
+ {
+ /* Merge child's count with parent's */
+ lockOwners[ip].nLocks += lockOwners[ic].nLocks;
+ /* compact out unused slot */
+ locallock->numLockOwners--;
+ if (ic < locallock->numLockOwners)
+ lockOwners[ic] = lockOwners[locallock->numLockOwners];
}
+ ResourceOwnerForgetLock(CurrentResourceOwner, locallock);
}
/*
diff --git a/src/backend/utils/resowner/resowner.c b/src/backend/utils/resowner/resowner.c
index 7c771eb..302da7a 100644
--- a/src/backend/utils/resowner/resowner.c
+++ b/src/backend/utils/resowner/resowner.c
@@ -27,6 +27,23 @@
#include "utils/rel.h"
#include "utils/snapmgr.h"
+/*
+ * To speed up bulk releasing or reassigning locks from a resource owner to
+ * its parent, each resource owner has a small cache of locks it owns. The
+ * lock manager has the same information in its local lock hash table, and
+ * we fall back on that if the cache overflows, but traversing the hash table
+ * is slower when there are a lot of locks belonging to other resource owners.
+ *
+ * MAX_RESOWNER_LOCKS is the size of the per-resource owner cache. It's
+ * chosen based on some testing with pg_dump with a large schema. When the
+ * tests were done (on 9.2), resource owners in a pg_dump run contained up
+ * to 9 locks, regardless of the schema size, except for the top resource
+ * owner which contained much more (overflowing the array). 15 seems like a
+ * nice round number that's somewhat higher than what pg_dump needs. Note that
+ * making this number larger is not free - the bigger the array, the slower
+ * it is to release locks (in retail), when a resource owner holds many locks.
+ */
+#define MAX_RESOWNER_LOCKS 15
/*
* ResourceOwner objects look like this
@@ -43,6 +60,10 @@ typedef struct ResourceOwnerData
Buffer *buffers; /* dynamically allocated array */
int maxbuffers; /* currently allocated array size */
+ /* We can remember up to MAX_RESOWNER_LOCKS references to local locks. */
+ int nlocks; /* number of owned locks */
+ LOCALLOCK *locks[MAX_RESOWNER_LOCKS]; /* list of owned locks */
+
/* We have built-in support for remembering catcache references */
int ncatrefs; /* number of owned catcache pins */
HeapTuple *catrefs; /* dynamically allocated array */
@@ -272,11 +293,30 @@ ResourceOwnerReleaseInternal(ResourceOwner owner,
* subtransaction, we do NOT release its locks yet, but transfer
* them to the parent.
*/
+ LOCALLOCK **locks;
+ int nlocks;
+
Assert(owner->parent != NULL);
+
+ /*
+ * Pass the list of locks owned by this resource owner to the lock
+ * manager, unless it has overflowed.
+ */
+ if (owner->nlocks > MAX_RESOWNER_LOCKS)
+ {
+ locks = NULL;
+ nlocks = 0;
+ }
+ else
+ {
+ locks = owner->locks;
+ nlocks = owner->nlocks;
+ }
+
if (isCommit)
- LockReassignCurrentOwner();
+ LockReassignCurrentOwner(locks, nlocks);
else
- LockReleaseCurrentOwner();
+ LockReleaseCurrentOwner(locks, nlocks);
}
}
else if (phase == RESOURCE_RELEASE_AFTER_LOCKS)
@@ -357,6 +397,7 @@ ResourceOwnerDelete(ResourceOwner owner)
/* And it better not own any resources, either */
Assert(owner->nbuffers == 0);
+ Assert(owner->nlocks == 0 || owner->nlocks == MAX_RESOWNER_LOCKS + 1);
Assert(owner->ncatrefs == 0);
Assert(owner->ncatlistrefs == 0);
Assert(owner->nrelrefs == 0);
@@ -589,6 +630,56 @@ ResourceOwnerForgetBuffer(ResourceOwner owner, Buffer buffer)
}
/*
+ * Remember that a Local Lock is owned by a ResourceOwner
+ *
+ * This is different from the other Remember functions in that the list of
+ * locks is only a lossy cache. It can hold up to MAX_RESOWNER_LOCKS entries,
+ * and when it overflows, we stop tracking locks. The point of remembering
+ * only up to MAX_RESOWNER_LOCKS entries is that if a lot of locks are held,
+ * ResourceOwnerForgetLock doesn't need to scan through a large array to find
+ * the entry.
+ */
+void
+ResourceOwnerRememberLock(ResourceOwner owner, LOCALLOCK *locallock)
+{
+ if (owner->nlocks > MAX_RESOWNER_LOCKS)
+ return; /* we have already overflowed */
+
+ if (owner->nlocks < MAX_RESOWNER_LOCKS)
+ owner->locks[owner->nlocks] = locallock;
+ else
+ {
+ /* overflowed */
+ }
+ owner->nlocks++;
+}
+
+/*
+ * Forget that a Local Lock is owned by a ResourceOwner
+ */
+void
+ResourceOwnerForgetLock(ResourceOwner owner, LOCALLOCK *locallock)
+{
+ int i;
+
+ if (owner->nlocks > MAX_RESOWNER_LOCKS)
+ return; /* we have overflowed */
+
+ Assert(owner->nlocks > 0);
+ for (i = owner->nlocks - 1; i >= 0; i--)
+ {
+ if (locallock == owner->locks[i])
+ {
+ owner->locks[i] = owner->locks[owner->nlocks - 1];
+ owner->nlocks--;
+ return;
+ }
+ }
+ elog(PANIC, "lock reference %p is not owned by resource owner %s",
+ locallock, owner->name);
+}
+
+/*
* Make sure there is room for at least one more entry in a ResourceOwner's
* catcache reference array.
*
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index 17b8942..1951854 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -492,8 +492,8 @@ extern bool LockRelease(const LOCKTAG *locktag,
LOCKMODE lockmode, bool sessionLock);
extern void LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks);
extern void LockReleaseSession(LOCKMETHODID lockmethodid);
-extern void LockReleaseCurrentOwner(void);
-extern void LockReassignCurrentOwner(void);
+extern void LockReleaseCurrentOwner(LOCALLOCK **locallocks, int nlocks);
+extern void LockReassignCurrentOwner(LOCALLOCK **locallocks, int nlocks);
extern VirtualTransactionId *GetLockConflicts(const LOCKTAG *locktag,
LOCKMODE lockmode);
extern void AtPrepare_Locks(void);
diff --git a/src/include/utils/resowner.h b/src/include/utils/resowner.h
index 11034e4..399ea2b 100644
--- a/src/include/utils/resowner.h
+++ b/src/include/utils/resowner.h
@@ -20,6 +20,7 @@
#define RESOWNER_H
#include "storage/fd.h"
+#include "storage/lock.h"
#include "utils/catcache.h"
#include "utils/plancache.h"
#include "utils/snapshot.h"
@@ -89,6 +90,11 @@ extern void ResourceOwnerEnlargeBuffers(ResourceOwner owner);
extern void ResourceOwnerRememberBuffer(ResourceOwner owner, Buffer buffer);
extern void ResourceOwnerForgetBuffer(ResourceOwner owner, Buffer buffer);
+
+/* support for local lock management */
+extern void ResourceOwnerRememberLock(ResourceOwner owner, LOCALLOCK *locallock);
+extern void ResourceOwnerForgetLock(ResourceOwner owner, LOCALLOCK *locallock);
+
/* support for catcache refcount management */
extern void ResourceOwnerEnlargeCatCacheRefs(ResourceOwner owner);
extern void ResourceOwnerRememberCatCacheRef(ResourceOwner owner,