On 10/13/2014 06:26 PM, Andres Freund wrote:
On 2014-10-13 17:56:10 +0300, Heikki Linnakangas wrote:
So the gist of the problem is that LWLockRelease doesn't wake up
LW_WAIT_UNTIL_FREE waiters, when releaseOK == false. It should, because a
LW_WAIT_UNTIL FREE waiter is now free to run if the variable has changed in
value, and it won't steal the lock from the other backend that's waiting to
get the lock in exclusive mode, anyway.

I'm not a big fan of that change. Right now we don't iterate the waiters
if releaseOK isn't set. Which is good for the normal lwlock code because
it avoids pointer indirections (of stuff likely residing on another
cpu).

I can't get excited about that. It's pretty rare for releaseOK to be false, and when it's true, you iterate the waiters anyway.

Wouldn't it be more sensible to reset releaseOK in *UpdateVar()? I
might just miss something here.

That's not enough. There's no LWLockUpdateVar involved in the example I gave. And LWLockUpdateVar() already wakes up all LW_WAIT_UNTIL_FREE waiters, regardless of releaseOK.

Hmm, we could set releaseOK in LWLockWaitForVar(), though, when it (re-)queues the backend. That would be less invasive, for sure (attached). I like this better.

BTW, attached is a little test program I wrote to reproduce this more easily. It exercises the LWLock* calls directly. To run, make and install, and do "CREATE EXTENSION lwlocktest". Then launch three backends concurrently that run "select lwlocktest(1)", "select lwlocktest(2)" and "select lwlocktest(3)". It will deadlock within seconds.

- Heikki

diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 5453549..cee3f08 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -482,6 +482,7 @@ static inline bool
 LWLockAcquireCommon(LWLock *l, LWLockMode mode, uint64 *valptr, uint64 val)
 {
 	volatile LWLock *lock = l;
+	volatile uint64 *valp = valptr;
 	PGPROC	   *proc = MyProc;
 	bool		retry = false;
 	bool		result = true;
@@ -637,8 +638,8 @@ LWLockAcquireCommon(LWLock *l, LWLockMode mode, uint64 *valptr, uint64 val)
 	}
 
 	/* If there's a variable associated with this lock, initialize it */
-	if (valptr)
-		*valptr = val;
+	if (valp)
+		*valp = val;
 
 	/* We are done updating shared state of the lock itself. */
 	SpinLockRelease(&lock->mutex);
@@ -976,6 +977,12 @@ LWLockWaitForVar(LWLock *l, uint64 *valptr, uint64 oldval, uint64 *newval)
 			lock->tail = proc;
 		lock->head = proc;
 
+		/*
+		 * Set releaseOK, to make sure we get woken up as soon as the lock is
+		 * released.
+		 */
+		lock->releaseOK = true;
+
 		/* Can release the mutex now */
 		SpinLockRelease(&lock->mutex);
 

Attachment: lwlocktest.tar.gz
Description: application/gzip

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to