Andres Freund <and...@2ndquadrant.com> wrote:

> I don't think it's actually 675333 at fault here. I think it's a
> long-standing bug in LockBufferForCleanup() that can just be hit
> much more easily with the new interrupt code.

The patches I'll be posting soon make it even easier to hit, which
is why I was trying to sort this out when Tom noticed the buildfarm
issues.

> Imagine what happens in LockBufferForCleanup() when
> ProcWaitForSignal() returns spuriously - something it's
> documented to possibly do (and which got more likely with the new
> patches). In the normal case UnpinBuffer() will have unset
> BM_PIN_COUNT_WAITER - but in a spurious return it'll still be set
> and LockBufferForCleanup() will see it still set.

That analysis makes sense to me.
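
To spell out the failure mode, here is a minimal single-threaded
sketch. It is a toy model, not the bufmgr.c code: the ToyBuf struct,
the TOY_* names, and the flag's bit value are all made up for
illustration, and the buffer header lock and the actual signaling are
left out. It assumes the waiter treats a flag that is already set on
retry as an error.

```c
#include <assert.h>

#define TOY_PIN_COUNT_WAITER 0x1u   /* hypothetical bit value */

typedef struct ToyBuf
{
    int         refcount;   /* number of pins, including the waiter's */
    unsigned    flags;
} ToyBuf;

typedef enum
{
    TOY_GOT_LOCK,           /* we hold the only pin */
    TOY_MUST_WAIT,          /* flag set, waiter would now sleep */
    TOY_FLAG_ALREADY_SET    /* somebody is already registered as waiter */
} ToyResult;

/* Releasing side, pre-patch: clears the flag before signaling the waiter. */
static void
toy_unpin(ToyBuf *buf)
{
    assert(buf->refcount > 1);
    buf->refcount--;
    if ((buf->flags & TOY_PIN_COUNT_WAITER) && buf->refcount == 1)
        buf->flags &= ~TOY_PIN_COUNT_WAITER;
}

/* One pass through the waiter's loop, up to the point where it sleeps. */
static ToyResult
toy_try_cleanup(ToyBuf *buf)
{
    if (buf->refcount == 1)
        return TOY_GOT_LOCK;
    if (buf->flags & TOY_PIN_COUNT_WAITER)
        return TOY_FLAG_ALREADY_SET;
    buf->flags |= TOY_PIN_COUNT_WAITER;
    return TOY_MUST_WAIT;
}

/* Normal case: the other pin is released before the waiter retries. */
static ToyResult
toy_normal_wakeup_demo(void)
{
    ToyBuf      buf = {2, 0};

    (void) toy_try_cleanup(&buf);   /* registers as waiter */
    toy_unpin(&buf);                /* releaser clears the flag */
    return toy_try_cleanup(&buf);
}

/* Spurious return: the waiter retries although nobody released a pin. */
static ToyResult
toy_spurious_wakeup_demo(void)
{
    ToyBuf      buf = {2, 0};

    (void) toy_try_cleanup(&buf);   /* registers as waiter */
    /* no toy_unpin() here: the wait returned without a signal */
    return toy_try_cleanup(&buf);   /* trips over its own stale flag */
}
```

In this model toy_normal_wakeup_demo() ends with TOY_GOT_LOCK, while
toy_spurious_wakeup_demo() ends with TOY_FLAG_ALREADY_SET: the waiter
finds the flag it set itself on the previous pass.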

> I think we should simply move the
>   buf->flags &= ~BM_PIN_COUNT_WAITER (Inside LockBuffer)

I think you meant inside UnpinBuffer?

> to LockBufferForCleanup, besides the PinCountWaitBuf = NULL.
> Afaics, that should do the trick.

I tried that on the master branch (33e879c); the patch is attached,
and it passes `make check-world` with no problems.  I'm reviewing the
places where BM_PIN_COUNT_WAITER appears, to see if I can spot any
flaw in this.  Does anyone else see a problem with it?  Even though
it appears to be a long-standing bug, there don't appear to have
been any field reports, so it doesn't seem like something to
back-patch.
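
As a rough illustration of what the attached change is meant to
achieve, here is a toy single-threaded model of the patched behavior.
Everything in it (ToyBuf, the TOY_* names, the bit value) is
hypothetical, and it ignores locking and the real signaling; it only
shows who clears the flag and when.

```c
#include <assert.h>

#define TOY_PIN_COUNT_WAITER 0x1u   /* hypothetical bit value */

typedef struct ToyBuf
{
    int         refcount;   /* number of pins, including the waiter's */
    unsigned    flags;
} ToyBuf;

typedef enum
{
    TOY_GOT_LOCK,
    TOY_MUST_WAIT,
    TOY_FLAG_ALREADY_SET
} ToyResult;

/* Releasing side, patched: drops the pin and leaves the flag alone. */
static void
toy_unpin(ToyBuf *buf)
{
    assert(buf->refcount > 1);
    buf->refcount--;
    /* the waiter would be signaled here when refcount reaches 1 */
}

/* One pass through the waiter's loop, up to the point where it sleeps. */
static ToyResult
toy_try_cleanup(ToyBuf *buf)
{
    if (buf->refcount == 1)
        return TOY_GOT_LOCK;
    if (buf->flags & TOY_PIN_COUNT_WAITER)
        return TOY_FLAG_ALREADY_SET;
    buf->flags |= TOY_PIN_COUNT_WAITER;
    return TOY_MUST_WAIT;
}

/* Waiter side, patched: clear our own flag after every wakeup. */
static void
toy_after_wakeup(ToyBuf *buf)
{
    buf->flags &= ~TOY_PIN_COUNT_WAITER;
}

/* A spurious wakeup followed by a real one. */
static ToyResult
toy_patched_demo(void)
{
    ToyBuf      buf = {2, 0};

    if (toy_try_cleanup(&buf) != TOY_MUST_WAIT)
        return TOY_FLAG_ALREADY_SET;
    toy_after_wakeup(&buf);         /* spurious: nobody unpinned */

    if (toy_try_cleanup(&buf) != TOY_MUST_WAIT)
        return TOY_FLAG_ALREADY_SET;
    toy_unpin(&buf);                /* the other pin goes away */
    toy_after_wakeup(&buf);         /* real wakeup */

    return toy_try_cleanup(&buf);
}
```

In this model toy_patched_demo() returns TOY_GOT_LOCK: the retry after
the spurious wakeup simply registers again, because the waiter cleared
its own flag before looping.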

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index e1e6240..40b2194 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1548,7 +1548,6 @@ UnpinBuffer(volatile BufferDesc *buf, bool fixOwner)
 			/* we just released the last pin other than the waiter's */
 			int			wait_backend_pid = buf->wait_backend_pid;
 
-			buf->flags &= ~BM_PIN_COUNT_WAITER;
 			UnlockBufHdr(buf);
 			ProcSendSignal(wait_backend_pid);
 		}
@@ -3273,6 +3272,7 @@ LockBufferForCleanup(Buffer buffer)
 		else
 			ProcWaitForSignal();
 
+		bufHdr->flags &= ~BM_PIN_COUNT_WAITER;
 		PinCountWaitBuf = NULL;
 		/* Loop back and try again */
 	}
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers