Andres Freund <and...@2ndquadrant.com> writes: > On 2015-02-14 14:10:53 -0500, Tom Lane wrote: >> Andres Freund <and...@2ndquadrant.com> writes: >>> If you just gdb into the VACUUM process with 6647248e370884 checked out, >>> and do a PGSemaphoreUnlock(&MyProc->sem) you'll hit it as well. I think >>> we should simply move the buf->flags &= ~BM_PIN_COUNT_WAITER (Inside >>> LockBuffer) to LockBufferForCleanup, besides the PinCountWaitBuf = >>> NULL. Afaics, that should do the trick.
>> If we're moving the responsibility for clearing that flag from the waker >> to the wakee, > I'm not sure if that's the best plan. Some buffers are pinned at an > incredible rate, sending a signal everytime might actually delay the > pincount waiter from actually getting through the loop. Hm, good point. On the other hand, should we worry about the possibility of a lost signal? Moving the flag-clearing would guard against that, which the current code does not. But we've not seen field reports of such issues AFAIR, so this might not be an important consideration. > Unless we block > further buffer pins by any backend while the flag is set, which I think > would likely not be a good idea, there seem to be little benefit in > moving the responsibility. I concur that we don't want the flag to block other backends from acquiring pins. The whole point here is for VACUUM to lurk in the background until it can proceed with deletion; we don't want it to take priority over foreground queries. > I think just adding something like > ... > /* > * Make sure waiter flag is reset - it might not be if > * ProcWaitForSignal() returned for another reason than UnpinBuffer() > * signalling us. > */ > LockBufHdr(bufHdr); > buf->flags &= ~BM_PIN_COUNT_WAITER; > Assert(bufHdr->wait_backend_pid == MyProcPid); > UnlockBufHdr(bufHdr); > PinCountWaitBuf = NULL; > /* Loop back and try again */ > } > to the bottom of the loop would suffice. No, I disagree. If we maintain the rule that the signaler clears BM_PIN_COUNT_WAITER, then once that happens there is nothing to stop a third party from trying to LockBufferForCleanup on the same buffer (except for table-level locking conventions, which IMO this mechanism shouldn't be dependent on). So this coding would potentially clear the BM_PIN_COUNT_WAITER flag belonging to that third party, and then fail the Assert --- but only in debug builds, not in production, where it would just silently lock up the third-party waiter. So I think having a test to verify that it's still "our" BM_PIN_COUNT_WAITER flag is essential. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers