On 06/26/2017 04:20 AM, Chapman Flack wrote:
I notice CopyXLogRecordToWAL contains this loop (in the case where
the record being copied is a switch):

while (CurrPos < EndPos)
{
    /* initialize the next page (if not initialized already) */
    AdvanceXLInsertBuffer(CurrPos, false);
    CurrPos += XLOG_BLCKSZ;
}

in which it calls, one page at a time, AdvanceXLInsertBuffer, which contains
its own loop able to do a sequence of pages. A comment explains why:

 * We do this one page at a time, to make sure we don't deadlock
 * against ourselves if wal_buffers < XLOG_SEG_SIZE.

I want to make sure I understand what the deadlock potential is
in this case. AdvanceXLInsertBuffer will call WaitXLogInsertionsToFinish
before writing any dirty buffer, and we do hold insertion slot locks
(all of 'em, in the case of a log switch, because that makes
XLogInsertRecord call WALInsertLockAcquireExclusive instead of just
WALInsertLockAcquire for other record types).

Does not the fact we hold all the insertion slots exclude the possibility
that any dirty buffer (preceding the one we're touching) needs to be checked
for in-flight insertions?

Hmm. That's not the problem, though. Imagine that instead of the loop above, you do just:

AdvanceXLInsertBuffer(EndPos, false);

AdvanceXLInsertBuffer() will call XLogWrite(), to flush out any pages before EndPos, to make room in the wal_buffers for the new pages. Before doing that, it will call WaitXLogInsertionsToFinish() to wait for any insertions to those pages to be completed. But the backend itself is advertising the insertion position CurrPos, and it will therefore wait for itself, forever.
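
To make the self-wait concrete, here is a rough toy model in C. It is not PostgreSQL code: every name in it (TOY_PAGE_SIZE, toy_advance_would_self_wait and the two drivers) is made up for illustration. It assumes a ring of nbuffers pages in which every page being evicted is still dirty, and that evicting a page means waiting for any insertion advertised before the end of that page. Under those assumptions it compares a single call covering the whole tail against the page-at-a-time loop that re-advertises its position before each page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stands in for XLOG_BLCKSZ; the value is an assumption of this sketch. */
#define TOY_PAGE_SIZE UINT64_C(8192)

/*
 * Toy model, not PostgreSQL code.  Walk the pages from init_upto through
 * the page containing upto.  Initializing a page reuses the ring slot of
 * the page nbuffers pages earlier, which (assumed dirty) must be written
 * first, and that write waits for every insertion advertised before the
 * old page's end.  Returns true if that wait would hit `advertised`,
 * i.e. the caller would wait for itself.
 */
bool
toy_advance_would_self_wait(uint64_t init_upto, uint64_t upto,
                            uint64_t advertised, uint64_t nbuffers)
{
    uint64_t ring_bytes = nbuffers * TOY_PAGE_SIZE;

    assert(nbuffers > 0);
    assert(init_upto % TOY_PAGE_SIZE == 0);

    while (init_upto <= upto)
    {
        uint64_t new_page_end = init_upto + TOY_PAGE_SIZE;

        if (new_page_end > ring_bytes)
        {
            uint64_t old_page_end = new_page_end - ring_bytes;

            /* Our own advertised position blocks the eviction. */
            if (advertised < old_page_end)
                return true;
        }
        init_upto += TOY_PAGE_SIZE;
    }
    return false;
}

/* One call for the whole tail; the advertised position stays at start. */
bool
toy_single_call_deadlocks(uint64_t start, uint64_t end, uint64_t nbuffers)
{
    return toy_advance_would_self_wait(start, end - 1, start, nbuffers);
}

/* One page per call, advertising the current position before each page. */
bool
toy_page_at_a_time_deadlocks(uint64_t start, uint64_t end, uint64_t nbuffers)
{
    for (uint64_t pos = start; pos < end; pos += TOY_PAGE_SIZE)
    {
        if (toy_advance_would_self_wait(pos, pos, pos, nbuffers))
            return true;
    }
    return false;
}
```

In this model, with a 4-page ring and a 10-page tail, the single call ends up needing to evict a page that lies beyond its own advertised position, while the page-at-a-time loop never does. A tail that fits within the ring is fine either way, which matches the "wal_buffers < XLOG_SEG_SIZE" condition in the comment.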

I've been thinking along the lines of another parameter to
AdvanceXLInsertBuffer to indicate when the caller is exactly this loop
filling out the tail after a log switch (originally, to avoid filling
in page headers). It now seems to me that, if AdvanceXLInsertBuffer
has that information, it could also be safe for it to skip the
WaitXLogInsertionsToFinish in that case. Would that eliminate the
deadlock potential, and allow the loop in CopyXLogRecordToWAL to be
replaced with a single call to AdvanceXLInsertBuffer and a single
WALInsertLockUpdateInsertingAt ?

Or have I overlooked some other subtlety?

The most straightforward solution would be to just clear each page with memset() in the loop. It's a bit wasteful to clear the page again, just after AdvanceXLInsertBuffer() has initialized it, but this isn't performance-critical.
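
Roughly, the shape I have in mind is something like the sketch below. It is a standalone toy, not a patch: TOY_BLCKSZ and toy_zero_tail_pages are made-up names, a flat char array stands in for the WAL buffers, and the place where the real loop would call AdvanceXLInsertBuffer() is only marked with a comment. It assumes curr and end are page-aligned offsets within the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Stands in for XLOG_BLCKSZ; the value is an assumption of this sketch. */
#define TOY_BLCKSZ 8192

/*
 * Toy model, not PostgreSQL code.  Zero every page in [curr, end) of a
 * flat buffer, one page per iteration, and return the number of pages
 * cleared.  curr and end are assumed to be page-aligned offsets into buf.
 */
size_t
toy_zero_tail_pages(char *buf, size_t curr, size_t end)
{
    size_t pages = 0;

    while (curr < end)
    {
        /* The real loop would call AdvanceXLInsertBuffer() here. */

        /* Clear the whole page, including whatever header was set up. */
        memset(buf + curr, 0, TOY_BLCKSZ);
        curr += TOY_BLCKSZ;
        pages++;
    }
    return pages;
}
```

The loop stays one page at a time, so the deadlock consideration above is unaffected; the only cost is the redundant clearing.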

- Heikki
