Hi all,
On 2012-11-27 19:52:09 +0000, [email protected] wrote:
> This happens only if wal_level=hot_standby.
>
>
> Here are the steps to reproduce this issue.
Oh. Fucking. Wow.
I think this tiny comment just helped me find the bug. After previously
having looked for it without success for some time I just recognized the
problem while working on something unrelated.
Here it goes:
void
CreateCheckPoint(int flags)
{
...
    /*
     * Get the other info we need for the checkpoint record.
     */
    LWLockAcquire(XidGenLock, LW_SHARED);
    checkPoint.nextXid = ShmemVariableCache->nextXid;
    checkPoint.oldestXid = ShmemVariableCache->oldestXid;
    checkPoint.oldestXidDB = ShmemVariableCache->oldestXidDB;
    LWLockRelease(XidGenLock);
    /* Increase XID epoch if we've wrapped around since last checkpoint */
    checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch;
    if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid)
        checkPoint.nextXidEpoch++;
// i.e. compute the epoch based on the *current* nextXid.
...
// do all the writing, take some time
    CheckPointGuts(checkPoint.redo, flags);
...
     * Update checkPoint.nextXid since we have a later value
     */
    if (!shutdown && XLogStandbyInfoActive())
        LogStandbySnapshot(&checkPoint.nextXid);
...
    /* Update shared-memory copy of checkpoint XID/epoch */
    {
        /* use volatile pointer to prevent code rearrangement */
        volatile XLogCtlData *xlogctl = XLogCtl;
        SpinLockAcquire(&xlogctl->info_lck);
        xlogctl->ckptXidEpoch = checkPoint.nextXidEpoch;
        xlogctl->ckptXid = checkPoint.nextXid;
        SpinLockRelease(&xlogctl->info_lck);
    }
...
}
Notice the end of the comment above/about LogStandbySnapshot. It updates
nextXid! But it does *not* recheck whether we have had a wraparound, and
so it does not increase the epoch counter, although we might have had a
wraparound in between.
Which in turn means that the ShmemVariableCache->nextXid <
xlogctl->ckptXid computation in GetNextXidAndEpoch doesn't return
correct results anymore because it can't recognize that we're in a new
epoch now.
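To make the ordering problem concrete, here is a minimal standalone sketch
of it. This is not the xlog.c code: CkptSketch, epoch_for, checkpoint_buggy,
checkpoint_fixed and full_xid are hypothetical names made up for
illustration, and the XID values in the note below are invented. It only
assumes the two comparisons quoted above (the epoch bump in CreateCheckPoint
and the nextXid < ckptXid check in GetNextXidAndEpoch).

```c
#include <stdint.h>

typedef uint32_t TransactionId;

/* Hypothetical stand-in for the two checkpoint fields discussed above. */
typedef struct
{
    TransactionId nextXid;
    uint32_t      nextXidEpoch;
} CkptSketch;

/* Epoch rule: bump the epoch if nextXid is numerically behind the
 * nextXid stored in the previous checkpoint. */
uint32_t
epoch_for(CkptSketch prev, TransactionId nextXid)
{
    uint32_t epoch = prev.nextXidEpoch;

    if (nextXid < prev.nextXid)
        epoch++;
    return epoch;
}

/* Buggy order: the epoch is derived from the early nextXid, then nextXid
 * is overwritten with the later value (the LogStandbySnapshot step). */
CkptSketch
checkpoint_buggy(CkptSketch prev, TransactionId early, TransactionId late)
{
    CkptSketch c;

    c.nextXid = early;
    c.nextXidEpoch = epoch_for(prev, c.nextXid);
    c.nextXid = late;           /* epoch is not rechecked here */
    return c;
}

/* Fixed order: take the later nextXid first, then derive the epoch. */
CkptSketch
checkpoint_fixed(CkptSketch prev, TransactionId late)
{
    CkptSketch c;

    c.nextXid = late;
    c.nextXidEpoch = epoch_for(prev, c.nextXid);
    return c;
}

/* Reader side, modelled on the comparison in GetNextXidAndEpoch:
 * combine the checkpoint epoch with the current nextXid. */
uint64_t
full_xid(CkptSketch ckpt, TransactionId current_next_xid)
{
    uint32_t epoch = ckpt.nextXidEpoch;

    if (current_next_xid < ckpt.nextXid)
        epoch++;
    return ((uint64_t) epoch << 32) | current_next_xid;
}
```

With a previous checkpoint at nextXid 4294967000 in epoch 0, an early
nextXid of 4294967200 and a later, wrapped nextXid of 100, the buggy order
stores epoch 0 together with xid 100, so a later nextXid of 500 is reported
as still being in epoch 0. The fixed order stores epoch 1 for the same
inputs.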
Trivial patch attached.
I am not sure that I understand why it's interesting to have a newer
->nextXid in the checkpoint, but I don't see anything that would
make it dangerous besides this.
This looks like it also explains #6291 and the slightly different issue
described in CAAZKuFbB7UR3NXV1pkZFRXy=6V1QBq_OeHJWJNTLpKBpH=q...@mail.gmail.com,
as the issue is just as present on standbys as it is on primaries.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From bf1ff10acf7dbfd9a6046c5fcb4bb816485a1e7b Mon Sep 17 00:00:00 2001
From: Andres Freund <[email protected]>
Date: Sat, 1 Dec 2012 23:51:34 +0100
Subject: [PATCH] Fix xid epoch calculation with wal_level=hot_standby
When wal_level=hot_standby CreateCheckPoint also logs running
transactions. During the computation of those it recomputes
CheckPoint->nextXid. Unfortunately it does so after it has been used to compute
CheckPoint->nextXidEpoch.
Move code around slightly to fix the issue.
This fixes #6291, #7710 and the slightly different issue described in
CAAZKuFbB7UR3NXV1pkZFRXy=6V1QBq_OeHJWJNTLpKBpH=q...@mail.gmail.com, as the
issue is just as present on standbys as it is on primaries.
Bug found by Daniel Farina and Tarvi Pillessaar
---
src/backend/access/transam/xlog.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 8b19976..4760081 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7072,11 +7072,6 @@ CreateCheckPoint(int flags)
checkPoint.oldestXidDB = ShmemVariableCache->oldestXidDB;
LWLockRelease(XidGenLock);
- /* Increase XID epoch if we've wrapped around since last checkpoint */
- checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch;
- if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid)
- checkPoint.nextXidEpoch++;
-
LWLockAcquire(OidGenLock, LW_SHARED);
checkPoint.nextOid = ShmemVariableCache->nextOid;
if (!shutdown)
@@ -7115,6 +7110,14 @@ CreateCheckPoint(int flags)
START_CRIT_SECTION();
/*
+ * Increase XID epoch if we've wrapped around since last checkpoint, do
+ * this after LogStandbySnapshot which updates nextXid.
+ */
+ checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch;
+ if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid)
+ checkPoint.nextXidEpoch++;
+
+ /*
* Now insert the checkpoint record into XLOG.
*/
rdata.data = (char *) (&checkPoint);
--
1.7.12.289.g0ce9864.dirty