On 2013-04-02 12:10:12 +0200, Andres Freund wrote:
> On 2013-04-01 08:49:16 +0100, Simon Riggs wrote:
> > On 30 March 2013 17:21, Andres Freund <[email protected]> wrote:
> >
> > > So if the xid is later than latestObservedXid we extend subtrans one by
> > > one. So far so good. But we initialize it in
> > > ProcArrayApplyRecoveryInfo() when consistency is initially reached:
> > > latestObservedXid = running->nextXid;
> > > TransactionIdRetreat(latestObservedXid);
> > > Before that subtrans has initially been started up with:
> > > if (wasShutdown)
> > > oldestActiveXID =
> > > PrescanPreparedTransactions(&xids, &nxids);
> > > else
> > > oldestActiveXID =
> > > checkPoint.oldestActiveXid;
> > > ...
> > > StartupSUBTRANS(oldestActiveXID);
> > >
> > > That means its only initialized up to checkPoint.oldestActiveXid. As it
> > > can take some time till we reach consistency it seems rather plausible
> > > that there now will be a gap in initilized pages. From
> > > checkPoint.oldestActiveXid to running->nextXid if there are pages
> > > inbetween.
> >
> > That was an old bug.
> >
> > StartupSUBTRANS() now explicitly fills that gap. Are you saying it
> > does that incorrectly? How?
>
> Well, no. I think StartupSUBTRANS does this correctly, but there's a gap
> between the call to Startup* and the first call to ExtendSUBTRANS. The
> latter is only called *after* we reached STANDBY_INITIALIZED via
> ProcArrayApplyRecoveryInfo(). The problem is that we StartupSUBTRANS to
> checkPoint.oldestActiveXid while we start to ExtendSUBTRANS from
> running->nextXid - 1. There very well can be a gap inbetween.
> The window isn't terribly big but if you use subtransactions as heavily
> as Sergey seems to be it doesn't seem unlikely to hit it.
>
> Let me come up with a testcase and patch.
Developing a testcase was trivial, pgbench running the following function:
CREATE OR REPLACE FUNCTION recurse_and_assign_txid(level bigint DEFAULT 0)
RETURNS bigint
LANGUAGE plpgsql AS $b$
BEGIN
IF level < 500 THEN
RETURN recurse_and_assign_txid(level + 1);
ELSE
-- assign xid in subtxn and parents
CREATE TEMPORARY TABLE foo();
DROP TABLE foo;
RETURN txid_current()::bigint;
END IF;
EXCEPTION WHEN others THEN
RAISE NOTICE 'unexpected';
END
$b$;
When now restarting a standby (so it restarts from another checkpoint) it
frequently crashed with various errors:
* pg_subtrans/xxx does not exist
* (warning) pg_subtrans page does not exist, assuming zero
* xid overwritten in SubTransSetParent
So I think my theory is correct.
The attached patch fixes this although I don't like the way it knowledge of the
point up to which StartupSUBTRANS zeroes pages is handled.
Makes sense?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
>From cbb4e4d30ba1df5a3d370226545efe4966719867 Mon Sep 17 00:00:00 2001
From: Andres Freund <[email protected]>
Date: Tue, 2 Apr 2013 20:20:00 +0200
Subject: [PATCH] Ensure that SUBTRANS is initalized gaplessly when starting
up HS
---
src/backend/access/transam/xlog.c | 3 ++
src/backend/storage/ipc/procarray.c | 53 +++++++++++++++++++++++++++++++----
src/include/storage/procarray.h | 1 +
3 files changed, 51 insertions(+), 6 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 07c68ad..4a9462e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5391,6 +5391,9 @@ StartupXLOG(void)
oldestActiveXID = checkPoint.oldestActiveXid;
Assert(TransactionIdIsValid(oldestActiveXID));
+ /* Tell procarray about the range of xids it has to deal with */
+ ProcArrayInitRecovery(ShmemVariableCache->nextXid);
+
/*
* Startup commit log and subtrans only. Other SLRUs are not
* maintained during recovery and need not be started yet.
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 4308128..53a5aa2 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -470,6 +470,28 @@ ProcArrayClearTransaction(PGPROC *proc)
}
/*
+ * ProcArrayInitRecovery -- initialize recovery xid mgmt environment
+ *
+ * Remember up to where the startup process initialized the CLOG and subtrans
+ * so we can ensure its initialized gaplessly up to the point where necessary
+ * while in recovery.
+ */
+void
+ProcArrayInitRecovery(TransactionId initializedUptoXID)
+{
+ Assert(standbyState == STANDBY_INITIALIZED);
+ Assert(TransactionIdIsNormal(initializedUptoXID));
+
+ /*
+ * we set latestObservedXid to the xid SUBTRANS has been initialized upto
+ * so we can extend it from that point onwards when we reach a consistent
+ * state in ProcArrayApplyRecoveryInfo().
+ */
+ latestObservedXid = initializedUptoXID;
+ TransactionIdRetreat(latestObservedXid);
+}
+
+/*
* ProcArrayApplyRecoveryInfo -- apply recovery info about xids
*
* Takes us through 3 states: Initialized, Pending and Ready.
@@ -564,7 +586,10 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
Assert(standbyState == STANDBY_INITIALIZED);
/*
- * OK, we need to initialise from the RunningTransactionsData record
+ * OK, we need to initialise from the RunningTransactionsData record.
+ *
+ * NB: this can be reached at least twice, so make sure new code can deal
+ * with that.
*/
/*
@@ -636,20 +661,32 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
pfree(xids);
/*
+ * latestObservedXid is set to the the point where SUBTRANS was started up
+ * to, initialize subtrans from thereon, up to nextXid - 1.
+ */
+ Assert(TransactionIdIsNormal(latestObservedXid));
+ while (TransactionIdPrecedes(latestObservedXid, running->nextXid))
+ {
+ ExtendCLOG(latestObservedXid);
+ ExtendSUBTRANS(latestObservedXid);
+
+ TransactionIdAdvance(latestObservedXid);
+ }
+
+ /* ----------
* Now we've got the running xids we need to set the global values that
* are used to track snapshots as they evolve further.
*
- * - latestCompletedXid which will be the xmax for snapshots -
- * lastOverflowedXid which shows whether snapshots overflow - nextXid
+ * - latestCompletedXid which will be the xmax for snapshots
+ * - lastOverflowedXid which shows whether snapshots overflow
+ * - nextXid
*
* If the snapshot overflowed, then we still initialise with what we know,
* but the recovery snapshot isn't fully valid yet because we know there
* are some subxids missing. We don't know the specific subxids that are
* missing, so conservatively assume the last one is latestObservedXid.
+ * ----------
*/
- latestObservedXid = running->nextXid;
- TransactionIdRetreat(latestObservedXid);
-
if (running->subxid_overflow)
{
standbyState = STANDBY_SNAPSHOT_PENDING;
@@ -719,6 +756,10 @@ ProcArrayApplyXidAssignment(TransactionId topxid,
Assert(standbyState >= STANDBY_INITIALIZED);
+ /* can't do anything useful unless we have more state setup */
+ if (standbyState == STANDBY_INITIALIZED)
+ return;
+
max_xid = TransactionIdLatest(topxid, nsubxids, subxids);
/*
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index d5fdfea..c5f58b4 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -26,6 +26,7 @@ extern void ProcArrayRemove(PGPROC *proc, TransactionId latestXid);
extern void ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid);
extern void ProcArrayClearTransaction(PGPROC *proc);
+extern void ProcArrayInitRecovery(TransactionId initializedUptoXID);
extern void ProcArrayApplyRecoveryInfo(RunningTransactions running);
extern void ProcArrayApplyXidAssignment(TransactionId topxid,
int nsubxids, TransactionId *subxids);
--
1.7.10.4
--
Sent via pgsql-bugs mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs