On 27 September 2016 at 09:23, Craig Ringer <cr...@2ndquadrant.com> wrote:
> On 21 September 2016 at 22:16, Robert Haas <robertmh...@gmail.com> wrote:
>>
>> It might not be too hard to add a second copy of oldestXid in shared
>> memory that is updated before truncation rather than afterward... but
>> yeah, like you, I'm not finding this nearly as straightforward as
>> might have been hoped.
>
> Yeah.
>
> I suspect that'll be the way to go, to add another copy that's updated
> before clog truncation. It just seems ... unclean. Like it shouldn't
> be necessary, something else isn't right. But it's probably the lowest
> pain option.

OK, so summarizing the problem:

slru.c and clog.c have no soft-failure entrypoints where we can look
it up and fail gracefully if the xid isn't in the clog.
vac_truncate_clog() calls TruncateCLOG to chop SLRU pages from the
clog before it takes XidGenLock to advance oldestXid. So we cannot use
oldestXid protected by XidGenLock as an interlock against concurrent
clog truncation to prevent SLRU access errors looking up an xid, and
there's no other candidate lock.

This hasn't been an issue before because nobody looks up arbitrary
user-supplied XIDs in clog, we only look up things that're protected
by datfrozenxid etc. But the whole point of txid_status is to look up
user-supplied xids.

We can't just take ClogControlLock from txid_status() to block
truncation because clog.c expects to own that lock, and takes it (via
slru.c) in TransactionIdGetStatus, with no way to pass an
already_locked state. We'd self-deadlock.

Adding a second copy of oldestXid  - say pendingOldestXid - won't
actually help us unless we also take some suitable LWLock before
updating it, otherwise truncation can continue after we look at
pendingOldestXid but before we do the clog lookup. That means an extra
LWLock for each clog truncation, but compared to the I/O done during
clog truncation and the cost of the SlruScanDirectory() pre-check done
by TruncateCLOG it's nothing.

We could take XidGenLock twice, once to update this new
pendingOldestXid field and once to update oldestXid, but that's a
highly contested lock I'd rather not mess with even on a path that's
not hit much.

Instead, I've added a new LWLock, ClogTruncationLock, for that
purpose. vac_truncate_clog() takes it if it decides to attempt clog
truncation. This lock is held throughout the whole process of clog
truncation and oldestXid advance, so there's no need for a new
pendingOldestXid field in ShmemVariableCache. We just take the lock
then look at oldestXid, knowing that it's guaranteed to correspond to
an existing clog page that won't go away while we're looking.
ClogTruncationLock is utterly uncontested so it's going to have no
meaningful impact compared to all the work we do scanning the clog to
decide whether we're even going to try truncating it, etc.


(BTW, it seems like a pity that lwlocknames.txt doesn't have comments
on each lock. We could have
src/backend/storage/lmgr/generate-lwlocknames.pl transform the #
comments into /* comments */ on the generated header. Thoughts?)




-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 0fcfc84226afa4901277a43ee78ec34a30322e8f Mon Sep 17 00:00:00 2001
From: Craig Ringer <cr...@2ndquadrant.com>
Date: Wed, 21 Dec 2016 15:37:29 +0800
Subject: [PATCH 1/3] Introduce txid_status(bigint) to get status of an xact

If an application loses its connection while a COMMIT request is in
flight, the backend crashes mid-commit, etc, then an application may
not be sure whether or not a commit completed successfully or was
rolled back. While two-phase commit solves this it does so at a
considerable overhead, so introduce a lighter alternative.

txid_status(bigint) lets an application determine the status of a a
commit based on an xid-with-epoch as returned by txid_current() or
similar. Status may be committed, aborted, in-progress (including
prepared xacts) or null if the xact is too old for its commit status
to still be retained because it has passed the wrap-around epoch
boundary.

Applications must call txid_current() in their transactions to make
much use of this since PostgreSQL does not automatically report an xid
to the client when one is assigned.

Introduces TransactionIdInRecentPast(...) for the use of other
functions that need similar logic in future.

There was previously no way to look up an arbitrary xid without
running the risk of having clog truncated out from under you. This
hasn't been a problem because anything looking up xids in clog knows
they're protected by datminxid, but that's not the case for arbitrary
user-supplied XIDs. clog is truncated before we advance oldestXid so
taking XidGenLock is insufficient, and there's no way to look up a
SLRU with soft-failure. So we introduce ClogTruncationLock to guard
against concurrent clog truncation.
---
 doc/src/sgml/func.sgml                   |  31 +++++++
 src/backend/access/transam/clog.c        |   2 +
 src/backend/access/transam/commit_ts.c   |   2 +
 src/backend/access/transam/multixact.c   |   1 +
 src/backend/commands/vacuum.c            |  13 +++
 src/backend/storage/lmgr/lwlocknames.txt |   1 +
 src/backend/utils/adt/txid.c             | 141 +++++++++++++++++++++++++++++++
 src/include/catalog/pg_proc.h            |   2 +
 src/include/utils/builtins.h             |   1 +
 src/test/regress/expected/txid.out       |  68 +++++++++++++++
 src/test/regress/sql/txid.sql            |  38 +++++++++
 11 files changed, 300 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 47fcb30..3123232 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17143,6 +17143,10 @@ SELECT collation for ('foo' COLLATE "de_DE");
     <primary>txid_visible_in_snapshot</primary>
    </indexterm>
 
+   <indexterm>
+    <primary>txid_status</primary>
+   </indexterm>
+
    <para>
     The functions shown in <xref linkend="functions-txid-snapshot">
     provide server transaction information in an exportable form.  The main
@@ -17193,6 +17197,11 @@ SELECT collation for ('foo' COLLATE "de_DE");
        <entry><type>boolean</type></entry>
        <entry>is transaction ID visible in snapshot? (do not use with subtransaction ids)</entry>
       </row>
+      <row>
+       <entry><literal><function>txid_status(<parameter>bigint</parameter>)</function></literal></entry>
+       <entry><type>txid_status</type></entry>
+       <entry>report the status of the given xact - <literal>committed</literal>, <literal>aborted</literal>, <literal>in progress</literal>, or NULL if the xid is too old</entry>
+      </row>
      </tbody>
     </tgroup>
    </table>
@@ -17263,6 +17272,28 @@ SELECT collation for ('foo' COLLATE "de_DE");
    </para>
 
    <para>
+    <function>txid_status(bigint)</> reports the commit status of a recent
+    transaction.  Applications may use it to determine whether a transaction
+    committed or aborted when the application and/or database server crashed or
+    lost connection while a <literal>COMMIT</literal> command was in progress.
+    The status of a transaction will be reported as one of:
+    <itemizedlist>
+     <listitem><para><literal>'in progress'</></></>
+     <listitem><para><literal>'committed'</></></>
+     <listitem><para><literal>'aborted'</></></>
+     <listitem><para><literal>NULL</> if xid too old</></>
+    </itemizedlist>
+    PostgreSQL discards the commit status transactions after no references to
+    the transaction survive in other active transactions, tables, replication
+    slots, etc. This means that the status of older transactions cannot be
+    determined.  <function>txid_status(bigint)</> returns <literal>NULL</> if a
+    transaction is too old to look up.  Prepared transactions are reported as
+    <literal>in progress</>; applications must check <link
+    linkend="view-pg-prepared-xacts"><literal>pg_prepared_xacts</></> if they
+    need to determine if the xid is a prepared transaction.
+   </para>
+
+   <para>
     The functions shown in <xref linkend="functions-commit-timestamp">
     provide information about transactions that have been already committed.
     These functions mainly provide information about when the transactions
diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 2634476..b35438e 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -644,6 +644,8 @@ TruncateCLOG(TransactionId oldestXact)
 {
 	int			cutoffPage;
 
+	Assert(LWLockHeldByMe(ClogTruncationLock));
+
 	/*
 	 * The cutoff point is the start of the segment containing oldestXact. We
 	 * pass the *page* containing oldestXact to SimpleLruTruncate.
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index a8d275f..6d71a15 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -805,6 +805,8 @@ TruncateCommitTs(TransactionId oldestXact)
 {
 	int			cutoffPage;
 
+	Assert(LWLockHeldByMe(ClogTruncationLock));
+
 	/*
 	 * The cutoff point is the start of the segment containing oldestXact. We
 	 * pass the *page* containing oldestXact to SimpleLruTruncate.
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index e9588a7..a2875e6 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -2937,6 +2937,7 @@ TruncateMultiXact(MultiXactId newOldestMulti, Oid newOldestMultiDB)
 
 	Assert(!RecoveryInProgress());
 	Assert(MultiXactState->finishedStartup);
+	Assert(LWLockHeldByMe(ClogTruncationLock));
 
 	/*
 	 * We can only allow one truncation to happen at once. Otherwise parts of
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 58bbf55..ddc486b 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1153,6 +1153,17 @@ vac_truncate_clog(TransactionId frozenXID,
 		return;
 
 	/*
+	 * Because we advance oldestXid only after truncating the clog, holding
+	 * XidGenLock is insufficient to be sure that a given xid will still exist
+	 * in clog when we go to look up its state. Code that may look up arbitrary
+	 * and possibly old xids should take ClogTruncationLock in shared mode to
+	 * guard against concurrent clog truncation. If such code also takes
+	 * XidGenLock it must be taken after ClogTruncationLog to guard against
+	 * deadlocks.
+	 */
+	LWLockAcquire(ClogTruncationLock, LW_EXCLUSIVE);
+
+	/*
 	 * Truncate CLOG, multixact and CommitTs to the oldest computed value.
 	 */
 	TruncateCLOG(frozenXID);
@@ -1168,6 +1179,8 @@ vac_truncate_clog(TransactionId frozenXID,
 	SetTransactionIdLimit(frozenXID, oldestxid_datoid);
 	SetMultiXactIdLimit(minMulti, minmulti_datoid);
 	AdvanceOldestCommitTsXid(frozenXID);
+
+	LWLockRelease(ClogTruncationLock);
 }
 
 
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index f8996cd..7aebeeb 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -47,3 +47,4 @@ CommitTsLock						39
 ReplicationOriginLock				40
 MultiXactTruncationLock				41
 OldSnapshotTimeMapLock				42
+ClogTruncationLock					43
diff --git a/src/backend/utils/adt/txid.c b/src/backend/utils/adt/txid.c
index 276075e..7af0abb 100644
--- a/src/backend/utils/adt/txid.c
+++ b/src/backend/utils/adt/txid.c
@@ -21,6 +21,7 @@
 
 #include "postgres.h"
 
+#include "access/clog.h"
 #include "access/transam.h"
 #include "access/xact.h"
 #include "access/xlog.h"
@@ -28,6 +29,7 @@
 #include "miscadmin.h"
 #include "libpq/pqformat.h"
 #include "postmaster/postmaster.h"
+#include "storage/lwlock.h"
 #include "utils/builtins.h"
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
@@ -93,6 +95,77 @@ load_xid_epoch(TxidEpoch *state)
 }
 
 /*
+ * Helper to get a TransactionId from a 64-bit txid with wraparound
+ * detection.
+ *
+ * ERRORs if the txid is in the future. Returns true if the xid is
+ * within the xid wraparound threshold and clog truncation threshold
+ * or if the xid is a permanent xid.  Returns false if the xid is past
+ * the clog truncation threshold or wraparound threshold. Sets
+ * extracted_xid to the 32-bit xid.
+ *
+ * It's only safe to use the extracted_xid for most purposes if the
+ * function returns true, otherwise the clog could be truncated away
+ * or it might be an xid from a prior epoch.
+ *
+ * XIDs older than ShmemVariableCache->oldestXid are treated as too
+ * old to look up because the clog could've been truncated away - even
+ * if they're still far from the xid wraparound theshold. The caller
+ * should have at least a share lock on XidGenLock to prevent
+ * oldestXid from advancing between our oldestXid check and subsequent
+ * lookups of transaction status using the returned xid. Failure to do
+ * so risks ERRORs on clog access but nothing worse.
+ */
+static bool
+TransactionIdInRecentPast(uint64 xid_with_epoch, TransactionId *extracted_xid)
+{
+	uint32			xid_epoch = (uint32) (xid_with_epoch >> 32);
+	TransactionId	xid = (TransactionId) xid_with_epoch;
+	uint32			now_epoch;
+	TransactionId		now_epoch_last_xid;
+	bool result;
+
+	Assert(LWLockHeldByMe(ClogTruncationLock));
+
+	GetNextXidAndEpoch(&now_epoch_last_xid, &now_epoch);
+
+	result = true;
+
+	if (!TransactionIdIsNormal(xid))
+	{
+		/* must be a permanent XID, ignore the epoch and return unchanged */
+	}
+	else if (xid_epoch > now_epoch
+			 || (xid_epoch == now_epoch && xid > now_epoch_last_xid))
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("transaction ID "UINT64_FORMAT" is in the future",
+					xid_with_epoch)));
+	}
+	else if (xid_epoch + 1 < now_epoch
+			 || (xid_epoch + 1 == now_epoch && xid < now_epoch_last_xid))
+	{
+		/* xid is wrapped, too far in the past */
+		result = false;
+	}
+	else if (TransactionIdPrecedes(xid, ShmemVariableCache->oldestXid))
+	{
+		/* xid isn't wrapped, but clog could've been truncated away */
+		result = false;
+	}
+	else
+	{
+		Assert(TransactionIdPrecedesOrEquals(xid, now_epoch_last_xid));
+	}
+
+	if (extracted_xid != NULL)
+		*extracted_xid = xid;
+
+	return result;
+}
+
+/*
  * do a TransactionId -> txid conversion for an XID near the given epoch
  */
 static txid
@@ -354,6 +427,9 @@ bad_format:
  *
  *	Return the current toplevel transaction ID as TXID
  *	If the current transaction does not have one, one is assigned.
+ *
+ *	This value has the epoch as the high 32 bits and the 32-bit xid
+ *	as the low 32 bits.
  */
 Datum
 txid_current(PG_FUNCTION_ARGS)
@@ -658,3 +734,68 @@ txid_snapshot_xip(PG_FUNCTION_ARGS)
 		SRF_RETURN_DONE(fctx);
 	}
 }
+
+/*
+ * Report the status of a recent transaction ID, or null for wrapped,
+ * truncated away or otherwise too old XIDs.
+ */
+Datum
+txid_status(PG_FUNCTION_ARGS)
+{
+	const char	   *status;
+	uint64			xid_with_epoch = PG_GETARG_INT64(0);
+	TransactionId	xid;
+
+	/*
+	 * We use oldestXid to determine whether the xid we're examining still has
+	 * commit information retained. We must ensure the clog for the XIDs we're
+	 * examining doesn't get truncated away while we're looking, otherwise we'll
+	 * fail with confusing SLRU access errors. See vac_truncate_clog(..).
+	 */
+	LWLockAcquire(ClogTruncationLock, LW_SHARED);
+	/*
+	 * We should also hold XidGenLock to prevent oldestXid advancing, which can
+	 * happen whether or not actual clog truncation is attempted. We don't want
+	 * to try to look up state for an xid that becomes in-the-future after
+	 * we check if it's in the recent past.
+	 */
+	LWLockAcquire(XidGenLock, LW_SHARED);
+	if (TransactionIdInRecentPast(xid_with_epoch, &xid))
+	{
+		if (!TransactionIdIsValid(xid))
+		{
+			LWLockRelease(XidGenLock);
+			LWLockRelease(ClogTruncationLock);
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("transaction ID "UINT64_FORMAT" is an invalid xid",
+							xid_with_epoch)));
+		}
+
+		if (TransactionIdIsCurrentTransactionId(xid))
+			status = gettext_noop("in progress");
+		else if (TransactionIdDidCommit(xid))
+			status = gettext_noop("committed");
+		else if (TransactionIdDidAbort(xid))
+			status = gettext_noop("aborted");
+		else
+			/*
+			 * can't test TransactionIdIsInProgress here or we race
+			 * with concurrent commit/abort. There's no point anyway,
+			 * since it might then commit/abort just after we check.
+			 */
+			status = gettext_noop("in progress");
+	}
+	else
+	{
+		status = NULL;
+	}
+
+	LWLockRelease(XidGenLock);
+	LWLockRelease(ClogTruncationLock);
+
+	if (status == NULL)
+		PG_RETURN_NULL();
+	else
+		PG_RETURN_TEXT_P(cstring_to_text(status));
+}
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index e2d08ba..0ad870c 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -4928,6 +4928,8 @@ DATA(insert OID = 2947 (  txid_snapshot_xip			PGNSP PGUID 12 1 50 0 0 f f f f t
 DESCR("get set of in-progress txids in snapshot");
 DATA(insert OID = 2948 (  txid_visible_in_snapshot	PGNSP PGUID 12 1  0 0 0 f f f f t f i s 2 0 16 "20 2970" _null_ _null_ _null_ _null_ _null_ txid_visible_in_snapshot _null_ _null_ _null_ ));
 DESCR("is txid visible in snapshot?");
+DATA(insert OID = 3346 (  txid_status				PGNSP PGUID 12 1  0 0 0 f f f f t f v s 1 0 25 "20" _null_ _null_ _null_ _null_ _null_ txid_status _null_ _null_ _null_ ));
+DESCR("commit status of transaction");
 
 /* record comparison using normal comparison rules */
 DATA(insert OID = 2981 (  record_eq		   PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 16 "2249 2249" _null_ _null_ _null_ _null_ _null_ record_eq _null_ _null_ _null_ ));
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 2ae212a..baffa38 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -1227,6 +1227,7 @@ extern Datum txid_snapshot_xmin(PG_FUNCTION_ARGS);
 extern Datum txid_snapshot_xmax(PG_FUNCTION_ARGS);
 extern Datum txid_snapshot_xip(PG_FUNCTION_ARGS);
 extern Datum txid_visible_in_snapshot(PG_FUNCTION_ARGS);
+extern Datum txid_status(PG_FUNCTION_ARGS);
 
 /* uuid.c */
 extern Datum uuid_in(PG_FUNCTION_ARGS);
diff --git a/src/test/regress/expected/txid.out b/src/test/regress/expected/txid.out
index 802ccb9..015dae3 100644
--- a/src/test/regress/expected/txid.out
+++ b/src/test/regress/expected/txid.out
@@ -254,3 +254,71 @@ SELECT txid_current_if_assigned() IS NOT DISTINCT FROM BIGINT :'txid_current';
 (1 row)
 
 COMMIT;
+-- test xid status functions
+BEGIN;
+SELECT txid_current() AS committed \gset
+COMMIT;
+BEGIN;
+SELECT txid_current() AS rolledback \gset
+ROLLBACK;
+BEGIN;
+SELECT txid_current() AS inprogress \gset
+SELECT txid_status(:committed) AS committed;
+ committed 
+-----------
+ committed
+(1 row)
+
+SELECT txid_status(:rolledback) AS rolledback;
+ rolledback 
+------------
+ aborted
+(1 row)
+
+SELECT txid_status(:inprogress) AS inprogress;
+ inprogress  
+-------------
+ in progress
+(1 row)
+
+SELECT txid_status(1); -- BootstrapTransactionId is always committed
+ txid_status 
+-------------
+ committed
+(1 row)
+
+SELECT txid_status(2); -- FrozenTransactionId is always committed
+ txid_status 
+-------------
+ committed
+(1 row)
+
+SELECT txid_status(3); -- in regress testing FirstNormalTransactionId will always be behind oldestXmin
+ txid_status 
+-------------
+ 
+(1 row)
+
+COMMIT;
+BEGIN;
+CREATE FUNCTION test_future_xid_status(bigint)
+RETURNS void
+LANGUAGE plpgsql
+AS
+$$
+BEGIN
+  PERFORM txid_status($1);
+  RAISE EXCEPTION 'didn''t ERROR at xid in the future as expected';
+EXCEPTION
+  WHEN invalid_parameter_value THEN
+    RAISE NOTICE 'Got expected error for xid in the future';
+END;
+$$;
+SELECT test_future_xid_status(:inprogress + 10000);
+NOTICE:  Got expected error for xid in the future
+ test_future_xid_status 
+------------------------
+ 
+(1 row)
+
+ROLLBACK;
diff --git a/src/test/regress/sql/txid.sql b/src/test/regress/sql/txid.sql
index 4aefd9e..bd6decf 100644
--- a/src/test/regress/sql/txid.sql
+++ b/src/test/regress/sql/txid.sql
@@ -59,3 +59,41 @@ SELECT txid_current_if_assigned() IS NULL;
 SELECT txid_current() \gset
 SELECT txid_current_if_assigned() IS NOT DISTINCT FROM BIGINT :'txid_current';
 COMMIT;
+
+-- test xid status functions
+BEGIN;
+SELECT txid_current() AS committed \gset
+COMMIT;
+
+BEGIN;
+SELECT txid_current() AS rolledback \gset
+ROLLBACK;
+
+BEGIN;
+SELECT txid_current() AS inprogress \gset
+
+SELECT txid_status(:committed) AS committed;
+SELECT txid_status(:rolledback) AS rolledback;
+SELECT txid_status(:inprogress) AS inprogress;
+SELECT txid_status(1); -- BootstrapTransactionId is always committed
+SELECT txid_status(2); -- FrozenTransactionId is always committed
+SELECT txid_status(3); -- in regress testing FirstNormalTransactionId will always be behind oldestXmin
+
+COMMIT;
+
+BEGIN;
+CREATE FUNCTION test_future_xid_status(bigint)
+RETURNS void
+LANGUAGE plpgsql
+AS
+$$
+BEGIN
+  PERFORM txid_status($1);
+  RAISE EXCEPTION 'didn''t ERROR at xid in the future as expected';
+EXCEPTION
+  WHEN invalid_parameter_value THEN
+    RAISE NOTICE 'Got expected error for xid in the future';
+END;
+$$;
+SELECT test_future_xid_status(:inprogress + 10000);
+ROLLBACK;
-- 
2.5.5

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to