On 19/08/2025 03:14, Noah Misch wrote:
On Fri, Aug 15, 2025 at 02:12:03PM +0300, Heikki Linnakangas wrote:
How about always using a fresh snapshot instead? Instead of pushing the
historic snapshot as the active snapshot, _disable_ the historic snapshot
and use GetTransactionSnapshot() to acquire a regular snapshot?

I see advantages in using the historic snapshot:

1. It's the longstanding behavior, and applications aren't complaining.

2. If someone wants "fresh snapshot", they can do that today with a C
    extension that provides an execute_at_fresh_snapshot(sql text) SQL
    function.  If we adopt the fresh snapshot in pglogical or in core, I don't
    see a comparably-clean way for the application code to get back to the
    historic snapshot.  (That's because the historic snapshot lives only in
    stack variables at the moment in question.)

3. If an application is relying on the longstanding behavior and needs to
    adapt to the proposed "fresh snapshot" behavior, that may be invasive to
    implement and harmful to performance.  For example, instead of reading from
    a user_catalog_table inside the filter, the application may need to
    duplicate that table's data into the rows being filtered.

Oh, I had not considered user_catalog_tables. I didn't remember that's a thing.

The docs on user_catalog_table says:

Note that access to user catalog tables or regular system catalog tables in the output plugins has to be done via the systable_* scan APIs only. Access via the heap_* scan APIs will error out.

That doesn't quite say that you should be able to run arbitrary queries on a user_catalog_table. In fact it suggests that you can't, because surely you're not using the systable_* scan APIs when running arbitrary queries.

That said, I agree it would be nice if we can keep it working.

Does the "fresh snapshot" alternative bring strengths to outweigh those?

The argument for the fresh snapshot is that using a historic snapshot only makes sense for catalog tables, and by taking a fresh snapshot, we avoid the mistake of using the historic snapshot for anything else. I thought that there's practically no valid use case for using a historic snapshot in anything that calls GetTransactionSnapshot(), and if it happens to work, it's only because the snapshot isn't actually used for anything or is only used to read data that hasn't changed so that you get away with it.

I agree that reading a table marked as user_catalog_table is valid case, however, so I take back that argument.

How about the attached, then? It reverts the GetTransactionSnapshot() change. But to still catch at least some of the invalid uses of the historic snapshot, it adds checks to heap_beginscan() and index_beginscan(), to complain if they are called on a non-catalog relation with a historic snapshot.

- Heikki
From 610c44f27850cc227c8907e548baee3e709b9fee Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Tue, 19 Aug 2025 23:21:58 +0300
Subject: [PATCH 1/1] Revert GetTransactionSnapshot() to return historic
 snapshot during LR

Commit 1585ff7387 changed GetTransactionSnapshot() to throw an error
if it's called during logical decoding, instead of returning the
historic snapshot. I made that change for extra protection, because a
historic snapshot can only be used to access catalog tables while
GetTransactionSnapshot() is usually called when you're executing
arbitrary queries. You might get very subtle visibility problems if
you tried to use the historic snapshot for arbitrary queries.

There's no built-in code in PostgreSQL that calls
GetTransactionSnapshot() during logical decoding, but it turns out
that the pglogical extension does just that, to evaluate row filter
expressions. You would get weird results if the row filter runs
arbitrary queries, but it is sane as long as you don't access any
non-catalog tables. Even though there are no checks to enforce that in
pglogical, a typical row filter expression does not access any tables
and works fine. Accessing tables marked with the user_catalog_table =
true option is also OK.

To fix pglogical with row filters, and any other extensions that might
do similar things, revert GetTransactionSnapshot() to return a
historic snapshot during logical decoding.

To try to still catch the unsafe usage of historic snapshots, add
checks in heap_beginscan() and index_beginscan() to complain if you
try to use a historic snapshot to scan a non-catalog table.

Backpatch-through: 18
Reported-by: Noah Misch
Discussion: https://www.postgresql.org/message-id/20250809222338.cc.nmisch%40google.com
---
 src/backend/access/heap/heapam.c   |  9 +++++++++
 src/backend/access/index/indexam.c |  8 ++++++++
 src/backend/utils/time/snapmgr.c   | 19 +++++++++++++++----
 src/include/utils/snapmgr.h        |  3 +++
 4 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 0dcd6ee817e..ee692c03c3c 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1143,6 +1143,15 @@ heap_beginscan(Relation relation, Snapshot snapshot,
 	if (!(snapshot && IsMVCCSnapshot(snapshot)))
 		scan->rs_base.rs_flags &= ~SO_ALLOW_PAGEMODE;
 
+	/* Check that a historic snapshot is not used for non-catalog tables */
+	if (snapshot &&
+		IsHistoricMVCCSnapshot(snapshot) &&
+		!RelationIsAccessibleInLogicalDecoding(relation))
+	{
+		elog(ERROR, "cannot query non-catalog table \"%s\" during logical decoding",
+			 RelationGetRelationName(relation));
+	}
+
 	/*
 	 * For seqscan and sample scans in a serializable transaction, acquire a
 	 * predicate lock on the entire relation. This is required not only to
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 219df1971da..a1ab4156d27 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -263,6 +263,14 @@ index_beginscan(Relation heapRelation,
 
 	Assert(snapshot != InvalidSnapshot);
 
+	/* Check that a historic snapshot is not used for non-catalog tables */
+	if (IsHistoricMVCCSnapshot(snapshot) &&
+		!RelationIsAccessibleInLogicalDecoding(heapRelation))
+	{
+		elog(ERROR, "cannot query non-catalog table \"%s\" during logical decoding",
+			 RelationGetRelationName(heapRelation));
+	}
+
 	scan = index_beginscan_internal(indexRelation, nkeys, norderbys, snapshot, NULL, false);
 
 	/*
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index ea35f30f494..65561cc6bc3 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -271,12 +271,23 @@ Snapshot
 GetTransactionSnapshot(void)
 {
 	/*
-	 * This should not be called while doing logical decoding.  Historic
-	 * snapshots are only usable for catalog access, not for general-purpose
-	 * queries.
+	 * Return historic snapshot if doing logical decoding.
+	 *
+	 * Historic snapshots are only usable for catalog access, not for
+	 * general-purpose queries.  The caller is responsible for ensuring that
+	 * the snapshot is used correctly! (PostgreSQL code never calls this
+	 * during logical decoding, but extensions can do it.)
 	 */
 	if (HistoricSnapshotActive())
-		elog(ERROR, "cannot take query snapshot during logical decoding");
+	{
+		/*
+		 * We'll never need a non-historic transaction snapshot in this
+		 * (sub-)transaction, so there's no need to be careful to set one up
+		 * for later calls to GetTransactionSnapshot().
+		 */
+		Assert(!FirstSnapshotSet);
+		return HistoricSnapshot;
+	}
 
 	/* First call in transaction? */
 	if (!FirstSnapshotSet)
diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h
index d346be71642..604c1f90216 100644
--- a/src/include/utils/snapmgr.h
+++ b/src/include/utils/snapmgr.h
@@ -56,6 +56,9 @@ extern PGDLLIMPORT SnapshotData SnapshotToastData;
 	((snapshot)->snapshot_type == SNAPSHOT_MVCC || \
 	 (snapshot)->snapshot_type == SNAPSHOT_HISTORIC_MVCC)
 
+#define IsHistoricMVCCSnapshot(snapshot)  \
+	((snapshot)->snapshot_type == SNAPSHOT_HISTORIC_MVCC)
+
 extern Snapshot GetTransactionSnapshot(void);
 extern Snapshot GetLatestSnapshot(void);
 extern void SnapshotSetCommandId(CommandId curcid);
-- 
2.39.5

Reply via email to