Hi,
Robert, Tom, it'd be great if you could look through this thread. I
think there's a problem here (and it has gotten worse after the
introduction of catalog snapshots). Both of you at least dabbled in
related code.
On 2020-02-29 12:17:07 -0800, Andres Freund wrote:
> On 2020-02-28 22:10:52 -0800, Andres Freund wrote:
> > So, um. What happens is that doDeletion() does a catalog scan, which
> > sets a snapshot. The results of that catalog scan are then used to
> > perform modifications. But at that point there's no guarantee that we
> > still hold *any* snapshot, as e.g. invalidations can trigger the catalog
> > snapshot being released.
> >
> > I don't see how that's safe. Without ->xmin preventing that,
> > intermediate row versions that we did look up could just get vacuumed
> > away, and replaced with a different row. That does seem like a serious
> > issue?
> >
> > I think there's likely a lot of places that can hit this? What makes it
> > safe for InvalidateCatalogSnapshot()->SnapshotResetXmin() to release
> > ->xmin when there previously has been *any* catalog access? Because in
> > contrast to normal table modifications, there's currently nothing at all
> > forcing us to hold a snapshot between catalog lookups an their
> > modifications?
> >
> > Am I missing something? Or is this a fairly significant hole in our
> > arrangements?
>
> I still think that's true. In a first iteration I hacked around the
> problem by explicitly registering a catalog snapshot in
> RemoveTempRelations(). That *sometimes* allows to get through the
> regression tests without the assertions triggering.
The attached two patches (they're not meant to be applied) reliably get
through the regression tests. But I suspect I'd have to at least do a
CLOBBER_CACHE_ALWAYS run to find all the actually vulnerable places.
> But I don't think that's good enough (even if we fixed the other
> potential crashes similarly). The only reason that avoids the asserts is
> because in nearly all other cases there's also a user snapshot that's
> pushed. But that pushed snapshot can have an xmin that's newer than the
> catalog snapshot, which means we're still in danger of tids from catalog
> scans being outdated.
>
> My preliminary conclusion is that it's simply not safe to do
> SnapshotResetXmin() from within InvalidateCatalogSnapshot(),
> PopActiveSnapshot(), UnregisterSnapshotFromOwner() etc. Instead we need
> to defer the SnapshotResetXmin() call until at least
> CommitTransactionCommand()? Outside of that there ought (with exception
> of multi-transaction commands, but they have to be careful anyway) to be
> no "in progress" sequences of related catalog lookups/modifications.
>
> Alternatively we could ensure that all catalog lookup/mod sequences
> ensure that the first catalog snapshot is registered. But that seems
> like a gargantuan task?
I also just noticed comments of this style in a few places
* Start a transaction so we can access pg_database, and get a snapshot.
* We don't have a use for the snapshot itself, but we're interested in
* the secondary effect that it sets RecentGlobalXmin. (This is
critical
* for anything that reads heap pages, because HOT may decide to prune
* them even if the process doesn't attempt to modify any tuples.)
followed by code like
StartTransactionCommand();
(void) GetTransactionSnapshot();
rel = table_open(DatabaseRelationId, AccessShareLock);
scan = table_beginscan_catalog(rel, 0, NULL);
which is not safe at all, unfortunately. The snapshot is not
pushed/active, therefore invalidations processed e.g. as part of the
table_open() could execute a InvalidateCatalogSnapshot(), which in turn
would remove the catalog snapshot from the pairing heap and
SnapshotResetXmin(). And poof, the backend's xmin is gone.
Greetings,
Andres Freund
>From 3b58990c088936122f38d855a5a3900602deacf7 Mon Sep 17 00:00:00 2001
From: Andres Freund <[email protected]>
Date: Mon, 6 Apr 2020 21:28:55 -0700
Subject: [PATCH 1/2] TMP: work around missing snapshot registrations.
This is just what's hit by the tests. It's not an actual fix.
---
src/backend/catalog/namespace.c | 7 +++++++
src/backend/catalog/pg_subscription.c | 4 ++++
src/backend/commands/indexcmds.c | 9 +++++++++
src/backend/commands/tablecmds.c | 8 ++++++++
src/backend/replication/logical/tablesync.c | 12 ++++++++++++
src/backend/replication/logical/worker.c | 4 ++++
src/backend/utils/time/snapmgr.c | 4 ++++
7 files changed, 48 insertions(+)
diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index 2ec23016fe5..e4696d8d417 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -55,6 +55,7 @@
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/varlena.h"
@@ -4244,12 +4245,18 @@ RemoveTempRelationsCallback(int code, Datum arg)
{
if (OidIsValid(myTempNamespace)) /* should always be true */
{
+ Snapshot snap;
+
/* Need to ensure we have a usable transaction. */
AbortOutOfAnyTransaction();
StartTransactionCommand();
+ /* ensure xmin stays set */
+ snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
+
RemoveTempRelations(myTempNamespace);
+ UnregisterSnapshot(snap);
CommitTransactionCommand();
}
}
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index cb157311154..4a324dfb4f1 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -31,6 +31,7 @@
#include "utils/fmgroids.h"
#include "utils/pg_lsn.h"
#include "utils/rel.h"
+#include "utils/snapmgr.h"
#include "utils/syscache.h"
static List *textarray_to_stringlist(ArrayType *textarray);
@@ -286,6 +287,7 @@ UpdateSubscriptionRelState(Oid subid, Oid relid, char state,
bool nulls[Natts_pg_subscription_rel];
Datum values[Natts_pg_subscription_rel];
bool replaces[Natts_pg_subscription_rel];
+ Snapshot snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
LockSharedObject(SubscriptionRelationId, subid, 0, AccessShareLock);
@@ -321,6 +323,8 @@ UpdateSubscriptionRelState(Oid subid, Oid relid, char state,
/* Cleanup. */
table_close(rel, NoLock);
+
+ UnregisterSnapshot(snap);
}
/*
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 2baca12c5f4..094bf6139f0 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -2837,6 +2837,7 @@ ReindexRelationConcurrently(Oid relationOid, int options)
char *relationName = NULL;
char *relationNamespace = NULL;
PGRUsage ru0;
+ Snapshot snap;
/*
* Create a memory context that will survive forced transaction commits we
@@ -3306,6 +3307,7 @@ ReindexRelationConcurrently(Oid relationOid, int options)
*/
StartTransactionCommand();
+ snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
forboth(lc, indexIds, lc2, newIndexIds)
{
@@ -3354,8 +3356,11 @@ ReindexRelationConcurrently(Oid relationOid, int options)
}
/* Commit this transaction and make index swaps visible */
+ UnregisterSnapshot(snap);
CommitTransactionCommand();
+
StartTransactionCommand();
+ snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
/*
* Phase 5 of REINDEX CONCURRENTLY
@@ -3386,7 +3391,9 @@ ReindexRelationConcurrently(Oid relationOid, int options)
}
/* Commit this transaction to make the updates visible. */
+ UnregisterSnapshot(snap);
CommitTransactionCommand();
+
StartTransactionCommand();
/*
@@ -3400,6 +3407,7 @@ ReindexRelationConcurrently(Oid relationOid, int options)
WaitForLockersMultiple(lockTags, AccessExclusiveLock, true);
PushActiveSnapshot(GetTransactionSnapshot());
+ snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
{
ObjectAddresses *objects = new_object_addresses();
@@ -3425,6 +3433,7 @@ ReindexRelationConcurrently(Oid relationOid, int options)
}
PopActiveSnapshot();
+ UnregisterSnapshot(snap);
CommitTransactionCommand();
/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 6162fb018c7..e1eacc6a4a6 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15200,6 +15200,7 @@ PreCommit_on_commit_actions(void)
ListCell *l;
List *oids_to_truncate = NIL;
List *oids_to_drop = NIL;
+ Snapshot snap;
foreach(l, on_commits)
{
@@ -15231,6 +15232,11 @@ PreCommit_on_commit_actions(void)
}
}
+ if (oids_to_truncate == NIL && oids_to_drop == NIL)
+ return;
+
+ snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
+
/*
* Truncate relations before dropping so that all dependencies between
* relations are removed after they are worked on. Doing it like this
@@ -15284,6 +15290,8 @@ PreCommit_on_commit_actions(void)
}
#endif
}
+
+ UnregisterSnapshot(snap);
}
/*
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index c27d9705895..aec5a044790 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -863,6 +863,7 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
{
Relation rel;
WalRcvExecResult *res;
+ Snapshot snap;
SpinLockAcquire(&MyLogicalRepWorker->relmutex);
MyLogicalRepWorker->relstate = SUBREL_STATE_DATASYNC;
@@ -871,10 +872,14 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
/* Update the state and make it visible to others. */
StartTransactionCommand();
+ snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
+
UpdateSubscriptionRelState(MyLogicalRepWorker->subid,
MyLogicalRepWorker->relid,
MyLogicalRepWorker->relstate,
MyLogicalRepWorker->relstate_lsn);
+
+ UnregisterSnapshot(snap);
CommitTransactionCommand();
pgstat_report_stat(false);
@@ -918,6 +923,7 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
CRS_USE_SNAPSHOT, origin_startpos);
PushActiveSnapshot(GetTransactionSnapshot());
+ snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
copy_table(rel);
PopActiveSnapshot();
@@ -933,6 +939,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
/* Make the copy visible. */
CommandCounterIncrement();
+ UnregisterSnapshot(snap);
+
/*
* We are done with the initial data synchronization, update
* the state.
@@ -957,6 +965,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
*/
if (*origin_startpos >= MyLogicalRepWorker->relstate_lsn)
{
+ snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
+
/*
* Update the new state in catalog. No need to bother
* with the shmem state as we are exiting for good.
@@ -965,6 +975,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
MyLogicalRepWorker->relid,
SUBREL_STATE_SYNCDONE,
*origin_startpos);
+ UnregisterSnapshot(snap);
+
finish_sync_worker();
}
break;
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index a752a1224d6..f10f3f843d1 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -1245,6 +1245,9 @@ apply_handle_truncate(StringInfo s)
ensure_transaction();
+ /* catalog modifications need a set snapshot */
+ PushActiveSnapshot(GetTransactionSnapshot());
+
remote_relids = logicalrep_read_truncate(s, &cascade, &restart_seqs);
foreach(lc, remote_relids)
@@ -1332,6 +1335,7 @@ apply_handle_truncate(StringInfo s)
}
CommandCounterIncrement();
+ PopActiveSnapshot();
}
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 1c063c592ce..b5cff157bf6 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -441,6 +441,8 @@ GetOldestSnapshot(void)
Snapshot
GetCatalogSnapshot(Oid relid)
{
+ Assert(IsTransactionState());
+
/*
* Return historic snapshot while we're doing logical decoding, so we can
* see the appropriate state of the catalog.
@@ -1017,6 +1019,8 @@ SnapshotResetXmin(void)
if (pairingheap_is_empty(&RegisteredSnapshots))
{
MyPgXact->xmin = InvalidTransactionId;
+ TransactionXmin = InvalidTransactionId;
+ RecentXmin = InvalidTransactionId;
return;
}
--
2.25.0.114.g5b0ca878e0
>From 076c589dff7e08f0a6b562b185f179da4fbfc13a Mon Sep 17 00:00:00 2001
From: Andres Freund <[email protected]>
Date: Mon, 6 Apr 2020 21:28:55 -0700
Subject: [PATCH 2/2] Improve and extend asserts for a snapshot being set.
---
src/include/utils/snapmgr.h | 2 ++
src/backend/access/heap/heapam.c | 6 ++++--
src/backend/access/index/indexam.c | 8 +++++++-
src/backend/catalog/indexing.c | 11 +++++++++++
src/backend/utils/time/snapmgr.c | 19 +++++++++++++++++++
contrib/amcheck/verify_nbtree.c | 6 +++---
6 files changed, 46 insertions(+), 6 deletions(-)
diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h
index b28d13ce841..7738d6a8e01 100644
--- a/src/include/utils/snapmgr.h
+++ b/src/include/utils/snapmgr.h
@@ -116,6 +116,8 @@ extern void PopActiveSnapshot(void);
extern Snapshot GetActiveSnapshot(void);
extern bool ActiveSnapshotSet(void);
+extern bool SnapshotSet(void);
+
extern Snapshot RegisterSnapshot(Snapshot snapshot);
extern void UnregisterSnapshot(Snapshot snapshot);
extern Snapshot RegisterSnapshotOnOwner(Snapshot snapshot, ResourceOwner owner);
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index c4a5aa616a3..0af51880ccc 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1136,6 +1136,8 @@ heap_beginscan(Relation relation, Snapshot snapshot,
{
HeapScanDesc scan;
+ Assert(SnapshotSet());
+
/*
* increment relation ref count while scanning relation
*
@@ -1545,7 +1547,7 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
at_chain_start = first_call;
skip = !first_call;
- Assert(TransactionIdIsValid(RecentGlobalXmin));
+ Assert(SnapshotSet());
Assert(BufferGetBlockNumber(buffer) == blkno);
/* Scan through possible multiple members of HOT-chain */
@@ -5633,7 +5635,7 @@ heap_abort_speculative(Relation relation, ItemPointer tid)
* if so (vacuum can't subsequently move relfrozenxid to beyond
* TransactionXmin, so there's no race here).
*/
- Assert(TransactionIdIsValid(TransactionXmin));
+ Assert(SnapshotSet() && TransactionIdIsValid(TransactionXmin));
if (TransactionIdPrecedes(TransactionXmin, relation->rd_rel->relfrozenxid))
prune_xid = relation->rd_rel->relfrozenxid;
else
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index a3f77169a79..5d6354dedf5 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -184,6 +184,8 @@ index_insert(Relation indexRelation,
RELATION_CHECKS;
CHECK_REL_PROCEDURE(aminsert);
+ Assert(SnapshotSet());
+
if (!(indexRelation->rd_indam->ampredlocks))
CheckForSerializableConflictIn(indexRelation,
(ItemPointer) NULL,
@@ -256,6 +258,8 @@ index_beginscan_internal(Relation indexRelation,
{
IndexScanDesc scan;
+ Assert(SnapshotSet());
+
RELATION_CHECKS;
CHECK_REL_PROCEDURE(ambeginscan);
@@ -519,7 +523,7 @@ index_getnext_tid(IndexScanDesc scan, ScanDirection direction)
SCAN_CHECKS;
CHECK_SCAN_PROCEDURE(amgettuple);
- Assert(TransactionIdIsValid(RecentGlobalXmin));
+ Assert(SnapshotSet());
/*
* The AM's amgettuple proc finds the next index entry matching the scan
@@ -574,6 +578,8 @@ index_fetch_heap(IndexScanDesc scan, TupleTableSlot *slot)
bool all_dead = false;
bool found;
+ Assert(SnapshotSet());
+
found = table_index_fetch_tuple(scan->xs_heapfetch, &scan->xs_heaptid,
scan->xs_snapshot, slot,
&scan->xs_heap_continue, &all_dead);
diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c
index d63fcf58cf1..8ba6b3dfa5e 100644
--- a/src/backend/catalog/indexing.c
+++ b/src/backend/catalog/indexing.c
@@ -22,6 +22,7 @@
#include "catalog/indexing.h"
#include "executor/executor.h"
#include "utils/rel.h"
+#include "utils/snapmgr.h"
/*
@@ -184,6 +185,8 @@ CatalogTupleInsert(Relation heapRel, HeapTuple tup)
{
CatalogIndexState indstate;
+ Assert(SnapshotSet());
+
indstate = CatalogOpenIndexes(heapRel);
simple_heap_insert(heapRel, tup);
@@ -204,6 +207,8 @@ void
CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup,
CatalogIndexState indstate)
{
+ Assert(SnapshotSet());
+
simple_heap_insert(heapRel, tup);
CatalogIndexInsert(indstate, tup);
@@ -225,6 +230,8 @@ CatalogTupleUpdate(Relation heapRel, ItemPointer otid, HeapTuple tup)
{
CatalogIndexState indstate;
+ Assert(SnapshotSet());
+
indstate = CatalogOpenIndexes(heapRel);
simple_heap_update(heapRel, otid, tup);
@@ -245,6 +252,8 @@ void
CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup,
CatalogIndexState indstate)
{
+ Assert(SnapshotSet());
+
simple_heap_update(heapRel, otid, tup);
CatalogIndexInsert(indstate, tup);
@@ -268,5 +277,7 @@ CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup,
void
CatalogTupleDelete(Relation heapRel, ItemPointer tid)
{
+ Assert(SnapshotSet());
+
simple_heap_delete(heapRel, tid);
}
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index b5cff157bf6..3b148ae30a6 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -857,6 +857,25 @@ ActiveSnapshotSet(void)
return ActiveSnapshot != NULL;
}
+/*
+ * Does this transaction have a snapshot.
+ */
+bool
+SnapshotSet(void)
+{
+ /* can't be safe, because somehow xmin is not set */
+ if (!TransactionIdIsValid(MyPgXact->xmin) && HistoricSnapshot == NULL)
+ return false;
+
+ /*
+ * Can't be safe because no snapshot being active/registered means that
+ * e.g. invalidation processing could change xmin horizon.
+ */
+ return ActiveSnapshot != NULL ||
+ !pairingheap_is_empty(&RegisteredSnapshots) ||
+ HistoricSnapshot != NULL;
+}
+
/*
* RegisterSnapshot
* Register a snapshot as being in use by the current resource owner
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index ceaaa271680..8f43f3e9dfb 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -412,10 +412,10 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace,
Snapshot snapshot = SnapshotAny;
/*
- * RecentGlobalXmin assertion matches index_getnext_tid(). See note on
- * RecentGlobalXmin/B-Tree page deletion.
+ * This assertion matches the one in index_getnext_tid(). See page
+ * recycling/RecentGlobalXmin notes in nbtree README.
*/
- Assert(TransactionIdIsValid(RecentGlobalXmin));
+ Assert(SnapshotSet());
/*
* Initialize state for entire verification operation
--
2.25.0.114.g5b0ca878e0