Thanks for re-reviewing! This one I hope is the last version.

On Wed, Apr 28, 2021, at 10:21 AM, Amit Langote wrote:
> I noticed that rd_partdesc_nodetached_xmin can sometimes end up with
> value 0. While you seem to be already aware of that, because otherwise
> you wouldn't have added TransactionIdIsValid(...) in condition in
> RelationGetPartitionDesc(), the comments nearby don't mention why such
> a thing might happen.  Also, I guess it can't be helped that the
> partdesc_nodetached will have to be leaked when the xmin is 0, but
> that shouldn't be as problematic as the case we discussed earlier.

The only case I am aware where that can happen is if the pg_inherits tuple is 
frozen. (That's exactly what the affected test case was testing, note the 
"VACUUM FREEZE pg_inherits" there).  So that test case blew up immediately; but 
I think the real-world chances that people are going to be doing that are 
pretty low, so I'm not really concerned about the leak.

> Would it be a bit more readable to just duplicate this stanza in the
> blocks that assign to rd_partdesc_nodetached and rd_partdesc,
> respectively?  That's not much code to duplicate and it'd be easier to
> see which context is for which partdesc.

Sure .. that's how I first wrote this code.  We don't use that style much, so 
I'm OK with backing out of it.

> +   TransactionId rd_partdesc_nodetached_xmin;  /* xmin for the above */
> 
> Could you please expand this description a bit?

Done.
From 8b1489a17cb5cfcd30d4dfd18020c64e606c5042 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvhe...@alvh.no-ip.org>
Date: Tue, 27 Apr 2021 19:04:37 -0400
Subject: [PATCH v5] Allow a partdesc-omitting-partitions to be cached

Makes partition descriptor acquisition faster during the transient
period in which a partition is in the process of being detached.

(This incidentally fixes a bug in 8aba9322511 whereby a memory context
holding a transient partdesc is reparented to NULL, leading to permanent
leak of that memory.  Reported by Amit Langote.)

Per gripe from Amit Langote
Discussion: https://postgr.es/m/ca+hiwqfgpp1lxjzobygt9rpvtjxxkg5qg2+xch2z1q7krqz...@mail.gmail.com
---
 src/backend/catalog/pg_inherits.c             |  52 ++++++++-
 src/backend/commands/tablecmds.c              |  66 ++++++-----
 src/backend/commands/trigger.c                |   3 +-
 src/backend/partitioning/partdesc.c           | 101 +++++++++++-----
 src/backend/utils/cache/relcache.c            |  33 +++++-
 src/include/catalog/pg_inherits.h             |   6 +-
 src/include/utils/rel.h                       |  15 +++
 .../detach-partition-concurrently-3.out       | 110 +++++++++++++++++-
 .../detach-partition-concurrently-3.spec      |  14 ++-
 9 files changed, 327 insertions(+), 73 deletions(-)

diff --git a/src/backend/catalog/pg_inherits.c b/src/backend/catalog/pg_inherits.c
index 6447b52854..c373faf2d6 100644
--- a/src/backend/catalog/pg_inherits.c
+++ b/src/backend/catalog/pg_inherits.c
@@ -52,6 +52,22 @@ typedef struct SeenRelsEntry
  * then no locks are acquired, but caller must beware of race conditions
  * against possible DROPs of child relations.
  *
+ * Partitions marked as being detached are omitted; see
+ * find_inheritance_children_extended for details.
+ */
+List *
+find_inheritance_children(Oid parentrelId, LOCKMODE lockmode)
+{
+	return find_inheritance_children_extended(parentrelId, true, lockmode,
+											  NULL, NULL);
+}
+
+/*
+ * find_inheritance_children_extended
+ *
+ * As find_inheritance_children, with more options regarding detached
+ * partitions.
+ *
  * If a partition's pg_inherits row is marked "detach pending",
  * *detached_exist (if not null) is set true.
  *
@@ -60,11 +76,13 @@ typedef struct SeenRelsEntry
  * marked "detach pending" is visible to that snapshot, then that partition is
  * omitted from the output list.  This makes partitions invisible depending on
  * whether the transaction that marked those partitions as detached appears
- * committed to the active snapshot.
+ * committed to the active snapshot.  In addition, *detached_xmin (if not null)
+ * is set to the xmin of the row of the detached partition.
  */
 List *
-find_inheritance_children(Oid parentrelId, bool omit_detached,
-						  LOCKMODE lockmode, bool *detached_exist)
+find_inheritance_children_extended(Oid parentrelId, bool omit_detached,
+								   LOCKMODE lockmode, bool *detached_exist,
+								   TransactionId *detached_xmin)
 {
 	List	   *list = NIL;
 	Relation	relation;
@@ -132,7 +150,32 @@ find_inheritance_children(Oid parentrelId, bool omit_detached,
 				snap = GetActiveSnapshot();
 
 				if (!XidInMVCCSnapshot(xmin, snap))
+				{
+					if (detached_xmin)
+					{
+						/*
+						 * Two detached partitions should not occur (see
+						 * checks in MarkInheritDetached), but if they do,
+						 * track the newer of the two.  Make sure to warn the
+						 * user, so that they can clean up.  Since this is
+						 * just a cross-check against potentially corrupt
+						 * catalogs, we don't make it a full-fledged error
+						 * message.
+						 */
+						if (*detached_xmin != InvalidTransactionId)
+						{
+							elog(WARNING, "more than one partition pending detach found for table with OID %u",
+								 parentrelId);
+							if (TransactionIdFollows(xmin, *detached_xmin))
+								*detached_xmin = xmin;
+						}
+						else
+							*detached_xmin = xmin;
+					}
+
+					/* Don't add the partition to the output list */
 					continue;
+				}
 			}
 		}
 
@@ -247,8 +290,7 @@ find_all_inheritors(Oid parentrelId, LOCKMODE lockmode, List **numparents)
 		ListCell   *lc;
 
 		/* Get the direct children of this rel */
-		currentchildren = find_inheritance_children(currentrel, true,
-													lockmode, NULL);
+		currentchildren = find_inheritance_children(currentrel, lockmode);
 
 		/*
 		 * Add to the queue only those children not already seen. This avoids
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 8e717ada28..d9ba87a2a3 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3492,7 +3492,7 @@ renameatt_internal(Oid myrelid,
 		 * expected_parents will only be 0 if we are not already recursing.
 		 */
 		if (expected_parents == 0 &&
-			find_inheritance_children(myrelid, true, NoLock, NULL) != NIL)
+			find_inheritance_children(myrelid, NoLock) != NIL)
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
 					 errmsg("inherited column \"%s\" must be renamed in child tables too",
@@ -3691,7 +3691,7 @@ rename_constraint_internal(Oid myrelid,
 		else
 		{
 			if (expected_parents == 0 &&
-				find_inheritance_children(myrelid, true, NoLock, NULL) != NIL)
+				find_inheritance_children(myrelid, NoLock) != NIL)
 				ereport(ERROR,
 						(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
 						 errmsg("inherited constraint \"%s\" must be renamed in child tables too",
@@ -6565,7 +6565,7 @@ ATExecAddColumn(List **wqueue, AlteredTableInfo *tab, Relation rel,
 	 */
 	if (colDef->identity &&
 		recurse &&
-		find_inheritance_children(myrelid, true, NoLock, NULL) != NIL)
+		find_inheritance_children(myrelid, NoLock) != NIL)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
 				 errmsg("cannot recursively add identity column to table that has child tables")));
@@ -6811,7 +6811,7 @@ ATExecAddColumn(List **wqueue, AlteredTableInfo *tab, Relation rel,
 	 * use find_all_inheritors to do it in one pass.
 	 */
 	children =
-		find_inheritance_children(RelationGetRelid(rel), true, lockmode, NULL);
+		find_inheritance_children(RelationGetRelid(rel), lockmode);
 
 	/*
 	 * If we are told not to recurse, there had better not be any child
@@ -7674,7 +7674,7 @@ ATPrepDropExpression(Relation rel, AlterTableCmd *cmd, bool recurse, bool recurs
 	 * resulting state can be properly dumped and restored.
 	 */
 	if (!recurse &&
-		find_inheritance_children(RelationGetRelid(rel), true, lockmode, NULL))
+		find_inheritance_children(RelationGetRelid(rel), lockmode))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("ALTER TABLE / DROP EXPRESSION must be applied to child tables too")));
@@ -8282,7 +8282,7 @@ ATExecDropColumn(List **wqueue, Relation rel, const char *colName,
 	 * use find_all_inheritors to do it in one pass.
 	 */
 	children =
-		find_inheritance_children(RelationGetRelid(rel), true, lockmode, NULL);
+		find_inheritance_children(RelationGetRelid(rel), lockmode);
 
 	if (children)
 	{
@@ -8770,7 +8770,7 @@ ATAddCheckConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 	 * use find_all_inheritors to do it in one pass.
 	 */
 	children =
-		find_inheritance_children(RelationGetRelid(rel), true, lockmode, NULL);
+		find_inheritance_children(RelationGetRelid(rel), lockmode);
 
 	/*
 	 * Check if ONLY was specified with ALTER TABLE.  If so, allow the
@@ -11303,8 +11303,7 @@ ATExecDropConstraint(Relation rel, const char *constrName,
 	 * use find_all_inheritors to do it in one pass.
 	 */
 	if (!is_no_inherit_constraint)
-		children = find_inheritance_children(RelationGetRelid(rel), true,
-											 lockmode, NULL);
+		children = find_inheritance_children(RelationGetRelid(rel), lockmode);
 	else
 		children = NIL;
 
@@ -11688,8 +11687,7 @@ ATPrepAlterColumnType(List **wqueue,
 		}
 	}
 	else if (!recursing &&
-			 find_inheritance_children(RelationGetRelid(rel), true,
-									   NoLock, NULL) != NIL)
+			 find_inheritance_children(RelationGetRelid(rel), NoLock) != NIL)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
 				 errmsg("type of inherited column \"%s\" must be changed in child tables too",
@@ -14601,7 +14599,8 @@ ATExecDropInherit(Relation rel, RangeVar *parent, LOCKMODE lockmode)
  * MarkInheritDetached
  *
  * Set inhdetachpending for a partition, for ATExecDetachPartition
- * in concurrent mode.
+ * in concurrent mode.  While at it, verify that no other partition is
+ * already pending detach.
  */
 static void
 MarkInheritDetached(Relation child_rel, Relation parent_rel)
@@ -14617,32 +14616,45 @@ MarkInheritDetached(Relation child_rel, Relation parent_rel)
 	Assert(parent_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE);
 
 	/*
-	 * Find pg_inherits entries by inhrelid.
+	 * Find pg_inherits entries by inhparent.  (We need to scan them all in
+	 * order to verify that no other partition is pending detach.)
 	 */
 	catalogRelation = table_open(InheritsRelationId, RowExclusiveLock);
 	ScanKeyInit(&key,
-				Anum_pg_inherits_inhrelid,
+				Anum_pg_inherits_inhparent,
 				BTEqualStrategyNumber, F_OIDEQ,
-				ObjectIdGetDatum(RelationGetRelid(child_rel)));
-	scan = systable_beginscan(catalogRelation, InheritsRelidSeqnoIndexId,
+				ObjectIdGetDatum(RelationGetRelid(parent_rel)));
+	scan = systable_beginscan(catalogRelation, InheritsParentIndexId,
 							  true, NULL, 1, &key);
 
 	while (HeapTupleIsValid(inheritsTuple = systable_getnext(scan)))
 	{
-		HeapTuple	newtup;
+		Form_pg_inherits inhForm;
 
-		if (((Form_pg_inherits) GETSTRUCT(inheritsTuple))->inhparent !=
-			RelationGetRelid(parent_rel))
-			elog(ERROR, "bad parent tuple found for partition %u",
-				 RelationGetRelid(child_rel));
+		inhForm = (Form_pg_inherits) GETSTRUCT(inheritsTuple);
+		if (inhForm->inhdetachpending)
+			ereport(ERROR,
+					errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+					errmsg("partition \"%s\" already pending detach in partitioned table \"%s.%s\"",
+						   get_rel_name(inhForm->inhrelid),
+						   get_namespace_name(parent_rel->rd_rel->relnamespace),
+						   RelationGetRelationName(parent_rel)),
+					errhint("Use ALTER TABLE ... DETACH PARTITION ... FINALIZE to complete the detach operation."));
+
+		if (inhForm->inhrelid == RelationGetRelid(child_rel))
+		{
+			HeapTuple	newtup;
 
-		newtup = heap_copytuple(inheritsTuple);
-		((Form_pg_inherits) GETSTRUCT(newtup))->inhdetachpending = true;
+			newtup = heap_copytuple(inheritsTuple);
+			((Form_pg_inherits) GETSTRUCT(newtup))->inhdetachpending = true;
 
-		CatalogTupleUpdate(catalogRelation,
-						   &inheritsTuple->t_self,
-						   newtup);
-		found = true;
+			CatalogTupleUpdate(catalogRelation,
+							   &inheritsTuple->t_self,
+							   newtup);
+			found = true;
+			heap_freetuple(newtup);
+			/* keep looking, to ensure we catch others pending detach */
+		}
 	}
 
 	/* Done */
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index d8393aa4de..f305f8bc0f 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -1141,8 +1141,7 @@ CreateTrigger(CreateTrigStmt *stmt, const char *queryString,
 			ListCell   *l;
 			List	   *idxs = NIL;
 
-			idxs = find_inheritance_children(indexOid, true,
-											 ShareRowExclusiveLock, NULL);
+			idxs = find_inheritance_children(indexOid, ShareRowExclusiveLock);
 			foreach(l, idxs)
 				childTbls = lappend_oid(childTbls,
 										IndexGetRelation(lfirst_oid(l),
diff --git a/src/backend/partitioning/partdesc.c b/src/backend/partitioning/partdesc.c
index 2305dff407..fa9729283a 100644
--- a/src/backend/partitioning/partdesc.c
+++ b/src/backend/partitioning/partdesc.c
@@ -54,6 +54,12 @@ static PartitionDesc RelationBuildPartitionDesc(Relation rel,
 /*
  * RelationGetPartitionDesc -- get partition descriptor, if relation is partitioned
  *
+ * We keep two partdescs in relcache: rd_partdesc includes all partitions
+ * (even those being concurrently marked detached), while rd_partdesc_nodetach
+ * omits (some of) those.  We store the pg_inherits.xmin value for the latter,
+ * to determine whether it can be validly reused in each case, since that
+ * depends on the active snapshot.
+ *
  * Note: we arrange for partition descriptors to not get freed until the
  * relcache entry's refcount goes to zero (see hacks in RelationClose,
  * RelationClearRelation, and RelationBuildPartitionDesc).  Therefore, even
@@ -61,13 +67,6 @@ static PartitionDesc RelationBuildPartitionDesc(Relation rel,
  * for callers to continue to use that pointer as long as (a) they hold the
  * relation open, and (b) they hold a relation lock strong enough to ensure
  * that the data doesn't become stale.
- *
- * The above applies to partition descriptors that are complete regarding
- * partitions concurrently being detached.  When a descriptor that omits
- * partitions being detached is requested (and such partitions are present),
- * said descriptor is not part of relcache and so it isn't freed by
- * invalidations either.  Caller must not use such a descriptor beyond the
- * current Portal.
  */
 PartitionDesc
 RelationGetPartitionDesc(Relation rel, bool omit_detached)
@@ -78,11 +77,36 @@ RelationGetPartitionDesc(Relation rel, bool omit_detached)
 	 * If relcache has a partition descriptor, use that.  However, we can only
 	 * do so when we are asked to include all partitions including detached;
 	 * and also when we know that there are no detached partitions.
+	 *
+	 * If there is no active snapshot, detached partitions aren't omitted
+	 * either, so we can use the cached descriptor too in that case.
 	 */
 	if (likely(rel->rd_partdesc &&
-			   (!rel->rd_partdesc->detached_exist || !omit_detached)))
+			   (!rel->rd_partdesc->detached_exist || !omit_detached ||
+				!ActiveSnapshotSet())))
 		return rel->rd_partdesc;
 
+	/*
+	 * If we're asked to omit detached partitions, we may be able to use a
+	 * cached descriptor too.  We determine that based on the pg_inherits.xmin
+	 * that was saved alongside that descriptor: if the xmin that was not in
+	 * progress for that active snapshot is also not in progress for the
+	 * current active snapshot, then we can use use it.  Otherwise build one
+	 * from scratch.
+	 */
+	if (omit_detached &&
+		rel->rd_partdesc_nodetached &&
+		TransactionIdIsValid(rel->rd_partdesc_nodetached_xmin) &&
+		ActiveSnapshotSet())
+	{
+		Snapshot	activesnap;
+
+		activesnap = GetActiveSnapshot();
+
+		if (!XidInMVCCSnapshot(rel->rd_partdesc_nodetached_xmin, activesnap))
+			return rel->rd_partdesc_nodetached;
+	}
+
 	return RelationBuildPartitionDesc(rel, omit_detached);
 }
 
@@ -117,6 +141,8 @@ RelationBuildPartitionDesc(Relation rel, bool omit_detached)
 	Oid		   *oids = NULL;
 	bool	   *is_leaf = NULL;
 	bool		detached_exist;
+	bool		is_omit;
+	TransactionId detached_xmin;
 	ListCell   *cell;
 	int			i,
 				nparts;
@@ -132,8 +158,11 @@ RelationBuildPartitionDesc(Relation rel, bool omit_detached)
 	 * some well-defined point in time.
 	 */
 	detached_exist = false;
-	inhoids = find_inheritance_children(RelationGetRelid(rel), omit_detached,
-										NoLock, &detached_exist);
+	detached_xmin = InvalidTransactionId;
+	inhoids = find_inheritance_children_extended(RelationGetRelid(rel),
+												 omit_detached, NoLock,
+												 &detached_exist,
+												 &detached_xmin);
 	nparts = list_length(inhoids);
 
 	/* Allocate working arrays for OIDs, leaf flags, and boundspecs. */
@@ -280,38 +309,50 @@ RelationBuildPartitionDesc(Relation rel, bool omit_detached)
 		MemoryContextSwitchTo(oldcxt);
 	}
 
+	/*
+	 * Are we working with the partition that omits detached partitions, or
+	 * the one that includes them?
+	 */
+	is_omit = omit_detached && detached_exist && ActiveSnapshotSet();
+
 	/*
 	 * We have a fully valid partdesc.  Reparent it so that it has the right
-	 * lifespan, and if appropriate put it into the relation's relcache entry.
+	 * lifespan.
 	 */
-	if (omit_detached && detached_exist)
+	MemoryContextSetParent(new_pdcxt, CacheMemoryContext);
+
+	/*
+	 * Store it into relcache.
+	 *
+	 * But first, a kluge: if there's an old context for this type of
+	 * descriptor, it contains an old partition descriptor that may still be
+	 * referenced somewhere.  Preserve it, while not leaking it, by
+	 * reattaching it as a child context of the new one.  Eventually it will
+	 * get dropped by either RelationClose or RelationClearRelation.
+	 * (We keep the regular partdesc in rd_pdcxt, and the partdesc-excluding-
+	 * detached-partitions in rd_pddcxt.)
+	 */
+	if (is_omit)
 	{
+		if (rel->rd_pddcxt != NULL)
+			MemoryContextSetParent(rel->rd_pddcxt, new_pdcxt);
+		rel->rd_pddcxt = new_pdcxt;
+		rel->rd_partdesc_nodetached = partdesc;
+
 		/*
-		 * A transient partition descriptor is only good for the current
-		 * statement, so make it a child of the current portal's context.
+		 * For partdescs built excluding detached partitions, which we save
+		 * separately, we also record the pg_inherits.xmin of the detached
+		 * partition that was omitted; this informs a future potential user of
+		 * such a cached partdescs.  (This might be InvalidXid; see comments
+		 * in struct RelationData).
 		 */
-		MemoryContextSetParent(new_pdcxt, PortalContext);
+		rel->rd_partdesc_nodetached_xmin = detached_xmin;
 	}
 	else
 	{
-		/*
-		 * This partdesc goes into relcache.
-		 */
-
-		MemoryContextSetParent(new_pdcxt, CacheMemoryContext);
-
-		/*
-		 * But first, a kluge: if there's an old rd_pdcxt, it contains an old
-		 * partition descriptor that may still be referenced somewhere.
-		 * Preserve it, while not leaking it, by reattaching it as a child
-		 * context of the new rd_pdcxt.  Eventually it will get dropped by
-		 * either RelationClose or RelationClearRelation.
-		 */
 		if (rel->rd_pdcxt != NULL)
 			MemoryContextSetParent(rel->rd_pdcxt, new_pdcxt);
 		rel->rd_pdcxt = new_pdcxt;
-
-		/* Store it into relcache */
 		rel->rd_partdesc = partdesc;
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 466c28d528..31c8e07f2a 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1157,7 +1157,10 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
 	relation->rd_partkey = NULL;
 	relation->rd_partkeycxt = NULL;
 	relation->rd_partdesc = NULL;
+	relation->rd_partdesc_nodetached = NULL;
+	relation->rd_partdesc_nodetached_xmin = InvalidTransactionId;
 	relation->rd_pdcxt = NULL;
+	relation->rd_pddcxt = NULL;
 	relation->rd_partcheck = NIL;
 	relation->rd_partcheckvalid = false;
 	relation->rd_partcheckcxt = NULL;
@@ -2103,10 +2106,16 @@ RelationClose(Relation relation)
 	 * stale partition descriptors it has.  This is unlikely, so check to see
 	 * if there are child contexts before expending a call to mcxt.c.
 	 */
-	if (RelationHasReferenceCountZero(relation) &&
-		relation->rd_pdcxt != NULL &&
-		relation->rd_pdcxt->firstchild != NULL)
-		MemoryContextDeleteChildren(relation->rd_pdcxt);
+	if (RelationHasReferenceCountZero(relation))
+	{
+		if (relation->rd_pdcxt != NULL &&
+			relation->rd_pdcxt->firstchild != NULL)
+			MemoryContextDeleteChildren(relation->rd_pdcxt);
+
+		if (relation->rd_pddcxt != NULL &&
+			relation->rd_pddcxt->firstchild != NULL)
+			MemoryContextDeleteChildren(relation->rd_pddcxt);
+	}
 
 #ifdef RELCACHE_FORCE_RELEASE
 	if (RelationHasReferenceCountZero(relation) &&
@@ -2390,6 +2399,8 @@ RelationDestroyRelation(Relation relation, bool remember_tupdesc)
 		MemoryContextDelete(relation->rd_partkeycxt);
 	if (relation->rd_pdcxt)
 		MemoryContextDelete(relation->rd_pdcxt);
+	if (relation->rd_pddcxt)
+		MemoryContextDelete(relation->rd_pddcxt);
 	if (relation->rd_partcheckcxt)
 		MemoryContextDelete(relation->rd_partcheckcxt);
 	pfree(relation);
@@ -2644,7 +2655,7 @@ RelationClearRelation(Relation relation, bool rebuild)
 			SWAPFIELD(PartitionKey, rd_partkey);
 			SWAPFIELD(MemoryContext, rd_partkeycxt);
 		}
-		if (newrel->rd_pdcxt != NULL)
+		if (newrel->rd_pdcxt != NULL || newrel->rd_pddcxt != NULL)
 		{
 			/*
 			 * We are rebuilding a partitioned relation with a non-zero
@@ -2672,13 +2683,22 @@ RelationClearRelation(Relation relation, bool rebuild)
 			 * newrel.
 			 */
 			relation->rd_partdesc = NULL;	/* ensure rd_partdesc is invalid */
+			relation->rd_partdesc_nodetached = NULL;
+			relation->rd_partdesc_nodetached_xmin = InvalidTransactionId;
 			if (relation->rd_pdcxt != NULL) /* probably never happens */
 				MemoryContextSetParent(newrel->rd_pdcxt, relation->rd_pdcxt);
 			else
 				relation->rd_pdcxt = newrel->rd_pdcxt;
+			if (relation->rd_pddcxt != NULL)
+				MemoryContextSetParent(newrel->rd_pddcxt, relation->rd_pddcxt);
+			else
+				relation->rd_pddcxt = newrel->rd_pddcxt;
 			/* drop newrel's pointers so we don't destroy it below */
 			newrel->rd_partdesc = NULL;
+			newrel->rd_partdesc_nodetached = NULL;
+			newrel->rd_partdesc_nodetached_xmin = InvalidTransactionId;
 			newrel->rd_pdcxt = NULL;
+			newrel->rd_pddcxt = NULL;
 		}
 
 #undef SWAPFIELD
@@ -6017,7 +6037,10 @@ load_relcache_init_file(bool shared)
 		rel->rd_partkey = NULL;
 		rel->rd_partkeycxt = NULL;
 		rel->rd_partdesc = NULL;
+		rel->rd_partdesc_nodetached = NULL;
+		rel->rd_partdesc_nodetached_xmin = InvalidTransactionId;
 		rel->rd_pdcxt = NULL;
+		rel->rd_pddcxt = NULL;
 		rel->rd_partcheck = NIL;
 		rel->rd_partcheckvalid = false;
 		rel->rd_partcheckcxt = NULL;
diff --git a/src/include/catalog/pg_inherits.h b/src/include/catalog/pg_inherits.h
index 4d28ede5a6..f47c0a8cd8 100644
--- a/src/include/catalog/pg_inherits.h
+++ b/src/include/catalog/pg_inherits.h
@@ -50,8 +50,10 @@ DECLARE_INDEX(pg_inherits_parent_index, 2187, on pg_inherits using btree(inhpare
 #define InheritsParentIndexId  2187
 
 
-extern List *find_inheritance_children(Oid parentrelId, bool omit_detached,
-									   LOCKMODE lockmode, bool *detached_exist);
+extern List *find_inheritance_children(Oid parentrelId, LOCKMODE lockmode);
+extern List *find_inheritance_children_extended(Oid parentrelId, bool omit_detached,
+												LOCKMODE lockmode, bool *detached_exist, TransactionId *detached_xmin);
+
 extern List *find_all_inheritors(Oid parentrelId, LOCKMODE lockmode,
 								 List **parents);
 extern bool has_subclass(Oid relationId);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 34e25eb597..07ae2c70a9 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -129,6 +129,21 @@ typedef struct RelationData
 	PartitionDesc rd_partdesc;	/* partition descriptor, or NULL */
 	MemoryContext rd_pdcxt;		/* private context for rd_partdesc, if any */
 
+	/* Same as above, for partdescs that omit detached partitions */
+	PartitionDesc rd_partdesc_nodetached;	/* partdesc w/o detached parts */
+	MemoryContext rd_pddcxt;	/* for rd_partdesc_nodetached, if any */
+
+	/*
+	 * pg_inherits.xmin of the partition that was excluded in
+	 * rd_partdesc_nodetached.  This informs a future user of that partdesc:
+	 * if this value is not in progress for the active snapshot, then the
+	 * partdesc can be used, otherwise they have to build a new one.  (This
+	 * matches what find_inheritance_children_extended would do).  In the rare
+	 * case where the pg_inherits tuple has been frozen, this will be
+	 * InvalidXid; behave as if the partdesc is unusable in that case.
+	 */
+	TransactionId rd_partdesc_nodetached_xmin;
+
 	/* data managed by RelationGetPartitionQual: */
 	List	   *rd_partcheck;	/* partition CHECK quals */
 	bool		rd_partcheckvalid;	/* true if list has been computed */
diff --git a/src/test/isolation/expected/detach-partition-concurrently-3.out b/src/test/isolation/expected/detach-partition-concurrently-3.out
index 88e83638c7..bbb9d151fb 100644
--- a/src/test/isolation/expected/detach-partition-concurrently-3.out
+++ b/src/test/isolation/expected/detach-partition-concurrently-3.out
@@ -20,6 +20,7 @@ step s1describe: SELECT 'd3_listp' AS root, * FROM pg_partition_tree('d3_listp')
 root           relid          parentrelid    isleaf         level          
 
 d3_listp       d3_listp                      f              0              
+d3_listp       d3_listp2      d3_listp       t              1              
 d3_listp1      d3_listp1                     t              0              
 step s1alter: ALTER TABLE d3_listp1 ALTER a DROP NOT NULL;
 ERROR:  cannot alter partition "d3_listp1" with an incomplete detach
@@ -81,6 +82,59 @@ error in steps s1cancel s2detach: ERROR:  canceling statement due to user reques
 step s1c: COMMIT;
 step s1insertpart: INSERT INTO d3_listp1 VALUES (1);
 
+starting permutation: s2snitch s1b s1s s2detach2 s1cancel s1c s1brr s1insert s1s s1insert s1c
+step s2snitch: INSERT INTO d3_pid SELECT pg_backend_pid();
+step s1b: BEGIN;
+step s1s: SELECT * FROM d3_listp;
+a              
+
+1              
+step s2detach2: ALTER TABLE d3_listp DETACH PARTITION d3_listp2 CONCURRENTLY; <waiting ...>
+step s1cancel: SELECT pg_cancel_backend(pid) FROM d3_pid;
+pg_cancel_backend
+
+t              
+step s2detach2: <... completed>
+error in steps s1cancel s2detach2: ERROR:  canceling statement due to user request
+step s1c: COMMIT;
+step s1brr: BEGIN ISOLATION LEVEL REPEATABLE READ;
+step s1insert: INSERT INTO d3_listp VALUES (1);
+step s1s: SELECT * FROM d3_listp;
+a              
+
+1              
+1              
+step s1insert: INSERT INTO d3_listp VALUES (1);
+step s1c: COMMIT;
+
+starting permutation: s2snitch s1b s1s s2detach2 s1cancel s1c s1brr s1s s1insert s1s s1c
+step s2snitch: INSERT INTO d3_pid SELECT pg_backend_pid();
+step s1b: BEGIN;
+step s1s: SELECT * FROM d3_listp;
+a              
+
+1              
+step s2detach2: ALTER TABLE d3_listp DETACH PARTITION d3_listp2 CONCURRENTLY; <waiting ...>
+step s1cancel: SELECT pg_cancel_backend(pid) FROM d3_pid;
+pg_cancel_backend
+
+t              
+step s2detach2: <... completed>
+error in steps s1cancel s2detach2: ERROR:  canceling statement due to user request
+step s1c: COMMIT;
+step s1brr: BEGIN ISOLATION LEVEL REPEATABLE READ;
+step s1s: SELECT * FROM d3_listp;
+a              
+
+1              
+step s1insert: INSERT INTO d3_listp VALUES (1);
+step s1s: SELECT * FROM d3_listp;
+a              
+
+1              
+1              
+step s1c: COMMIT;
+
 starting permutation: s2snitch s1b s1s s2detach s1cancel s1c s1drop s1list
 step s2snitch: INSERT INTO d3_pid SELECT pg_backend_pid();
 step s1b: BEGIN;
@@ -123,6 +177,61 @@ a
 
 1              
 
+starting permutation: s2snitch s1b s1s s2detach s1cancel s2detach2 s1c
+step s2snitch: INSERT INTO d3_pid SELECT pg_backend_pid();
+step s1b: BEGIN;
+step s1s: SELECT * FROM d3_listp;
+a              
+
+1              
+step s2detach: ALTER TABLE d3_listp DETACH PARTITION d3_listp1 CONCURRENTLY; <waiting ...>
+step s1cancel: SELECT pg_cancel_backend(pid) FROM d3_pid;
+pg_cancel_backend
+
+t              
+step s2detach: <... completed>
+error in steps s1cancel s2detach: ERROR:  canceling statement due to user request
+step s2detach2: ALTER TABLE d3_listp DETACH PARTITION d3_listp2 CONCURRENTLY;
+ERROR:  partition "d3_listp1" already pending detach in partitioned table "public.d3_listp"
+step s1c: COMMIT;
+
+starting permutation: s2snitch s1b s1s s2detach s1cancel s2detachfinal s1c s2detach2
+step s2snitch: INSERT INTO d3_pid SELECT pg_backend_pid();
+step s1b: BEGIN;
+step s1s: SELECT * FROM d3_listp;
+a              
+
+1              
+step s2detach: ALTER TABLE d3_listp DETACH PARTITION d3_listp1 CONCURRENTLY; <waiting ...>
+step s1cancel: SELECT pg_cancel_backend(pid) FROM d3_pid;
+pg_cancel_backend
+
+t              
+step s2detach: <... completed>
+error in steps s1cancel s2detach: ERROR:  canceling statement due to user request
+step s2detachfinal: ALTER TABLE d3_listp DETACH PARTITION d3_listp1 FINALIZE; <waiting ...>
+step s1c: COMMIT;
+step s2detachfinal: <... completed>
+step s2detach2: ALTER TABLE d3_listp DETACH PARTITION d3_listp2 CONCURRENTLY;
+
+starting permutation: s2snitch s1b s1s s2detach s1cancel s1c s1droppart s2detach2
+step s2snitch: INSERT INTO d3_pid SELECT pg_backend_pid();
+step s1b: BEGIN;
+step s1s: SELECT * FROM d3_listp;
+a              
+
+1              
+step s2detach: ALTER TABLE d3_listp DETACH PARTITION d3_listp1 CONCURRENTLY; <waiting ...>
+step s1cancel: SELECT pg_cancel_backend(pid) FROM d3_pid;
+pg_cancel_backend
+
+t              
+step s2detach: <... completed>
+error in steps s1cancel s2detach: ERROR:  canceling statement due to user request
+step s1c: COMMIT;
+step s1droppart: DROP TABLE d3_listp1;
+step s2detach2: ALTER TABLE d3_listp DETACH PARTITION d3_listp2 CONCURRENTLY;
+
 starting permutation: s2snitch s1b s1s s2detach s1cancel s1c s2begin s2drop s1s s2commit
 step s2snitch: INSERT INTO d3_pid SELECT pg_backend_pid();
 step s1b: BEGIN;
@@ -279,4 +388,3 @@ step s2detachfinal: ALTER TABLE d3_listp DETACH PARTITION d3_listp1 FINALIZE;
 step s1insertpart: INSERT INTO d3_listp1 VALUES (1); <waiting ...>
 step s2commit: COMMIT;
 step s1insertpart: <... completed>
-unused step name: s1droppart
diff --git a/src/test/isolation/specs/detach-partition-concurrently-3.spec b/src/test/isolation/specs/detach-partition-concurrently-3.spec
index 4b706430e1..5a8351fc83 100644
--- a/src/test/isolation/specs/detach-partition-concurrently-3.spec
+++ b/src/test/isolation/specs/detach-partition-concurrently-3.spec
@@ -4,12 +4,13 @@ setup
 {
   CREATE TABLE d3_listp (a int) PARTITION BY LIST(a);
   CREATE TABLE d3_listp1 PARTITION OF d3_listp FOR VALUES IN (1);
+  CREATE TABLE d3_listp2 PARTITION OF d3_listp FOR VALUES IN (2);
   CREATE TABLE d3_pid (pid int);
   INSERT INTO d3_listp VALUES (1);
 }
 
 teardown {
-    DROP TABLE IF EXISTS d3_listp, d3_listp1, d3_pid;
+    DROP TABLE IF EXISTS d3_listp, d3_listp1, d3_listp2, d3_pid;
 }
 
 session "s1"
@@ -34,6 +35,7 @@ session "s2"
 step "s2begin"		{ BEGIN; }
 step "s2snitch"		{ INSERT INTO d3_pid SELECT pg_backend_pid(); }
 step "s2detach"		{ ALTER TABLE d3_listp DETACH PARTITION d3_listp1 CONCURRENTLY; }
+step "s2detach2"	{ ALTER TABLE d3_listp DETACH PARTITION d3_listp2 CONCURRENTLY; }
 step "s2detachfinal"	{ ALTER TABLE d3_listp DETACH PARTITION d3_listp1 FINALIZE; }
 step "s2drop"		{ DROP TABLE d3_listp1; }
 step "s2commit"		{ COMMIT; }
@@ -44,11 +46,21 @@ permutation "s2snitch" "s1b" "s1s" "s2detach" "s1cancel" "s1c" "s1describe" "s1a
 permutation "s2snitch" "s1b" "s1s" "s2detach" "s1cancel" "s1insert" "s1c"
 permutation "s2snitch" "s1brr" "s1s" "s2detach" "s1cancel" "s1insert" "s1c" "s1spart"
 permutation "s2snitch" "s1b" "s1s" "s2detach" "s1cancel" "s1c" "s1insertpart"
+
+# Test partition descriptor caching
+permutation "s2snitch" "s1b" "s1s" "s2detach2" "s1cancel" "s1c" "s1brr" "s1insert" "s1s" "s1insert" "s1c"
+permutation "s2snitch" "s1b" "s1s" "s2detach2" "s1cancel" "s1c" "s1brr" "s1s" "s1insert" "s1s" "s1c"
+
 # "drop" here does both tables
 permutation "s2snitch" "s1b" "s1s" "s2detach" "s1cancel" "s1c" "s1drop" "s1list"
 # "truncate" only does parent, not partition
 permutation "s2snitch" "s1b" "s1s" "s2detach" "s1cancel" "s1c" "s1trunc" "s1spart"
 
+# If a partition pending detach exists, we cannot drop another one
+permutation "s2snitch" "s1b" "s1s" "s2detach" "s1cancel" "s2detach2" "s1c"
+permutation "s2snitch" "s1b" "s1s" "s2detach" "s1cancel" "s2detachfinal" "s1c" "s2detach2"
+permutation "s2snitch" "s1b" "s1s" "s2detach" "s1cancel" "s1c" "s1droppart" "s2detach2"
+
 # When a partition with incomplete detach is dropped, we grab lock on parent too.
 permutation "s2snitch" "s1b" "s1s" "s2detach" "s1cancel" "s1c" "s2begin" "s2drop" "s1s" "s2commit"
 
-- 
2.20.1

Reply via email to