Hi all,

As a recent poke on a thread of 2019 has reminded me, the current
situation of partitioned tables with unlogged is kind of weird:
https://www.postgresql.org/message-id/15954-b61523bed4b110c4%40postgresql.org

To sum up the situation:
- ALTER TABLE .. SET LOGGED/UNLOGGED does not affect partitioned
tables.
- New partitions don't inherit the loggedness of the partitioned
table.

One of the things that could be done is to forbid the use of UNLOGGED
in partitioned tables, but I'm wondering if we could be smarter than
that to provide a more natural user experience.  I've been
experiencing with that and finished with the POC attached, that uses
the following properties:
- Support ALTER TABLE .. SET LOGGED/UNLOGGED for partitioned tables,
where the command only works on partitioned tables so that's only a
catalog switch.
- CREATE TABLE PARTITION OF would make a new partition inherit the
logged ness of the parent if UNLOGGED is not directly specified.

There are some open questions that need attention:
- How about ONLY?  Would it make sense to support it so as ALTER TABLE
ONLY on a partitioned table does not touch any of its partitions?
Would not specifying ONLY mean that the loggedness of the partitioned
table and all its partitions is changed?  That would mean a burst in
WAL when switching to LOGGED, which was a concern when this feature
was first implemented even for a single table, so for potentially
hundreds of them, that would really hurt if a DBA is not careful.
- CREATE TABLE does not have a LOGGED keyword, hence it is not
possible through the parser to force a partition to have a permanent
persistence even if its partitioned table uses UNLOGGED.

Like tablespaces or even recently access methods, the heuristics can
be fun to define depending on the impact of the table rewrites.  The
attached has the code and regression tests to make the rewrites
cheaper, which is to make new partitions inherit the loggedness of the
parent while altering a parent's property leaves the partitions
untouched.  With  the lack of a LOGGED keyword to force a partition to
be permanent when its partitioned table is unlogged, that's a bit
funny but feel free to comment.

Thanks,
--
Michael
From f6462a5f4b9b35d5ace52d55759fefc350e371d3 Mon Sep 17 00:00:00 2001
From: Michael Paquier <mich...@paquier.xyz>
Date: Wed, 24 Apr 2024 16:15:29 +0900
Subject: [PATCH] Support LOGGED/UNLOGGED for partitioned tables

When using ALTET TABLE SET LOGGED/UNLOGGED, indexes and sequences that
are owned by the partitioned table changed need to have their
relpersistence also changed.  There are a few things that need to be
considered here about ONLY:
- If ONLY is used only on a partitioned table, switch the persistence be
switched only for the partitioned table?
- If ONLY is not used, should this operation recurse across all the
partitions?

CREATE TABLE comes with its own limitation because there is no way to
detect in the parser if a new partition should be forced as logged.  For
now, this patch does:
- If UNLOGGED is specified, the partition is unlogged.
- If PERMANENT is found, the partition inherits the loggedness from the
parent.
- There is no way to force a partition to be permanent at query level.
---
 src/backend/commands/tablecmds.c          | 145 ++++++++++++++++++----
 src/test/regress/expected/alter_table.out |  84 +++++++++++++
 src/test/regress/sql/alter_table.sql      |  35 ++++++
 doc/src/sgml/ref/alter_table.sgml         |   6 +
 doc/src/sgml/ref/create_table.sgml        |   5 +
 5 files changed, 252 insertions(+), 23 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 3556240c8e..90b3993fc4 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -601,7 +601,9 @@ static ObjectAddress ATExecClusterOn(Relation rel, const char *indexName,
 static void ATExecDropCluster(Relation rel, LOCKMODE lockmode);
 static void ATPrepSetAccessMethod(AlteredTableInfo *tab, Relation rel, const char *amname);
 static void ATExecSetAccessMethodNoStorage(Relation rel, Oid newAccessMethod);
-static bool ATPrepChangePersistence(Relation rel, bool toLogged);
+static void ATExecSetPersistenceNoStorage(Relation rel, char newrelpersistence);
+static void ATPrepSetPersistence(AlteredTableInfo *tab, Relation rel,
+								 bool toLogged);
 static void ATPrepSetTableSpace(AlteredTableInfo *tab, Relation rel,
 								const char *tablespacename, LOCKMODE lockmode);
 static void ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode);
@@ -985,6 +987,30 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 			accessMethodId = get_table_am_oid(default_table_access_method, false);
 	}
 
+	/* determine the persistence to use for a partition */
+	if (stmt->partbound)
+	{
+		/* If UNLOGGED has been specified by the query, just use it */
+		if (stmt->relation->relpersistence == RELPERSISTENCE_TEMP)
+		{
+			/* Nothing to do, should be temp */
+		}
+		else if (stmt->relation->relpersistence == RELPERSISTENCE_UNLOGGED)
+		{
+			/* UNLOGGED has been specified by the query, just use it */
+		}
+		else
+		{
+			/*
+			 * Case of RELPERSISTENCE_PERMANENT, where the query specified
+			 * nothing so inherit the persistency from the parent.
+			 */
+			Assert(list_length(inheritOids) == 1);
+			stmt->relation->relpersistence =
+				get_rel_persistence(linitial_oid(inheritOids));
+		}
+	}
+
 	/*
 	 * Create the relation.  Inherited defaults and constraints are passed in
 	 * for immediate handling --- since they don't need parsing, they can be
@@ -5036,13 +5062,7 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
 				ereport(ERROR,
 						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 						 errmsg("cannot change persistence setting twice")));
-			tab->chgPersistence = ATPrepChangePersistence(rel, true);
-			/* force rewrite if necessary; see comment in ATRewriteTables */
-			if (tab->chgPersistence)
-			{
-				tab->rewrite |= AT_REWRITE_ALTER_PERSISTENCE;
-				tab->newrelpersistence = RELPERSISTENCE_PERMANENT;
-			}
+			ATPrepSetPersistence(tab, rel, true);
 			pass = AT_PASS_MISC;
 			break;
 		case AT_SetUnLogged:	/* SET UNLOGGED */
@@ -5051,13 +5071,7 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
 				ereport(ERROR,
 						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 						 errmsg("cannot change persistence setting twice")));
-			tab->chgPersistence = ATPrepChangePersistence(rel, false);
-			/* force rewrite if necessary; see comment in ATRewriteTables */
-			if (tab->chgPersistence)
-			{
-				tab->rewrite |= AT_REWRITE_ALTER_PERSISTENCE;
-				tab->newrelpersistence = RELPERSISTENCE_UNLOGGED;
-			}
+			ATPrepSetPersistence(tab, rel, false);
 			pass = AT_PASS_MISC;
 			break;
 		case AT_DropOids:		/* SET WITHOUT OIDS */
@@ -5427,6 +5441,14 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab,
 			break;
 		case AT_SetLogged:		/* SET LOGGED */
 		case AT_SetUnLogged:	/* SET UNLOGGED */
+
+			/*
+			 * Only do this for partitioned tables, for which this is just a
+			 * catalog change.  Tables with storage are handled by Phase 3.
+			 */
+			if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE &&
+				tab->chgPersistence)
+				ATExecSetPersistenceNoStorage(rel, tab->newrelpersistence);
 			break;
 		case AT_DropOids:		/* SET WITHOUT OIDS */
 			/* nothing to do here, oid columns don't exist anymore */
@@ -15518,6 +15540,79 @@ ATExecSetAccessMethodNoStorage(Relation rel, Oid newAccessMethodId)
 	table_close(pg_class, RowExclusiveLock);
 }
 
+/*
+ * Special handling of ALTER TABLE SET LOGGED/UNLOGGED for relations with no
+ * storage that have an interest in changing their persistence.
+ *
+ * Since these have no storage, setting the persistence to permanent or
+ * unlogged is a catalog-only operation.  This needs to switch the
+ * persistence of all sequences and indexes related to this relation.
+ */
+static void
+ATExecSetPersistenceNoStorage(Relation rel, char newrelpersistence)
+{
+	Relation	pg_class;
+	HeapTuple	tuple;
+	List	   *reloids;	/* for indexes and sequences */
+	ListCell   *elt;
+	Form_pg_class rd_rel;
+	Oid			reloid = RelationGetRelid(rel);
+
+	/*
+	 * Shouldn't be called on relations having storage; these are processed in
+	 * phase 3.
+	 */
+	Assert(!RELKIND_HAS_STORAGE(rel->rd_rel->relkind));
+
+	/* Get a modifiable copy of the relation's pg_class row. */
+	pg_class = table_open(RelationRelationId, RowExclusiveLock);
+
+	tuple = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(reloid));
+	if (!HeapTupleIsValid(tuple))
+		elog(ERROR, "cache lookup failed for relation %u", reloid);
+	rd_rel = (Form_pg_class) GETSTRUCT(tuple);
+
+	/* Leave if no update required */
+	if (rd_rel->relpersistence == newrelpersistence)
+	{
+		heap_freetuple(tuple);
+		table_close(pg_class, RowExclusiveLock);
+		return;
+	}
+
+	/* Update the pg_class row. */
+	rd_rel->relpersistence = newrelpersistence;
+	CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
+
+	InvokeObjectPostAlterHook(RelationRelationId, RelationGetRelid(rel), 0);
+
+	heap_freetuple(tuple);
+
+	/* Update the per-sequence and per-index relpersistence */
+	reloids = getOwnedSequences(reloid);
+	reloids = list_union_oid(reloids, RelationGetIndexList(rel));
+	foreach(elt, reloids)
+	{
+		Oid			classoid = lfirst_oid(elt);
+
+		tuple = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(classoid));
+		if (!HeapTupleIsValid(tuple))
+			elog(ERROR, "cache lookup failed for relation %u", classoid);
+		rd_rel = (Form_pg_class) GETSTRUCT(tuple);
+
+		rd_rel->relpersistence = newrelpersistence;
+		CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
+		InvokeObjectPostAlterHook(RelationRelationId, classoid, 0);
+
+		heap_freetuple(tuple);
+	}
+
+	/* Make sure the persistence changes are visible */
+	CommandCounterIncrement();
+
+	table_close(pg_class, RowExclusiveLock);
+}
+
 /*
  * ALTER TABLE SET TABLESPACE
  */
@@ -17749,12 +17844,10 @@ ATExecSetCompression(Relation rel,
  * This verifies that we're not trying to change a temp table.  Also,
  * existing foreign key constraints are checked to avoid ending up with
  * permanent tables referencing unlogged tables.
- *
- * Return value is false if the operation is a no-op (in which case the
- * checks are skipped), otherwise true.
  */
-static bool
-ATPrepChangePersistence(Relation rel, bool toLogged)
+static void
+ATPrepSetPersistence(AlteredTableInfo *tab, Relation rel,
+					 bool toLogged)
 {
 	Relation	pg_constraint;
 	HeapTuple	tuple;
@@ -17778,12 +17871,12 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
 		case RELPERSISTENCE_PERMANENT:
 			if (toLogged)
 				/* nothing to do */
-				return false;
+				return;
 			break;
 		case RELPERSISTENCE_UNLOGGED:
 			if (!toLogged)
 				/* nothing to do */
-				return false;
+				return;
 			break;
 	}
 
@@ -17866,7 +17959,13 @@ ATPrepChangePersistence(Relation rel, bool toLogged)
 
 	table_close(pg_constraint, AccessShareLock);
 
-	return true;
+	/* force rewrite if necessary; see comment in ATRewriteTables */
+	tab->rewrite |= AT_REWRITE_ALTER_PERSISTENCE;
+	if (toLogged)
+		tab->newrelpersistence = RELPERSISTENCE_PERMANENT;
+	else
+		tab->newrelpersistence = RELPERSISTENCE_UNLOGGED;
+	tab->chgPersistence = true;
 }
 
 /*
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 7666c76238..5f3caf6baf 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -3597,6 +3597,90 @@ ALTER TABLE logged1 SET UNLOGGED; -- silently do nothing
 DROP TABLE logged3;
 DROP TABLE logged2;
 DROP TABLE logged1;
+-- SET LOGGED/UNLOGGED with partitioned tables
+CREATE TABLE logged_part_1(f1 SERIAL PRIMARY KEY)
+  PARTITION BY LIST (f1); -- has sequence, index
+CREATE TABLE logged_part_2(f1 SERIAL PRIMARY KEY, f2 INTEGER REFERENCES logged_part_1)
+  PARTITION BY LIST (f1); -- foreign key
+CREATE TABLE logged_part_3(f1 SERIAL PRIMARY KEY, f2 INTEGER REFERENCES logged_part_3)
+  PARTITION BY LIST (f1); -- self-referencing foreign key
+-- Check relpersistence of all the objects created.
+SELECT relname, relpersistence FROM pg_class WHERE relname ~ '^logged_part'
+  ORDER BY relname;
+       relname        | relpersistence 
+----------------------+----------------
+ logged_part_1        | p
+ logged_part_1_f1_seq | p
+ logged_part_1_pkey   | p
+ logged_part_2        | p
+ logged_part_2_f1_seq | p
+ logged_part_2_pkey   | p
+ logged_part_3        | p
+ logged_part_3_f1_seq | p
+ logged_part_3_pkey   | p
+(9 rows)
+
+ALTER TABLE logged_part_1 SET UNLOGGED; -- fails as a foreign-key exists
+ERROR:  could not change table "logged_part_1" to unlogged because it references logged table "logged_part_2"
+ALTER TABLE logged_part_2 SET UNLOGGED;
+ALTER TABLE logged_part_3 SET UNLOGGED; -- skip self-referencing foreign key
+-- Re-check relpersistence of all the objects created.
+SELECT relname, relpersistence FROM pg_class WHERE relname ~ '^logged_part'
+  ORDER BY relname;
+       relname        | relpersistence 
+----------------------+----------------
+ logged_part_1        | p
+ logged_part_1_f1_seq | p
+ logged_part_1_pkey   | p
+ logged_part_2        | u
+ logged_part_2_f1_seq | u
+ logged_part_2_pkey   | u
+ logged_part_3        | u
+ logged_part_3_f1_seq | u
+ logged_part_3_pkey   | u
+(9 rows)
+
+ALTER TABLE logged_part_1 SET LOGGED; -- no-op
+ALTER TABLE logged_part_2 SET LOGGED;
+ALTER TABLE logged_part_3 SET LOGGED; -- skip self-referencing foreign key
+SELECT relname, relpersistence FROM pg_class WHERE relname ~ '^logged_part_[2|3]'
+  ORDER BY relname;
+       relname        | relpersistence 
+----------------------+----------------
+ logged_part_2        | p
+ logged_part_2_f1_seq | p
+ logged_part_2_pkey   | p
+ logged_part_3        | p
+ logged_part_3_f1_seq | p
+ logged_part_3_pkey   | p
+(6 rows)
+
+-- Partitions
+CREATE TABLE logged_part_2_1 PARTITION OF logged_part_2
+  FOR VALUES IN (1); -- permanent, inherited
+CREATE UNLOGGED TABLE logged_part_2_2 PARTITION OF logged_part_2
+  FOR VALUES IN (2); -- unlogged, not inherited
+ALTER TABLE logged_part_2 SET UNLOGGED;
+CREATE TABLE logged_part_2_3 PARTITION OF logged_part_2
+  FOR VALUES IN (3); -- unlogged, inherited
+SELECT relname, relpersistence FROM pg_class WHERE relname ~ '^logged_part_2'
+  ORDER BY relname;
+       relname        | relpersistence 
+----------------------+----------------
+ logged_part_2        | u
+ logged_part_2_1      | p
+ logged_part_2_1_pkey | p
+ logged_part_2_2      | u
+ logged_part_2_2_pkey | u
+ logged_part_2_3      | u
+ logged_part_2_3_pkey | u
+ logged_part_2_f1_seq | u
+ logged_part_2_pkey   | u
+(9 rows)
+
+DROP TABLE logged_part_3;
+DROP TABLE logged_part_2;
+DROP TABLE logged_part_1;
 -- test ADD COLUMN IF NOT EXISTS
 CREATE TABLE test_add_column(c1 integer);
 \d test_add_column
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index 9df5a63bdf..903e3aa217 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -2259,6 +2259,41 @@ DROP TABLE logged3;
 DROP TABLE logged2;
 DROP TABLE logged1;
 
+-- SET LOGGED/UNLOGGED with partitioned tables
+CREATE TABLE logged_part_1(f1 SERIAL PRIMARY KEY)
+  PARTITION BY LIST (f1); -- has sequence, index
+CREATE TABLE logged_part_2(f1 SERIAL PRIMARY KEY, f2 INTEGER REFERENCES logged_part_1)
+  PARTITION BY LIST (f1); -- foreign key
+CREATE TABLE logged_part_3(f1 SERIAL PRIMARY KEY, f2 INTEGER REFERENCES logged_part_3)
+  PARTITION BY LIST (f1); -- self-referencing foreign key
+-- Check relpersistence of all the objects created.
+SELECT relname, relpersistence FROM pg_class WHERE relname ~ '^logged_part'
+  ORDER BY relname;
+ALTER TABLE logged_part_1 SET UNLOGGED; -- fails as a foreign-key exists
+ALTER TABLE logged_part_2 SET UNLOGGED;
+ALTER TABLE logged_part_3 SET UNLOGGED; -- skip self-referencing foreign key
+-- Re-check relpersistence of all the objects created.
+SELECT relname, relpersistence FROM pg_class WHERE relname ~ '^logged_part'
+  ORDER BY relname;
+ALTER TABLE logged_part_1 SET LOGGED; -- no-op
+ALTER TABLE logged_part_2 SET LOGGED;
+ALTER TABLE logged_part_3 SET LOGGED; -- skip self-referencing foreign key
+SELECT relname, relpersistence FROM pg_class WHERE relname ~ '^logged_part_[2|3]'
+  ORDER BY relname;
+-- Partitions
+CREATE TABLE logged_part_2_1 PARTITION OF logged_part_2
+  FOR VALUES IN (1); -- permanent, inherited
+CREATE UNLOGGED TABLE logged_part_2_2 PARTITION OF logged_part_2
+  FOR VALUES IN (2); -- unlogged, not inherited
+ALTER TABLE logged_part_2 SET UNLOGGED;
+CREATE TABLE logged_part_2_3 PARTITION OF logged_part_2
+  FOR VALUES IN (3); -- unlogged, inherited
+SELECT relname, relpersistence FROM pg_class WHERE relname ~ '^logged_part_2'
+  ORDER BY relname;
+DROP TABLE logged_part_3;
+DROP TABLE logged_part_2;
+DROP TABLE logged_part_1;
+
 -- test ADD COLUMN IF NOT EXISTS
 CREATE TABLE test_add_column(c1 integer);
 \d test_add_column
diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index fe36ff82e5..b85d8dcd15 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -803,6 +803,12 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
       (for identity or serial columns).  However, it is also possible to
       change the persistence of such sequences separately.
      </para>
+
+     <para>
+      Setting this property for a partitioned table has no direct effect,
+      because such tables have no storage of their own, but the configured
+      value will be inherited by newly-created partitions.
+     </para>
     </listitem>
    </varlistentry>
 
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 02f31d2d6f..d1ff9d9e95 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -221,6 +221,11 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
       If this is specified, any sequences created together with the unlogged
       table (for identity or serial columns) are also created as unlogged.
      </para>
+
+     <para>
+      When applied to a partitioned table, newly-created partitions and
+      their objects (sequences and indexes) will inherit this property.
+     </para>
     </listitem>
    </varlistentry>
 
-- 
2.43.0

Attachment: signature.asc
Description: PGP signature

Reply via email to