On Thu, Jul 10, 2025 at 2:00 AM jian he <jian.universal...@gmail.com> wrote: > > we can add a new boolean field, coerce_to_domain, to NewColumnValue. this > field > is set to true only when changing an existing column's type to a constrained > domain. In such cases, a table scan is enough—no table rewrite is needed. > coerce_to_domain will set to false, if table rewrite is required.
I realized that "coerce_to_domain" is not so good in this context. maybe there are other scenarios, we added a NewColumnValue and we also only need table scan. so I changed it to scan_only. /* * .... * If scan_only is true, it means only a table scan is required. * Currently, this is supported only by the ALTER COLUMN SET DATA TYPE command, * where the column's data type is being changed to a constrained domain. */ typedef struct NewColumnValue { AttrNumber attnum; /* which column */ Expr *expr; /* expression to compute */ ExprState *exprstate; /* execution state */ bool is_generated; /* is it a GENERATED expression? */ bool scan_only; /* table scan only */ } NewColumnValue;
From 6ed4375e742af59ab49f64fd8b56b6c4b80687df Mon Sep 17 00:00:00 2001 From: jian he <jian.universal...@gmail.com> Date: Tue, 26 Aug 2025 11:24:18 +0800 Subject: [PATCH v2 1/1] skip table rewrite when change column type to constrained domain per https://www.postgresql.org/docs/devel/sql-altertable.html says changing the type of an existing column to a constrained domain will trigger a table rewrite. However, after reviewing the relevant context [1] and experimenting with it, I found that it's doable to just a table scan. the main gotcha is struct NewColumnValue. we did "palloc0(sizeof(NewColumnValue));" in ATExecAddColumn, ATExecSetExpression and ATPrepAlterColumnType. ATExecAddColumn: Adding a new column with domain with constraints will cause table rewrite. ATExecSetExpression: for stored generated column will cause table rewrite, we do not support domain over virtual generated columns now. ATPrepAlterColumnType: we only do table rewriting occasionally.see ATColumnChangeRequiresRewrite. If a table rewrite is required, there's nothing we can do about it. We can add a new boolean field scan_only to NewColumnValue. This field is currently set to true only when changing an existing column's type to a constrained domain. In such cases, a table scan alone is sufficient. [1]: https://www.postgresql.org/message-id/20190226061450.ga1665...@rfd.leadboat.com discussion: https://postgr.es/m/cacjufxfx0dupbf5+dbnf3mxcfntz1y7jpt11+tcd_fcyadh...@mail.gmail.com commitfest: https://commitfest.postgresql.org/patch/5907 --- doc/src/sgml/ref/alter_table.sgml | 6 +- src/backend/commands/tablecmds.c | 68 +++++++++++++++++++--- src/test/regress/expected/fast_default.out | 38 ++++++++++++ src/test/regress/sql/fast_default.sql | 31 ++++++++++ 4 files changed, 133 insertions(+), 10 deletions(-) diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml index 8867da6c693..9ea0fc21856 100644 --- a/doc/src/sgml/ref/alter_table.sgml +++ b/doc/src/sgml/ref/alter_table.sgml @@ -1459,9 +1459,9 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM and its indexes to be rewritten. As an exception, when changing the type of an existing column, if the <literal>USING</literal> clause does not change the column - contents and the old type is either binary coercible to the new type - or an unconstrained domain over the new type, a table rewrite is not - needed. However, indexes will still be rebuilt unless the system + contents and the old type is either binary coercible to the new type, + a table rewrite is not needed. + However, indexes will still be rebuilt unless the system can verify that the new index would be logically equivalent to the existing one. For example, if the collation for a column has been changed, an index rebuild is required because the new sort diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index c6dd2e020da..82e72b2873d 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -230,6 +230,10 @@ typedef struct NewConstraint * are just copied from the old table during ATRewriteTable. Note that the * expr is an expression over *old* table values, except when is_generated * is true; then it is an expression over columns of the *new* tuple. + * + * If scan_only is true, it means only a table scan is required. + * Currently, this is supported only by the ALTER COLUMN SET DATA TYPE command, + * where the column's data type is being changed to a constrained domain. */ typedef struct NewColumnValue { @@ -237,6 +241,7 @@ typedef struct NewColumnValue Expr *expr; /* expression to compute */ ExprState *exprstate; /* execution state */ bool is_generated; /* is it a GENERATED expression? */ + bool scan_only; /* table scan only */ } NewColumnValue; /* @@ -6008,7 +6013,7 @@ ATRewriteTables(AlterTableStmt *parsetree, List **wqueue, LOCKMODE lockmode, * rebuild data. */ if (tab->constraints != NIL || tab->verify_new_notnull || - tab->partition_constraint != NULL) + tab->partition_constraint != NULL || tab->newvals) ATRewriteTable(tab, InvalidOid); /* @@ -6118,7 +6123,9 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap) Relation newrel; TupleDesc oldTupDesc; TupleDesc newTupDesc; + TupleDesc old_tmp; bool needscan = false; + bool scan_only = false; List *notnull_attrs; List *notnull_virtual_attrs; int i; @@ -6136,7 +6143,7 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap) oldrel = table_open(tab->relid, NoLock); oldTupDesc = tab->oldDesc; newTupDesc = RelationGetDescr(oldrel); /* includes all mods */ - + old_tmp = CreateTupleDescCopy(oldTupDesc); if (OidIsValid(OIDNewHeap)) { Assert(CheckRelationOidLockedByMe(OIDNewHeap, AccessExclusiveLock, @@ -6203,6 +6210,11 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap) /* expr already planned */ ex->exprstate = ExecInitExpr((Expr *) ex->expr, NULL); + if (ex->scan_only && !tab->rewrite && !scan_only) + { + needscan = true; + scan_only = true; + } } notnull_attrs = notnull_virtual_attrs = NIL; @@ -6431,6 +6443,42 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap) * new constraints etc. */ insertslot = oldslot; + + /* + * The tupdesc (newTupDesc) in oldslot already have the updated + * attribute changes. If we use it for below ExecEvalExpr, then + * CheckVarSlotCompatibility will fail. Therefore, we need to + * temporarily set oldslot's tts_tupleDescriptor to oldTupDesc. + * Essentially, what we're doing here is evaluating the + * CoerceToDomain node against the existing table slot. + */ + if (scan_only) + { + Datum values pg_attribute_unused(); + bool isnull pg_attribute_unused(); + insertslot->tts_tupleDescriptor = old_tmp; + econtext->ecxt_scantuple = insertslot; + + foreach(l, tab->newvals) + { + NewColumnValue *ex = lfirst(l); + + if (!ex->scan_only) + continue; + + /* + * we can not use ExecEvalExprNoReturn here, because we + * use ExecInitExpr compile NewColumnValue->expr. Here, + * we only check whether the oldslot value satisfies the + * domain constraint. So it is ok to override the value + * evaluated by ExecEvalExpr. + */ + values = ExecEvalExpr(ex->exprstate, econtext, &isnull); + values = (Datum) 0; + isnull = true; + } + insertslot->tts_tupleDescriptor = newTupDesc; + } } /* Now check any constraints on the possibly-changed tuple */ @@ -14498,10 +14546,20 @@ ATPrepAlterColumnType(List **wqueue, newval->attnum = attnum; newval->expr = (Expr *) transform; newval->is_generated = false; + /* If new type is constrained domain, set scan_only to true */ + newval->scan_only = IsA(transform, CoerceToDomain) && + DomainHasConstraints(targettype); tab->newvals = lappend(tab->newvals, newval); if (ATColumnChangeRequiresRewrite(transform, attnum)) tab->rewrite |= AT_REWRITE_COLUMN_REWRITE; + + /* + * If table rewrite will occur nevertheless, attempting to perform only + * table scan to evaluate CoerceToDomain won't work. + */ + if (tab->rewrite) + newval->scan_only = false; } else if (transform) ereport(ERROR, @@ -14632,12 +14690,10 @@ ATPrepAlterColumnType(List **wqueue, * rewrite in these cases: * * - the old type is binary coercible to the new type - * - the new type is an unconstrained domain over the old type * - {NEW,OLD} or {OLD,NEW} is {timestamptz,timestamp} and the timezone is UTC * * In the case of a constrained domain, we could get by with scanning the - * table and checking the constraint rather than actually rewriting it, but we - * don't currently try to do that. + * table and checking the constraint rather than actually rewriting it. */ static bool ATColumnChangeRequiresRewrite(Node *expr, AttrNumber varattno) @@ -14655,8 +14711,6 @@ ATColumnChangeRequiresRewrite(Node *expr, AttrNumber varattno) { CoerceToDomain *d = (CoerceToDomain *) expr; - if (DomainHasConstraints(d->resulttype)) - return true; expr = (Node *) d->arg; } else if (IsA(expr, FuncExpr)) diff --git a/src/test/regress/expected/fast_default.out b/src/test/regress/expected/fast_default.out index ccbcdf8403f..6718b6e745e 100644 --- a/src/test/regress/expected/fast_default.out +++ b/src/test/regress/expected/fast_default.out @@ -323,6 +323,44 @@ DROP DOMAIN domain2; DROP DOMAIN domain3; DROP DOMAIN domain4; DROP FUNCTION foo(INT); +-- Test domains with default value for table rewrite. +CREATE DOMAIN domain1 AS INT CHECK(VALUE > 1) NOT NULL; +CREATE DOMAIN domain2 AS INT CHECK(VALUE > random(min=>10, max=>10)) NOT NULL; +CREATE DOMAIN domain3 AS INT8 CHECK(VALUE < 1); +CREATE TABLE t22(a INT, CONSTRAINT cc CHECK(a > 1), b INT, CONSTRAINT cc1 CHECK(b > 1)); +INSERT INTO t22 VALUES(2, 13); +-- no table rewrite, but fail at evaluating domain constraint +ALTER TABLE t22 ALTER COLUMN a SET DATA TYPE domain1 USING a::domain2::domain1; +ERROR: value for domain domain2 violates check constraint "domain2_check" +-- no table rewrite +ALTER TABLE t22 ALTER COLUMN b SET DATA TYPE domain2 USING b::domain1::domain2; +-- table rewrite, but fail at evaluating domain constraint +ALTER TABLE t22 ALTER COLUMN a SET DATA TYPE domain3; +NOTICE: rewriting table t22 for reason 4 +ERROR: value for domain domain3 violates check constraint "domain3_check" +ALTER TABLE t22 ALTER COLUMN b SET DATA TYPE domain3; +NOTICE: rewriting table t22 for reason 4 +ERROR: value for domain domain3 violates check constraint "domain3_check" +-- table rewrite +ALTER TABLE t22 ALTER COLUMN a SET DATA TYPE domain1 USING (a+12)::INT8::domain2::domain1; +NOTICE: rewriting table t22 for reason 4 +ALTER TABLE t22 ALTER COLUMN b SET DATA TYPE domain1 USING (b+12)::INT8::domain2::domain1; +NOTICE: rewriting table t22 for reason 4 +-- table rewrite +ALTER TABLE t22 + ALTER COLUMN a SET DATA TYPE domain2 USING (a+9)::domain2, + ALTER COLUMN b SET DATA TYPE domain2 USING (b+9)::domain2; +NOTICE: rewriting table t22 for reason 4 +SELECT * FROM t22; + a | b +----+---- + 23 | 34 +(1 row) + +DROP TABLE t22; +DROP DOMAIN domain1; +DROP DOMAIN domain2; +DROP DOMAIN domain3; -- Fall back to full rewrite for volatile expressions CREATE TABLE T(pk INT NOT NULL PRIMARY KEY); INSERT INTO T VALUES (1); diff --git a/src/test/regress/sql/fast_default.sql b/src/test/regress/sql/fast_default.sql index 068dd0bc8aa..3073ef6a744 100644 --- a/src/test/regress/sql/fast_default.sql +++ b/src/test/regress/sql/fast_default.sql @@ -294,6 +294,37 @@ DROP DOMAIN domain3; DROP DOMAIN domain4; DROP FUNCTION foo(INT); +-- Test domains with default value for table rewrite. +CREATE DOMAIN domain1 AS INT CHECK(VALUE > 1) NOT NULL; +CREATE DOMAIN domain2 AS INT CHECK(VALUE > random(min=>10, max=>10)) NOT NULL; +CREATE DOMAIN domain3 AS INT8 CHECK(VALUE < 1); +CREATE TABLE t22(a INT, CONSTRAINT cc CHECK(a > 1), b INT, CONSTRAINT cc1 CHECK(b > 1)); +INSERT INTO t22 VALUES(2, 13); + +-- no table rewrite, but fail at evaluating domain constraint +ALTER TABLE t22 ALTER COLUMN a SET DATA TYPE domain1 USING a::domain2::domain1; +-- no table rewrite +ALTER TABLE t22 ALTER COLUMN b SET DATA TYPE domain2 USING b::domain1::domain2; + +-- table rewrite, but fail at evaluating domain constraint +ALTER TABLE t22 ALTER COLUMN a SET DATA TYPE domain3; +ALTER TABLE t22 ALTER COLUMN b SET DATA TYPE domain3; + +-- table rewrite +ALTER TABLE t22 ALTER COLUMN a SET DATA TYPE domain1 USING (a+12)::INT8::domain2::domain1; +ALTER TABLE t22 ALTER COLUMN b SET DATA TYPE domain1 USING (b+12)::INT8::domain2::domain1; + +-- table rewrite +ALTER TABLE t22 + ALTER COLUMN a SET DATA TYPE domain2 USING (a+9)::domain2, + ALTER COLUMN b SET DATA TYPE domain2 USING (b+9)::domain2; + +SELECT * FROM t22; +DROP TABLE t22; +DROP DOMAIN domain1; +DROP DOMAIN domain2; +DROP DOMAIN domain3; + -- Fall back to full rewrite for volatile expressions CREATE TABLE T(pk INT NOT NULL PRIMARY KEY); -- 2.34.1