On 2017/04/11 0:26, Robert Haas wrote:
> On Sun, Apr 9, 2017 at 10:10 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> While I admit that I've not been paying close attention to the whole
>> table partitioning business, I wonder whether we have any clearly written
>> down specification about (a) how much partition member tables are allowed
>> to deviate schema-wise from their parent, and (b) how DDL semantics on
>> partitioned tables differ from DDL semantics for traditional inheritance.
>> Obviously those are closely related questions.  But the fact that this
>> bug exists at all shows that there's been some lack of clarity on (b),
>> and so I wonder whether we have any clarity on (a) either.
> 
> Children can have constraints (including NOT NULL constraints) which
> parents lack, and can have a different column order, but must have
> exactly the same column names and types.

Also, what is different in the partitioned parent case is that NOT NULL
constraints must be inherited.  That is, one cannot add it only to the parent.

--
-- traditional inheritance
--
create table parent (a int);
create table child () inherits (parent);
alter table only parent add constraint chka check (a > 0);
-- ERROR:  constraint must be added to child tables too

-- the following is OK, because of the explicit NO INHERIT request
alter table only parent add constraint chka check (a > 0) no inherit;

-- the following is also OK, because NOT NULL constraints are not
-- inherited with regular inheritance
alter table only parent alter a set not null;

--
-- partitioning inheritance
--
create table parted_parent (a int) partition by list (a);
create table part partition of parted_parent for values in (1);

-- this is same as traditional inheritance
alter table only parted_parent add constraint chka check (a > 0);
-- ERROR:  constraint must be added to child tables too

-- requesting NO INHERIT doesn't make sense on what is an empty relation
-- by itself
alter table only parted_parent add constraint chka check (a > 0) no inherit;
-- ERROR:  cannot add NO INHERIT constraint to partitioned table
"parted_table"

-- NOT NULL constraint must be inherited too
alter table only parted_parent alter a set not null;
-- ERROR:  constraint must be added to child tables too

-- both ok (no inheritance here)
alter table part alter a set not null;
alter table part alter a drop not null;

-- request whole-tree constraint
alter table parted_parent alter a set not null;
-- can't drop it from a partition
alter table part alter a drop not null;
-- ERROR:  column "a" is marked NOT NULL in parent table

> In Amit's example from the original post, the child has an implicit
> NOT NULL constraint that does not exist in the parent.  p1.b isn't
> declared NOT NULL, but the fact that it is range-partitioned on b
> requires it to be so, just as we would do if b were declared as the
> PRIMARY KEY.  Somehow that's not playing nice with pg_dump, but I'm
> still fuzzy on the details.

Actually, I would like to change the problem definition from "ALTER TABLE
ONLY partitioned_table should be avoided" to "Emitting partition's
attributes separately should be avoided".

There is a shouldPrintColumn() which returns true if a table's attribute
should be emitted in the CREATE TABLE command at all.  For example, it
won't be necessary if the attribute is purely inherited
(attislocal=false).  But once an attribute is determined to not be
emitted, any attribute-level constraints and defaults need to be marked to
be emitted separately (using ALTER TABLE ONLY), which can be undesirable
because of the rules about using ONLY with declarative partitioning
inheritance.

I think we can change shouldPrintColumn() so that it always returns true
if the table is a partition, that is, a partition's attributes should
always be emitted with CREATE TABLE, if necessary (it isn't required if
there isn't a NOT NULL constraint or DEFAULT defined on the attributes).
When dumping a partition using the PARTITION OF syntax, an attribute will
emitted if it has a NOT NULL constraint or a default.

So if we have:

create table p (a int, b int) partition by list (a);
create table p1 partition of p for values in (1) partition by range (b);
create table p11 partition of p1 (
    a not null default '1'
) for values from (1) to (10);

Proposed would make pg_dump -s output the following (some lines removed
for clarity):

CREATE TABLE p (
    a integer,
    b integer
)
PARTITION BY LIST (a);

CREATE TABLE p1 PARTITION OF p (
    b NOT NULL
)
FOR VALUES IN (1)
PARTITION BY RANGE (b);

CREATE TABLE p11 PARTITION OF p1 (
    a DEFAULT 1 NOT NULL,
    b NOT NULL
)
FOR VALUES FROM (1) TO (10);

Which also happens to a legal input to the backend, as far as partitioning
is concerned.  With the existing approach, NOT NULL on p1's b attribute is
emitted as ALTER TABLE ONLY p1 ALTER b SET NOT NULL, which would cause
error, because p1 is partitioned.

Attached patch implements that and also updates some comments.  I think
some updates to pg_dump's regression tests for partitioned tables will be
required, but my attempt to understand how to go about updating the
expected output failed.  I will try again tomorrow.

Thanks,
Amit
>From 372c3069b0f891eb60b63a4e407d32bb3f5a4df2 Mon Sep 17 00:00:00 2001
From: amit <amitlangot...@gmail.com>
Date: Mon, 3 Apr 2017 11:53:27 +0900
Subject: [PATCH] Fix how pg_dump emits partition attributes

Currently, they will never be emitted inline, because a partition's
attributes are *always* inherited (attislocal is never true in the
case of partitions) and inherited attributes are not emitted.  This
means NOT NULL constraints and DEFAULT, if any, are needed to be
emitted separately using ALTER TABLE ONLY ..., which is undesirable
if the partition is itself partitioned (the ONLY part).  Instead
always emit them, so that NOT NULL and DEFAULT are included with
CREATE TABLE.

This patch includes one more fix: we should not emit WITH OPTIONS
keyword phrase for partitions' columns.

Also update some comments.
---
 src/bin/pg_dump/pg_dump.c | 49 +++++++++++++++++++++++++++++++++++++----------
 src/bin/pg_dump/pg_dump.h |  1 +
 2 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 65a2f2307a..512b45aa70 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -5457,6 +5457,7 @@ getTables(Archive *fout, int *numTables)
 	int			i_relpages;
 	int			i_is_identity_sequence;
 	int			i_changed_acl;
+	int			i_relispartition;
 
 	/* Make sure we are in proper schema */
 	selectSourceSchema(fout, "pg_catalog");
@@ -5533,7 +5534,22 @@ getTables(Archive *fout, int *numTables)
 						  "CASE WHEN 'check_option=local' = ANY (c.reloptions) THEN 'LOCAL'::text "
 						  "WHEN 'check_option=cascaded' = ANY (c.reloptions) THEN 'CASCADED'::text ELSE NULL END AS checkoption, "
 						  "tc.reloptions AS toast_reloptions, "
-						  "c.relkind = '%c' AND EXISTS (SELECT 1 FROM pg_depend WHERE classid = 'pg_class'::regclass AND objid = c.oid AND objsubid = 0 AND refclassid = 'pg_class'::regclass AND deptype = 'i') AS is_identity_sequence, "
+						  "c.relkind = '%c' AND EXISTS (SELECT 1 FROM pg_depend WHERE classid = 'pg_class'::regclass AND objid = c.oid AND objsubid = 0 AND refclassid = 'pg_class'::regclass AND deptype = 'i') AS is_identity_sequence, ",
+						  acl_subquery->data,
+						  racl_subquery->data,
+						  initacl_subquery->data,
+						  initracl_subquery->data,
+						  username_subquery,
+						  RELKIND_SEQUENCE);
+
+		/*
+		 * Tables in >= PG 10 servers can be partitions, which need special
+		 * handling when dumping attributes and attribute-level constraints.
+		 */
+		if (fout->remoteVersion >= 100000)
+			appendPQExpBuffer(query, "c.relispartition, ");
+
+		appendPQExpBuffer(query,
 						  "EXISTS (SELECT 1 FROM pg_attribute at LEFT JOIN pg_init_privs pip ON "
 						  "(c.oid = pip.objoid "
 						  "AND pip.classoid = 'pg_class'::regclass "
@@ -5558,12 +5574,6 @@ getTables(Archive *fout, int *numTables)
 						  "AND pip.objsubid = 0) "
 				   "WHERE c.relkind in ('%c', '%c', '%c', '%c', '%c', '%c', '%c') "
 						  "ORDER BY c.oid",
-						  acl_subquery->data,
-						  racl_subquery->data,
-						  initacl_subquery->data,
-						  initracl_subquery->data,
-						  username_subquery,
-						  RELKIND_SEQUENCE,
 						  attacl_subquery->data,
 						  attracl_subquery->data,
 						  attinitacl_subquery->data,
@@ -5988,6 +5998,7 @@ getTables(Archive *fout, int *numTables)
 	i_reloftype = PQfnumber(res, "reloftype");
 	i_is_identity_sequence = PQfnumber(res, "is_identity_sequence");
 	i_changed_acl = PQfnumber(res, "changed_acl");
+	i_relispartition = PQfnumber(res, "relispartition");
 
 	if (dopt->lockWaitTimeout)
 	{
@@ -6057,6 +6068,10 @@ getTables(Archive *fout, int *numTables)
 		else
 			tblinfo[i].checkoption = pg_strdup(PQgetvalue(res, i, i_checkoption));
 		tblinfo[i].toast_reloptions = pg_strdup(PQgetvalue(res, i, i_toastreloptions));
+		if (i_relispartition >= 0)
+			tblinfo[i].relispartition = (strcmp(PQgetvalue(res, i, i_relispartition), "t") == 0);
+		else
+			tblinfo[i].relispartition = false;
 
 		/* other fields were zeroed above */
 
@@ -8207,7 +8222,8 @@ shouldPrintColumn(DumpOptions *dopt, TableInfo *tbinfo, int colno)
 {
 	if (dopt->binary_upgrade)
 		return true;
-	return (tbinfo->attislocal[colno] && !tbinfo->attisdropped[colno]);
+	return ((tbinfo->attislocal[colno] || tbinfo->relispartition) &&
+			!tbinfo->attisdropped[colno]);
 }
 
 
@@ -15141,6 +15157,11 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
 		if (tbinfo->reloftype && !dopt->binary_upgrade)
 			appendPQExpBuffer(q, " OF %s", tbinfo->reloftype);
 
+		/*
+		 * Create as a partition, if partitionOf is set; except in the case
+		 * of a binary upgrade, we dump the table normally and attach it to
+		 * the parent afterward.
+		 */
 		if (tbinfo->partitionOf && !dopt->binary_upgrade)
 		{
 			TableInfo  *parentRel = tbinfo->partitionOf;
@@ -15212,11 +15233,17 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
 						continue;
 					}
 
-					/* Attribute type */
+					/*
+					 * Attribute type.  If this is not a binary upgrade dump,
+					 * we need not dump the type for typed tables and
+					 * partitions.  Because in that case, we are merely
+					 * dumping column's options, not defining a new column.
+					 */
 					if ((tbinfo->reloftype || tbinfo->partitionOf) &&
 						!dopt->binary_upgrade)
 					{
-						appendPQExpBufferStr(q, " WITH OPTIONS");
+						if (tbinfo->reloftype)
+							appendPQExpBufferStr(q, " WITH OPTIONS");
 					}
 					else
 					{
@@ -15364,6 +15391,8 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo)
 		 * using an INHERITS clause --- the latter would possibly mess up the
 		 * column order.  That also means we have to take care about setting
 		 * attislocal correctly, plus fix up any inherited CHECK constraints.
+		 * The same applies to partitions, so we set up partitions using
+		 * ALTER TABLE / ATTACH PARTITION instead of PARTITION OF.
 		 * Analogously, we set up typed tables using ALTER TABLE / OF here.
 		 */
 		if (dopt->binary_upgrade &&
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 61097e6d99..e786499006 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -294,6 +294,7 @@ typedef struct _tableInfo
 	bool		interesting;	/* true if need to collect more data */
 	bool		dummy_view;		/* view's real definition must be postponed */
 	bool		postponed_def;	/* matview must be postponed into post-data */
+	bool		relispartition;	/* is table a partition? */
 
 	/*
 	 * These fields are computed only if we decide the table is interesting
-- 
2.11.0

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to