Hi hackers,

If you drop a column from a partitioned table, its TupleDesc still
matches those of the existing partitions, but partitions created
afterwards have TupleDescs that don't match (according to
convert_tuples_by_name) because they lack the dropped column.  That
means that inserts into partitions created later need to go via the
deform->remap->form code path in tupconvert.c.  If you're using a
time-based partitioning scheme where you add a new partition for each
month and mostly insert into the current month, as is very common,
then after dropping a column you'll eventually end up sending ALL
your inserts through tupconvert.c for the rest of time.

For example, having hacked my tree to print out a message to tell me
if it had to convert a tuple:

postgres=# create table parent (a int, b int) partition by list (b);
CREATE TABLE
postgres=# create table child1 partition of parent for values in (1);
CREATE TABLE
postgres=# alter table parent drop column a;
ALTER TABLE
postgres=# create table child2 partition of parent for values in (2);
CREATE TABLE
postgres=# insert into parent values (1);
NOTICE:  no map
INSERT 0 1
postgres=# insert into parent values (2);
NOTICE:  map!
INSERT 0 1

Of course there are other usage patterns where you might prefer it
this way, because you'll mostly be inserting into partitions created
before the change.  In general, would it be better for the partitioned
table's TupleDesc to match partitions created before or after a
change?  Since partitioned tables have no storage themselves, is there
any technical reason we couldn't remove a partitioned table's dropped
pg_attribute rows so that its TupleDesc matches partitions created later?
Is there some way that tupconvert.c could make this type of difference
moot?
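
To make the trade-off concrete, here is a toy Python model of the
"do we need a map?" decision.  It is only a sketch of my reading of
the check, not the actual tupconvert.c code: Attr, build_map and
needs_conversion are made-up names, a descriptor is just a list of
(name, dropped) entries, and details such as attlen/attalign and OID
columns are ignored.

```python
from typing import List, NamedTuple, Optional


class Attr(NamedTuple):
    """One descriptor slot; list position stands in for attnum (hypothetical model)."""
    name: str
    dropped: bool = False


def build_map(indesc: List[Attr], outdesc: List[Attr]) -> List[Optional[int]]:
    """For each output slot, the position of the same-named live input column.

    Dropped output slots map to None.
    """
    live_in = {a.name: i for i, a in enumerate(indesc) if not a.dropped}
    return [None if a.dropped else live_in[a.name] for a in outdesc]


def needs_conversion(indesc: List[Attr], outdesc: List[Attr]) -> bool:
    """True if tuples must be deformed, remapped and re-formed (simplified)."""
    if len(indesc) != len(outdesc):
        return True
    for i, src in enumerate(build_map(indesc, outdesc)):
        if src == i:
            continue  # same column in the same position
        if src is None and indesc[i].dropped:
            continue  # dropped on both sides, nothing to move
        return True
    return False


# Descriptors as in the session above: "a" was dropped from parent while
# child1 existed, and child2 was created afterwards without it.
parent = [Attr("a", dropped=True), Attr("b")]
child1 = [Attr("a", dropped=True), Attr("b")]
child2 = [Attr("b")]

# Hypothetical: the parent with its dropped entry removed altogether.
parent_pruned = [Attr("b")]

for label, indesc in (("parent", parent), ("parent_pruned", parent_pruned)):
    for name, outdesc in (("child1", child1), ("child2", child2)):
        verdict = "map!" if needs_conversion(indesc, outdesc) else "no map"
        print(f"{label} -> {name}: {verdict}")
```

In this model the current parent reports "no map" for child1 and
"map!" for child2, as in the session above, and the pruned parent
gives the opposite answers, so removing the dropped entry would just
move the conversion cost onto the older partitions.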

-- 
Thomas Munro
http://www.enterprisedb.com

