On 2018/02/28 9:46, Alvaro Herrera wrote: > I updated Amit Langote's patch for INSERT ON CONFLICT DO UPDATE[1]. > Following the lead of edd44738bc88 ("Be lazier about partition tuple > routing.") this incarnation only does the necessary push-ups for the > specific partition that needs it, at execution time. As far as I can > tell, it works as intended.
Thanks. I too have been meaning to send an updated (nay, significantly rewritten) version of the patch, but you beat me to it. > I chose to refuse the case where the DO UPDATE clause causes the tuple > to move to another partition (i.e. you're updating the partition key of > the tuple). While it's probably possible to implement that, it doesn't > seem a very productive use of time. Probably a good idea to save it for later. > However, there is a shortcoming in the design: it fails if there are > multiple levels of partitioning, because there is no easy (efficient) > way to map the index OID more than one level. I had already mentioned > this shortcoming to Amit's patch. So this case (which I added in the > regression tests) fails unexpectedly: > > -- multiple-layered partitioning > create table parted_conflict_test (a int primary key, b text) partition by > range (a); > create table parted_conflict_test_1 partition of parted_conflict_test > for values from (0) to (10000) partition by range (a); > create table parted_conflict_test_1_1 partition of parted_conflict_test_1 > for values from (0) to (100); > insert into parted_conflict_test values ('10', 'ten'); > insert into parted_conflict_test values ('10', 'ten two') > on conflict (a) do update set b = excluded.b; > ERROR: invalid ON CONFLICT DO UPDATE specification > DETAIL: An inferred index was not found in partition > "parted_conflict_test_1_1". > > So the problem here is that my MapPartitionIndexList() implementation is > too stupid. I think it would be smarter to use the ResultRelInfo > instead of bare Relation, for one. But that still doesn't answer how to > find a "path" from root to leaf partition, which is what I'd need to > verify that there are valid pg_inherits relationships for the partition > indexes. I'm probably missing something about the partitionDesc or > maybe the partitioned_rels lists that helps me do it efficiently, but I > hope figure out soon. > > One idea I was toying with is to add RelationData->rd_children as a list > of OIDs of children relations. So it should be possible to walk the > list from the root to the desired descendant, without having to scan > pg_inherits repeatedly ... although that would probably require doing > relation_open() for every index, which sounds undesirable. > > (ISTM that having RelationData->rd_children might be a good idea in > general anyway -- I mean to speed up some code that currently scans > pg_inherits via find_inheritance_children. However, since the partition > descriptor itself is already in relcache, maybe this doesn't matter too > much.) > > Another idea is to abandon the notion that we need to find a path from > parent index to descendant index, and just run the inference algorithm > again on the partition. I'm not sure how much I like this idea, yet. > > Anyway, while this is still WIP, I think it works correctly for the case > where there is a single partition level. Actually, after your comment on my original patch [1], I did make it work for multiple levels by teaching the partition initialization code to find a given partition's indexes that are inherited from the root table (that is the table mentioned in command). So, after a tuple is routed to a partition, we switch from the original arbiterIndexes list to the one we created for the partition, which must contain OIDs corresponding to those in the original list. After all, for each of the parent's indexes that the planner put into the original arbiterIndexes list, there must exist an index in each of the leaf partitions. I had also observed when working on the patch that various TupleTableSlots used by the ON CONFLICT DO UPDATE code must be based on TupleDesc of the inheritance-translated target list (DO UPDATE SET target list). In fact, that has to take into account also the dropped columns; we may have dropped columns either in parent or in a partition or in both at same or different attnum positions. That means, simple map_partition_varattnos() translation doesn't help in this case. For example, with your patch (sorry, I know you said it's a WIP patch), I see either a crash or errors when dealing with such differing attribute numbers: drop table p; create table p (a int, b text) partition by list (a); create table p12 (b text, a int); alter table p attach partition p12 for values in (1, 2); alter table p drop b, add b text; create table p4 partition of p for values in (4); create unique index on p (a); insert into p values (1, 'a') on conflict (a) do update set b = excluded.b; insert into p values (1, 'b') on conflict (a) do update set b = excluded.b; server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. insert into p values (4, 'a') on conflict (a) do update set b = excluded.b; postgres=# insert into p values (4, 'b') on conflict (a) do update set b = excluded.b; ERROR: attribute number 3 exceeds number of columns 2 I attach my patch here for your reference, which I polished this morning after seeing your email and the patch. It works for most of the cases, as you can see in the updated tests in insert_conflict.sql. Since I agree with you that we should, for now, error out if DO UPDATE causes row movement, I adopted the code from your patch for that. Thanks, Amit [1] https://www.postgresql.org/message-id/20171227225031.osh6vunpuhsath25%40alvherre.pgsql
>From fab3ed1f695667b64cc037e0faa7dc7909c58915 Mon Sep 17 00:00:00 2001 From: amit <amitlangot...@gmail.com> Date: Wed, 28 Feb 2018 17:58:00 +0900 Subject: [PATCH v1] Fix ON CONFLICT to work with partitioned tables --- doc/src/sgml/ddl.sgml | 15 --- src/backend/catalog/heap.c | 2 +- src/backend/catalog/partition.c | 36 ++++-- src/backend/commands/tablecmds.c | 15 ++- src/backend/executor/execPartition.c | 170 +++++++++++++++++++++++++- src/backend/executor/nodeModifyTable.c | 30 +++++ src/backend/optimizer/prep/preptlist.c | 4 +- src/backend/optimizer/prep/prepunion.c | 31 +++-- src/backend/parser/analyze.c | 7 -- src/include/catalog/partition.h | 2 +- src/include/executor/execPartition.h | 17 +++ src/include/optimizer/prep.h | 11 ++ src/test/regress/expected/insert_conflict.out | 73 +++++++++-- src/test/regress/sql/insert_conflict.sql | 64 ++++++++-- 14 files changed, 405 insertions(+), 72 deletions(-) diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index 2b879ead4b..b2b3485b83 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -3325,21 +3325,6 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02 <listitem> <para> - Using the <literal>ON CONFLICT</literal> clause with partitioned tables - will cause an error if the conflict target is specified (see - <xref linkend="sql-on-conflict" /> for more details on how the clause - works). Therefore, it is not possible to specify - <literal>DO UPDATE</literal> as the alternative action, because - specifying the conflict target is mandatory in that case. On the other - hand, specifying <literal>DO NOTHING</literal> as the alternative action - works fine provided the conflict target is not specified. In that case, - unique constraints (or exclusion constraints) of the individual leaf - partitions are considered. - </para> - </listitem> - - <listitem> - <para> When an <command>UPDATE</command> causes a row to move from one partition to another, there is a chance that another concurrent <command>UPDATE</command> or <command>DELETE</command> misses this row. diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index cf36ce4add..6d49c41217 100644 --- a/src/backend/catalog/heap.c +++ b/src/backend/catalog/heap.c @@ -1777,7 +1777,7 @@ heap_drop_with_catalog(Oid relid) elog(ERROR, "cache lookup failed for relation %u", relid); if (((Form_pg_class) GETSTRUCT(tuple))->relispartition) { - parentOid = get_partition_parent(relid); + parentOid = get_partition_parent(relid, false); LockRelationOid(parentOid, AccessExclusiveLock); /* diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c index f8c9a11493..68e4f171ec 100644 --- a/src/backend/catalog/partition.c +++ b/src/backend/catalog/partition.c @@ -192,6 +192,7 @@ static int get_partition_bound_num_indexes(PartitionBoundInfo b); static int get_greatest_modulus(PartitionBoundInfo b); static uint64 compute_hash_value(int partnatts, FmgrInfo *partsupfunc, Datum *values, bool *isnull); +static Oid get_partition_parent_recurse(Oid relid, bool getroot); /* * RelationBuildPartitionDesc @@ -1392,14 +1393,25 @@ check_default_allows_bound(Relation parent, Relation default_rel, * when it is known that the relation is a partition. */ Oid -get_partition_parent(Oid relid) +get_partition_parent(Oid relid, bool getroot) +{ + Oid parentOid = get_partition_parent_recurse(relid, getroot); + + if (parentOid == InvalidOid) + elog(ERROR, "could not find parent of relation %u", relid); + + return parentOid; +} + +static Oid +get_partition_parent_recurse(Oid relid, bool getroot) { Form_pg_inherits form; Relation catalogRelation; SysScanDesc scan; ScanKeyData key[2]; HeapTuple tuple; - Oid result; + Oid result = InvalidOid; catalogRelation = heap_open(InheritsRelationId, AccessShareLock); @@ -1416,15 +1428,25 @@ get_partition_parent(Oid relid) NULL, 2, key); tuple = systable_getnext(scan); - if (!HeapTupleIsValid(tuple)) - elog(ERROR, "could not find tuple for parent of relation %u", relid); + if (HeapTupleIsValid(tuple)) + { + form = (Form_pg_inherits) GETSTRUCT(tuple); + result = form->inhparent; - form = (Form_pg_inherits) GETSTRUCT(tuple); - result = form->inhparent; + if (getroot) + result = get_partition_parent_recurse(result, getroot); + } systable_endscan(scan); heap_close(catalogRelation, AccessShareLock); + /* + * If we recursed and got InvalidOid as parent, that means we reached the + * root of this partition tree in the form of 'relid' itself. + */ + if (getroot && !OidIsValid(result)) + return relid; + return result; } @@ -2508,7 +2530,7 @@ generate_partition_qual(Relation rel) return copyObject(rel->rd_partcheck); /* Grab at least an AccessShareLock on the parent table */ - parent = heap_open(get_partition_parent(RelationGetRelid(rel)), + parent = heap_open(get_partition_parent(RelationGetRelid(rel), false), AccessShareLock); /* Get pg_class.relpartbound */ diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 74e020bffc..6e103f26ca 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -1292,7 +1292,7 @@ RangeVarCallbackForDropRelation(const RangeVar *rel, Oid relOid, Oid oldRelOid, */ if (is_partition && relOid != oldRelOid) { - state->partParentOid = get_partition_parent(relOid); + state->partParentOid = get_partition_parent(relOid, false); if (OidIsValid(state->partParentOid)) LockRelationOid(state->partParentOid, AccessExclusiveLock); } @@ -5843,7 +5843,8 @@ ATExecDropNotNull(Relation rel, const char *colName, LOCKMODE lockmode) /* If rel is partition, shouldn't drop NOT NULL if parent has the same */ if (rel->rd_rel->relispartition) { - Oid parentId = get_partition_parent(RelationGetRelid(rel)); + Oid parentId = get_partition_parent(RelationGetRelid(rel), + false); Relation parent = heap_open(parentId, AccessShareLock); TupleDesc tupDesc = RelationGetDescr(parent); AttrNumber parent_attnum; @@ -14346,7 +14347,7 @@ ATExecDetachPartition(Relation rel, RangeVar *name) if (!has_superclass(idxid)) continue; - Assert((IndexGetRelation(get_partition_parent(idxid), false) == + Assert((IndexGetRelation(get_partition_parent(idxid, false), false) == RelationGetRelid(rel))); idx = index_open(idxid, AccessExclusiveLock); @@ -14475,7 +14476,7 @@ ATExecAttachPartitionIdx(List **wqueue, Relation parentIdx, RangeVar *name) /* Silently do nothing if already in the right state */ currParent = !has_superclass(partIdxId) ? InvalidOid : - get_partition_parent(partIdxId); + get_partition_parent(partIdxId, false); if (currParent != RelationGetRelid(parentIdx)) { IndexInfo *childInfo; @@ -14708,8 +14709,10 @@ validatePartitionedIndex(Relation partedIdx, Relation partedTbl) /* make sure we see the validation we just did */ CommandCounterIncrement(); - parentIdxId = get_partition_parent(RelationGetRelid(partedIdx)); - parentTblId = get_partition_parent(RelationGetRelid(partedTbl)); + parentIdxId = get_partition_parent(RelationGetRelid(partedIdx), + false); + parentTblId = get_partition_parent(RelationGetRelid(partedTbl), + false); parentIdx = relation_open(parentIdxId, AccessExclusiveLock); parentTbl = relation_open(parentTblId, AccessExclusiveLock); Assert(!parentIdx->rd_index->indisvalid); diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c index 54efc9e545..3f7b61dc37 100644 --- a/src/backend/executor/execPartition.c +++ b/src/backend/executor/execPartition.c @@ -19,6 +19,7 @@ #include "executor/executor.h" #include "mb/pg_wchar.h" #include "miscadmin.h" +#include "optimizer/prep.h" #include "utils/lsyscache.h" #include "utils/rls.h" #include "utils/ruleutils.h" @@ -109,6 +110,23 @@ ExecSetupPartitionTupleRouting(ModifyTableState *mtstate, Relation rel) */ proute->partition_tuple_slot = MakeTupleTableSlot(NULL); + /* + * We might need these arrays for conflict checking and handling the + * DO UPDATE action + */ + if (mtstate && mtstate->mt_onconflict != ONCONFLICT_NONE) + { + proute->partition_arbiter_indexes = (List **) + palloc(proute->num_partitions * + sizeof(List *)); + proute->partition_conflproj_slots = (TupleTableSlot **) + palloc(proute->num_partitions * + sizeof(TupleTableSlot *)); + proute->partition_existing_slots = (TupleTableSlot **) + palloc(proute->num_partitions * + sizeof(TupleTableSlot *)); + } + i = 0; foreach(cell, leaf_parts) { @@ -475,9 +493,6 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, &mtstate->ps, RelationGetDescr(partrel)); } - Assert(proute->partitions[partidx] == NULL); - proute->partitions[partidx] = leaf_part_rri; - /* * Save a tuple conversion map to convert a tuple routed to this partition * from the parent's type to the partition's. @@ -487,6 +502,155 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, RelationGetDescr(partrel), gettext_noop("could not convert row type")); + /* + * Initialize information about this partition that's needed to handle + * the ON CONFLICT clause. + */ + if (node && node->onConflictAction != ONCONFLICT_NONE) + { + TupleDesc partrelDesc = RelationGetDescr(partrel); + TupleConversionMap *map = proute->parent_child_tupconv_maps[partidx]; + TupleTableSlot *part_conflproj_slot, + *part_existing_slot; + int firstVarno = mtstate->resultRelInfo[0].ri_RangeTableIndex; + Relation firstResultRel = mtstate->resultRelInfo[0].ri_RelationDesc; + ExprContext *econtext = mtstate->ps.ps_ExprContext; + ListCell *lc; + List *my_arbiterindexes = NIL; + + /* + * If the root parent and partition have the same tuple + * descriptor, just reuse the original projection and WHERE + * clause expressions for partition. + */ + if (map == NULL) + { + /* Use the existing slot. */ + part_existing_slot = mtstate->mt_existing; + part_conflproj_slot = mtstate->mt_conflproj; + leaf_part_rri->ri_onConflictSetProj = + resultRelInfo->ri_onConflictSetProj; + leaf_part_rri->ri_onConflictSetWhere = + resultRelInfo->ri_onConflictSetWhere; + } + else + { + /* Convert expressions contain partition's attnos. */ + List *conv_setproj; + AppendRelInfo appinfo; + TupleDesc tupDesc; + + /* Need our own slot. */ + part_existing_slot = + ExecInitExtraTupleSlot(mtstate->ps.state, partrelDesc); + + /* First convert references to EXCLUDED pseudo-relation. */ + conv_setproj = map_partition_varattnos((List *) + node->onConflictSet, + INNER_VAR, + partrel, + firstResultRel, NULL); + /* Then convert references to main target relation. */ + conv_setproj = map_partition_varattnos((List *) + conv_setproj, + firstVarno, + partrel, + firstResultRel, NULL); + + /* + * Need to fix the target entries' resnos too by using + * inheritance translation. + */ + appinfo.type = T_AppendRelInfo; + appinfo.parent_relid = firstVarno; + appinfo.parent_reltype = firstResultRel->rd_rel->reltype; + appinfo.child_relid = partrel->rd_id; + appinfo.child_reltype = partrel->rd_rel->reltype; + appinfo.parent_reloid = firstResultRel->rd_id; + make_inh_translation_list(firstResultRel, partrel, + 1, /* dummy */ + &appinfo.translated_vars); + conv_setproj = adjust_inherited_tlist((List *) conv_setproj, + &appinfo); + + /* + * Add any attributes that are missing in the source list, such + * as, dropped columns in the partition. + */ + conv_setproj = expand_targetlist(conv_setproj, CMD_UPDATE, + firstVarno, partrel); + + tupDesc = ExecTypeFromTL(conv_setproj, partrelDesc->tdhasoid); + part_conflproj_slot = ExecInitExtraTupleSlot(mtstate->ps.state, + tupDesc); + leaf_part_rri->ri_onConflictSetProj = + ExecBuildProjectionInfo(conv_setproj, econtext, + part_conflproj_slot, + &mtstate->ps, partrelDesc); + + /* For WHERE quals, map_partition_varattnos() suffices. */ + if (node->onConflictWhere) + { + List *conv_where; + ExprState *qualexpr; + + /* First convert references to EXCLUDED pseudo-relation. */ + conv_where = map_partition_varattnos((List *) + node->onConflictWhere, + INNER_VAR, + partrel, + firstResultRel, NULL); + /* Then convert references to main target relation. */ + conv_where = map_partition_varattnos((List *) + conv_where, + firstVarno, + partrel, firstResultRel, + NULL); + qualexpr = ExecInitQual(conv_where, &mtstate->ps); + leaf_part_rri->ri_onConflictSetWhere = qualexpr; + } + } + + /* + * Save away for use later. Set mtstate->mt_existing and + * mtstate->mt_conflproj, respectively, to these values before + * calling ExecOnConflictUpdate(). + */ + proute->partition_existing_slots[partidx] = part_existing_slot; + proute->partition_conflproj_slots[partidx] = part_conflproj_slot; + + /* Initialize arbiter indexes list, if any. */ + foreach(lc, mtstate->mt_arbiterindexes) + { + Oid parentArbiterIndexOid = lfirst_oid(lc); + int i; + + /* + * Find parentArbiterIndexOid's child in this partition and add it + * to my_arbiterindexes. + */ + for (i = 0; i < leaf_part_rri->ri_NumIndices; i++) + { + Relation index = leaf_part_rri->ri_IndexRelationDescs[i]; + Oid indexOid = RelationGetRelid(index); + + if (parentArbiterIndexOid == + get_partition_parent(indexOid, true)) + my_arbiterindexes = lappend_oid(my_arbiterindexes, + indexOid); + } + } + + /* + * Use this list instead of the original one containing parent's + * indexes. + */ + proute->partition_arbiter_indexes[partidx] = my_arbiterindexes; + } + + Assert(proute->partitions[partidx] == NULL); + proute->partitions[partidx] = leaf_part_rri; + MemoryContextSwitchTo(oldContext); return leaf_part_rri; diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c index c32928d9bd..0f9ca6586e 100644 --- a/src/backend/executor/nodeModifyTable.c +++ b/src/backend/executor/nodeModifyTable.c @@ -374,6 +374,24 @@ ExecInsert(ModifyTableState *mtstate, tuple, proute->partition_tuple_slot, &slot); + + /* Switch to partition's ON CONFLICT information. */ + if (arbiterIndexes) + { + Assert(onconflict != ONCONFLICT_NONE); + arbiterIndexes = proute->partition_arbiter_indexes[leaf_part_index]; + + /* Use correct existing and projection slots for DO UPDATE */ + if (onconflict == ONCONFLICT_UPDATE) + { + Assert(proute->partition_existing_slots[leaf_part_index]); + mtstate->mt_existing = + proute->partition_existing_slots[leaf_part_index]; + Assert(proute->partition_conflproj_slots[leaf_part_index]); + mtstate->mt_conflproj = + proute->partition_conflproj_slots[leaf_part_index]; + } + } } resultRelationDesc = resultRelInfo->ri_RelationDesc; @@ -1146,6 +1164,18 @@ lreplace:; TupleConversionMap *tupconv_map; /* + * Disallow an INSERT ON CONFLICT DO UPDATE that causes the + * original row to migrate to a different partition. Maybe this + * can be implemented some day, but it seems a fringe feature with + * little redeeming value. + */ + if (((ModifyTable *) mtstate->ps.plan)->onConflictAction == ONCONFLICT_UPDATE) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("invalid ON UPDATE specification"), + errdetail("The result tuple would appear in a different partition than the original tuple."))); + + /* * When an UPDATE is run on a leaf partition, we will not have * partition tuple routing set up. In that case, fail with * partition constraint violation error. diff --git a/src/backend/optimizer/prep/preptlist.c b/src/backend/optimizer/prep/preptlist.c index 8603feef2b..f5ba93db4a 100644 --- a/src/backend/optimizer/prep/preptlist.c +++ b/src/backend/optimizer/prep/preptlist.c @@ -53,8 +53,6 @@ #include "utils/rel.h" -static List *expand_targetlist(List *tlist, int command_type, - Index result_relation, Relation rel); /* @@ -251,7 +249,7 @@ preprocess_targetlist(PlannerInfo *root) * add targetlist entries for any missing attributes, and ensure the * non-junk attributes appear in proper field order. */ -static List * +List * expand_targetlist(List *tlist, int command_type, Index result_relation, Relation rel) { diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index b586f941a8..4bd72026f0 100644 --- a/src/backend/optimizer/prep/prepunion.c +++ b/src/backend/optimizer/prep/prepunion.c @@ -113,18 +113,12 @@ static void expand_single_inheritance_child(PlannerInfo *root, PlanRowMark *top_parentrc, Relation childrel, List **appinfos, RangeTblEntry **childrte_p, Index *childRTindex_p); -static void make_inh_translation_list(Relation oldrelation, - Relation newrelation, - Index newvarno, - List **translated_vars); static Bitmapset *translate_col_privs(const Bitmapset *parent_privs, List *translated_vars); static Node *adjust_appendrel_attrs_mutator(Node *node, adjust_appendrel_attrs_context *context); static Relids adjust_child_relids(Relids relids, int nappinfos, AppendRelInfo **appinfos); -static List *adjust_inherited_tlist(List *tlist, - AppendRelInfo *context); /* @@ -1793,7 +1787,7 @@ expand_single_inheritance_child(PlannerInfo *root, RangeTblEntry *parentrte, * * For paranoia's sake, we match type/collation as well as attribute name. */ -static void +void make_inh_translation_list(Relation oldrelation, Relation newrelation, Index newvarno, List **translated_vars) @@ -2353,7 +2347,7 @@ adjust_child_relids_multilevel(PlannerInfo *root, Relids relids, * * Note that this is not needed for INSERT because INSERT isn't inheritable. */ -static List * +List * adjust_inherited_tlist(List *tlist, AppendRelInfo *context) { bool changed_it = false; @@ -2374,6 +2368,13 @@ adjust_inherited_tlist(List *tlist, AppendRelInfo *context) if (tle->resjunk) continue; /* ignore junk items */ + /* + * ignore dummy tlist entry added by exapnd_targetlist() for + * dropped columns in the parent table. + */ + if (IsA(tle->expr, Const) && ((Const *) tle->expr)->constisnull) + continue; + /* Look up the translation of this column: it must be a Var */ if (tle->resno <= 0 || tle->resno > list_length(context->translated_vars)) @@ -2412,6 +2413,13 @@ adjust_inherited_tlist(List *tlist, AppendRelInfo *context) if (tle->resjunk) continue; /* ignore junk items */ + /* + * ignore dummy tlist entry added by exapnd_targetlist() for + * dropped columns in the parent table. + */ + if (IsA(tle->expr, Const) && ((Const *) tle->expr)->constisnull) + continue; + if (tle->resno == attrno) new_tlist = lappend(new_tlist, tle); else if (tle->resno > attrno) @@ -2426,6 +2434,13 @@ adjust_inherited_tlist(List *tlist, AppendRelInfo *context) if (!tle->resjunk) continue; /* here, ignore non-junk items */ + /* + * ignore dummy tlist entry added by exapnd_targetlist() for + * dropped columns in the parent table. + */ + if (IsA(tle->expr, Const) && ((Const *) tle->expr)->constisnull) + continue; + tle->resno = attrno; new_tlist = lappend(new_tlist, tle); attrno++; diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c index c3a9617f67..92696f0607 100644 --- a/src/backend/parser/analyze.c +++ b/src/backend/parser/analyze.c @@ -1025,13 +1025,6 @@ transformOnConflictClause(ParseState *pstate, TargetEntry *te; int attno; - if (targetrel->rd_partdesc) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("%s cannot be applied to partitioned table \"%s\"", - "ON CONFLICT DO UPDATE", - RelationGetRelationName(targetrel)))); - /* * All INSERT expressions have been parsed, get ready for potentially * existing SET statements that need to be processed like an UPDATE. diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h index 2faf0ca26e..70ddb225a1 100644 --- a/src/include/catalog/partition.h +++ b/src/include/catalog/partition.h @@ -51,7 +51,7 @@ extern PartitionBoundInfo partition_bounds_copy(PartitionBoundInfo src, extern void check_new_partition_bound(char *relname, Relation parent, PartitionBoundSpec *spec); -extern Oid get_partition_parent(Oid relid); +extern Oid get_partition_parent(Oid relid, bool getroot); extern List *get_qual_from_partbound(Relation rel, Relation parent, PartitionBoundSpec *spec); extern List *map_partition_varattnos(List *expr, int fromrel_varno, diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h index 03a599ad57..9fc1ab6711 100644 --- a/src/include/executor/execPartition.h +++ b/src/include/executor/execPartition.h @@ -90,6 +90,20 @@ typedef struct PartitionDispatchData *PartitionDispatch; * given leaf partition's rowtype after that * partition is chosen for insertion by * tuple-routing. + * partition_arbiter_indexes Array of Lists with each slot containing the + * list of OIDs of a given partition's indexes + * that are to be used as arbiter indexes for + * ON CONFLICT checking + * partition_conflproj_slots Array of TupleTableSlots to hold tuples of + * ON CONFLICT DO UPDATE SET projections; + * contains NULL for partitions whose tuple + * descriptor exactly matches the root parent's + * (including dropped columns) + * partition_existing_slots Array of TupleTableSlots to hold existing + * tuple during ON CONFLICT DO UPDATE handling; + * contains NULL for partitions whose tuple + * descriptor exactly matches the root parent's + * (including dropped columns) *----------------------- */ typedef struct PartitionTupleRouting @@ -106,6 +120,9 @@ typedef struct PartitionTupleRouting int num_subplan_partition_offsets; TupleTableSlot *partition_tuple_slot; TupleTableSlot *root_tuple_slot; + List **partition_arbiter_indexes; + TupleTableSlot **partition_conflproj_slots; + TupleTableSlot **partition_existing_slots; } PartitionTupleRouting; extern PartitionTupleRouting *ExecSetupPartitionTupleRouting(ModifyTableState *mtstate, diff --git a/src/include/optimizer/prep.h b/src/include/optimizer/prep.h index 89b7ef337c..d380b419d7 100644 --- a/src/include/optimizer/prep.h +++ b/src/include/optimizer/prep.h @@ -42,6 +42,10 @@ extern List *preprocess_targetlist(PlannerInfo *root); extern PlanRowMark *get_plan_rowmark(List *rowmarks, Index rtindex); +typedef struct RelationData *Relation; +extern List *expand_targetlist(List *tlist, int command_type, + Index result_relation, Relation rel); + /* * prototypes for prepunion.c */ @@ -65,4 +69,11 @@ extern SpecialJoinInfo *build_child_join_sjinfo(PlannerInfo *root, extern Relids adjust_child_relids_multilevel(PlannerInfo *root, Relids relids, Relids child_relids, Relids top_parent_relids); +extern void make_inh_translation_list(Relation oldrelation, + Relation newrelation, + Index newvarno, + List **translated_vars); +extern List *adjust_inherited_tlist(List *tlist, + AppendRelInfo *context); + #endif /* PREP_H */ diff --git a/src/test/regress/expected/insert_conflict.out b/src/test/regress/expected/insert_conflict.out index 2650faedee..a9677f06e6 100644 --- a/src/test/regress/expected/insert_conflict.out +++ b/src/test/regress/expected/insert_conflict.out @@ -786,16 +786,67 @@ select * from selfconflict; (3 rows) drop table selfconflict; --- check that the following works: --- insert into partitioned_table on conflict do nothing -create table parted_conflict_test (a int, b char) partition by list (a); -create table parted_conflict_test_1 partition of parted_conflict_test (b unique) for values in (1); +-- check ON CONFLICT handling with partitioned tables +create table parted_conflict_test (a int unique, b char) partition by list (a); +create table parted_conflict_test_1 partition of parted_conflict_test (b unique) for values in (1, 2); +-- no indexes required here insert into parted_conflict_test values (1, 'a') on conflict do nothing; -insert into parted_conflict_test values (1, 'a') on conflict do nothing; --- however, on conflict do update is not supported yet -insert into parted_conflict_test values (1) on conflict (b) do update set a = excluded.a; -ERROR: ON CONFLICT DO UPDATE cannot be applied to partitioned table "parted_conflict_test" --- but it works OK if we target the partition directly -insert into parted_conflict_test_1 values (1) on conflict (b) do -update set a = excluded.a; +-- index on a required, which does exist in parent +insert into parted_conflict_test values (1, 'a') on conflict (a) do nothing; +insert into parted_conflict_test values (1, 'a') on conflict (a) do update set b = excluded.b; +-- targeting partition directly will work +insert into parted_conflict_test_1 values (1, 'a') on conflict (a) do nothing; +insert into parted_conflict_test_1 values (1, 'b') on conflict (a) do update set b = excluded.b; +-- index on b required, which doesn't exist in parent +insert into parted_conflict_test values (2, 'b') on conflict (b) do update set a = excluded.a; +ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification +-- targeting partition directly will work +insert into parted_conflict_test_1 values (2, 'b') on conflict (b) do update set a = excluded.a; +-- should see (2, 'b') +select * from parted_conflict_test order by a; + a | b +---+--- + 2 | b +(1 row) + +-- now check that DO UPDATE works correctly for target partition with +-- different attribute numbers +create table parted_conflict_test_2 (b char, a int unique); +alter table parted_conflict_test attach partition parted_conflict_test_2 for values in (3); +truncate parted_conflict_test; +insert into parted_conflict_test values (3, 'a') on conflict (a) do update set b = excluded.b; +insert into parted_conflict_test values (3, 'b') on conflict (a) do update set b = excluded.b; +-- should see (3, 'b') +select * from parted_conflict_test order by a; + a | b +---+--- + 3 | b +(1 row) + +-- case where parent will have a dropped column, but the partition won't +alter table parted_conflict_test drop b, add b char; +create table parted_conflict_test_3 partition of parted_conflict_test for values in (4); +truncate parted_conflict_test; +insert into parted_conflict_test (a, b) values (4, 'a') on conflict (a) do update set b = excluded.b; +insert into parted_conflict_test (a, b) values (4, 'b') on conflict (a) do update set b = excluded.b where parted_conflict_test.b = 'a'; +-- should see (4, 'b') +select * from parted_conflict_test order by a; + a | b +---+--- + 4 | b +(1 row) + +-- case with multi-level partitioning +create table parted_conflict_test_4 partition of parted_conflict_test for values in (5) partition by list (a); +create table parted_conflict_test_4_1 partition of parted_conflict_test_4 for values in (5); +truncate parted_conflict_test; +insert into parted_conflict_test (a, b) values (5, 'a') on conflict (a) do update set b = excluded.b; +insert into parted_conflict_test (a, b) values (5, 'b') on conflict (a) do update set b = excluded.b where parted_conflict_test.b = 'a'; +-- should see (5, 'b') +select * from parted_conflict_test order by a; + a | b +---+--- + 5 | b +(1 row) + drop table parted_conflict_test; diff --git a/src/test/regress/sql/insert_conflict.sql b/src/test/regress/sql/insert_conflict.sql index 32c647e3f8..73122479a3 100644 --- a/src/test/regress/sql/insert_conflict.sql +++ b/src/test/regress/sql/insert_conflict.sql @@ -472,15 +472,59 @@ select * from selfconflict; drop table selfconflict; --- check that the following works: --- insert into partitioned_table on conflict do nothing -create table parted_conflict_test (a int, b char) partition by list (a); -create table parted_conflict_test_1 partition of parted_conflict_test (b unique) for values in (1); +-- check ON CONFLICT handling with partitioned tables +create table parted_conflict_test (a int unique, b char) partition by list (a); +create table parted_conflict_test_1 partition of parted_conflict_test (b unique) for values in (1, 2); + +-- no indexes required here insert into parted_conflict_test values (1, 'a') on conflict do nothing; -insert into parted_conflict_test values (1, 'a') on conflict do nothing; --- however, on conflict do update is not supported yet -insert into parted_conflict_test values (1) on conflict (b) do update set a = excluded.a; --- but it works OK if we target the partition directly -insert into parted_conflict_test_1 values (1) on conflict (b) do -update set a = excluded.a; + +-- index on a required, which does exist in parent +insert into parted_conflict_test values (1, 'a') on conflict (a) do nothing; +insert into parted_conflict_test values (1, 'a') on conflict (a) do update set b = excluded.b; + +-- targeting partition directly will work +insert into parted_conflict_test_1 values (1, 'a') on conflict (a) do nothing; +insert into parted_conflict_test_1 values (1, 'b') on conflict (a) do update set b = excluded.b; + +-- index on b required, which doesn't exist in parent +insert into parted_conflict_test values (2, 'b') on conflict (b) do update set a = excluded.a; + +-- targeting partition directly will work +insert into parted_conflict_test_1 values (2, 'b') on conflict (b) do update set a = excluded.a; + +-- should see (2, 'b') +select * from parted_conflict_test order by a; + +-- now check that DO UPDATE works correctly for target partition with +-- different attribute numbers +create table parted_conflict_test_2 (b char, a int unique); +alter table parted_conflict_test attach partition parted_conflict_test_2 for values in (3); +truncate parted_conflict_test; +insert into parted_conflict_test values (3, 'a') on conflict (a) do update set b = excluded.b; +insert into parted_conflict_test values (3, 'b') on conflict (a) do update set b = excluded.b; + +-- should see (3, 'b') +select * from parted_conflict_test order by a; + +-- case where parent will have a dropped column, but the partition won't +alter table parted_conflict_test drop b, add b char; +create table parted_conflict_test_3 partition of parted_conflict_test for values in (4); +truncate parted_conflict_test; +insert into parted_conflict_test (a, b) values (4, 'a') on conflict (a) do update set b = excluded.b; +insert into parted_conflict_test (a, b) values (4, 'b') on conflict (a) do update set b = excluded.b where parted_conflict_test.b = 'a'; + +-- should see (4, 'b') +select * from parted_conflict_test order by a; + +-- case with multi-level partitioning +create table parted_conflict_test_4 partition of parted_conflict_test for values in (5) partition by list (a); +create table parted_conflict_test_4_1 partition of parted_conflict_test_4 for values in (5); +truncate parted_conflict_test; +insert into parted_conflict_test (a, b) values (5, 'a') on conflict (a) do update set b = excluded.b; +insert into parted_conflict_test (a, b) values (5, 'b') on conflict (a) do update set b = excluded.b where parted_conflict_test.b = 'a'; + +-- should see (5, 'b') +select * from parted_conflict_test order by a; + drop table parted_conflict_test; -- 2.11.0