On 2017/09/08 4:04, Robert Haas wrote: > On Tue, Sep 5, 2017 at 7:01 AM, Ashutosh Bapat > <ashutosh.ba...@enterprisedb.com> wrote: >> accumulate_append_subpath() is executed for every path instead of >> every relation, so changing it would collect the same list multiple >> times. Instead, I found the old way of associating all intermediate >> partitions with the root partitioned relation work better. Here's the >> updated patch set. > > When I tried out patch 0001, it crashed repeatedly during 'make check' > because of an assertion failure in get_partitioned_child_rels. It > seemed to me that the way the patch was refactoring > expand_inherited_rtentry involved more code rearrangement than > necessary, so I reverted all the code rearrangement and just kept the > functional changes, and all the crashes went away. (That refactoring > also failed to initialize has_child properly.)
When I tried the attached patch, it doesn't seem to expand partitioning inheritance in step-wise manner as the patch's title says. I think the rewritten patch forgot to include Ashutosh's changes to expand_single_inheritance_child() whereby the AppendRelInfo of the child will be marked with the direct parent instead of always the root parent. I updated the patch to include just those changes. I'm not sure about one of the Ashutosh's changes whereby the child PlanRowMark is also passed to expand_partitioned_rtentry() to use as the parent PlanRowMark. I think the child RTE, child RT index and child Relation are fine, because they are necessary for creating AppendRelInfos in a desired way for later planning steps. But PlanRowMarks are not processed within the planner afterwards and do not need to be marked with the immediate parent-child association in the same way that AppendRelInfos need to be. I also included the changes to add_paths_to_append_rel() from my patch on the "path toward faster partition pruning" thread. We'd need that change, because while add_paths_to_append_rel() is called for all partitioned table RTEs in a given partition tree, expand_inherited_rtentry() would have set up a PartitionedChildRelInfo only for the root parent, so get_partitioned_child_rels() would not find the same for non-root partitioned table rels and crash failing the Assert. The changes I made are such that we call get_partitioned_child_rels() only for the parent rels that are known to correspond root partitioned tables (or as you pointed out on the thread, "the table named in the query" as opposed those added to the query as result of inheritance expansion). In addition to the relkind check on the input RTE, it would seem that checking that the reloptkind is RELOPT_BASEREL would be enough. But actually not, because if a partitioned table is accessed in a UNION ALL query, reloptkind even for the root partitioned table (the table named in the query) would be RELOPT_OTHER_MEMBER_REL. The only way to confirm that the input rel is actually the root partitioned table is to check whether its parent rel is not RTE_RELATION, because the parent in case of UNION ALL Append is a RTE_SUBQUERY RT entry. > One thing I notice is that if I rip out the changes to initsplan.c, > the new regression test still passes. If it's possible to write a > test that fails without those changes, I think it would be a good idea > to include one in the patch. That's certainly one of the subtler > parts of this patch, IMHO. Back when this (step-wise expansion of partition inheritance) used to be a patch in the original declarative partitioning patch series, Ashutosh had reported a test query [1] that would fail getting a plan, for which we came up with the initsplan.c changes in this patch as the solution: ERROR: could not devise a query plan for the given query I tried that query again without the initsplan.c changes and somehow the same error does not occur anymore. It's strange because without the initsplan.c changes, there is no way for partitions lower in the tree than the first level to get the direct_lateral_relids and lateral_relids from the root parent rel. Maybe, Ashutosh has a way to devise the failing query again. I also confirmed that the partition-pruning patch set works fine with this patch instead of the patch on that thread with the same functionality, which I will now drop from that patch set. Sorry about the wasted time. Thanks, Amit [1] https://www.postgresql.org/message-id/CAFjFpReZF34MDbY95xoATi0xVj2mAry4-LHBWVBayOc8gj%3Diqg%40mail.gmail.com
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 2d7e1d84d0..71b5bdf95e 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -24,6 +24,7 @@ #include "catalog/pg_operator.h" #include "catalog/pg_proc.h" #include "foreign/fdwapi.h" +#include "miscadmin.h" #include "nodes/makefuncs.h" #include "nodes/nodeFuncs.h" #ifdef OPTIMIZER_DEBUG @@ -867,6 +868,9 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel, int nattrs; ListCell *l; + /* Guard against stack overflow due to overly deep inheritance tree. */ + check_stack_depth(); + Assert(IS_SIMPLE_REL(rel)); /* @@ -1289,11 +1293,42 @@ add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte; rte = planner_rt_fetch(rel->relid, root); + + /* + * Get the partitioned_rels list from root->pcinfo_list after + * confirming that rel is actually the table named in the query, + * instead of a partitioned table that was added as a result of + * inheritance expansion, because only the former gets a + * PartitionedChildRelInfo. + */ if (rte->relkind == RELKIND_PARTITIONED_TABLE) { - partitioned_rels = get_partitioned_child_rels(root, rel->relid); - /* The root partitioned table is included as a child rel */ - Assert(list_length(partitioned_rels) >= 1); + int parent_relid; + bool get_pcinfo = false; + + /* + * To distinguish the partitioned table rels added as result + * of inheritance expansion, check using reloptkind if it's + * otherrel. But the original table could also be an otherrel, + * if it's a child of a UNION ALL all query. + */ + if (!IS_OTHER_REL(rel)) + get_pcinfo = true; + else if (bms_get_singleton_member(rel->top_parent_relids, + &parent_relid)) + { + RelOptInfo *parent_rel; + + parent_rel = root->simple_rel_array[parent_relid]; + get_pcinfo = (parent_rel->rtekind == RTE_SUBQUERY); + } + + if (get_pcinfo) + { + partitioned_rels = get_partitioned_child_rels(root, rel->relid); + /* The root partitioned table is included as a child rel */ + Assert(list_length(partitioned_rels) >= 1); + } } /* diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c index 987c20ac9f..ad81f0f82f 100644 --- a/src/backend/optimizer/plan/initsplan.c +++ b/src/backend/optimizer/plan/initsplan.c @@ -15,6 +15,7 @@ #include "postgres.h" #include "catalog/pg_type.h" +#include "catalog/pg_class.h" #include "nodes/nodeFuncs.h" #include "optimizer/clauses.h" #include "optimizer/cost.h" @@ -629,11 +630,28 @@ create_lateral_join_info(PlannerInfo *root) for (rti = 1; rti < root->simple_rel_array_size; rti++) { RelOptInfo *brel = root->simple_rel_array[rti]; + RangeTblEntry *brte = root->simple_rte_array[rti]; - if (brel == NULL || brel->reloptkind != RELOPT_BASEREL) + if (brel == NULL) + continue; + + /* + * In the case of table inheritance, the parent RTE is directly linked + * to every child table via an AppendRelInfo. In the case of table + * partitioning, the inheritance hierarchy is expanded one level at a + * time rather than flattened. Therefore, an other member rel that is + * a partitioned table may have children of its own, and must + * therefore be marked with the appropriate lateral info so that those + * children eventually get marked also. + */ + Assert(IS_SIMPLE_REL(brel)); + Assert(brte); + if (brel->reloptkind == RELOPT_OTHER_MEMBER_REL && + (brte->rtekind != RTE_RELATION || + brte->relkind != RELKIND_PARTITIONED_TABLE)) continue; - if (root->simple_rte_array[rti]->inh) + if (brte->inh) { foreach(lc, root->append_rel_list) { diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 6b79b3ad99..82b722b47b 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -1038,7 +1038,7 @@ static void inheritance_planner(PlannerInfo *root) { Query *parse = root->parse; - int parentRTindex = parse->resultRelation; + int top_parentRTindex = parse->resultRelation; Bitmapset *subqueryRTindexes; Bitmapset *modifiableARIindexes; int nominalRelation = -1; @@ -1056,6 +1056,10 @@ inheritance_planner(PlannerInfo *root) Index rti; RangeTblEntry *parent_rte; List *partitioned_rels = NIL; + PlannerInfo *parent_root; + Query *parent_parse; + Bitmapset *parent_relids = bms_make_singleton(top_parentRTindex); + PlannerInfo **parent_roots = NULL; Assert(parse->commandType != CMD_INSERT); @@ -1121,9 +1125,22 @@ inheritance_planner(PlannerInfo *root) * opposite in the case of non-partitioned inheritance parent as described * below. */ - parent_rte = rt_fetch(parentRTindex, root->parse->rtable); + parent_rte = rt_fetch(top_parentRTindex, root->parse->rtable); if (parent_rte->relkind == RELKIND_PARTITIONED_TABLE) - nominalRelation = parentRTindex; + nominalRelation = top_parentRTindex; + + /* + * The PlannerInfo for each child is obtained by translating the relevant + * members of the PlannerInfo for its immediate parent, which we find + * using the parent_relid in its AppendRelInfo. We save the PlannerInfo + * for each parent in an array indexed by relid for fast retrieval. Since + * the maximum number of parents is limited by the number of RTEs in the + * query, we use that number to allocate the array. An extra entry is + * needed since relids start from 1. + */ + parent_roots = (PlannerInfo **) palloc0((list_length(parse->rtable) + 1) * + sizeof(PlannerInfo *)); + parent_roots[top_parentRTindex] = root; /* * And now we can get on with generating a plan for each child table. @@ -1137,15 +1154,24 @@ inheritance_planner(PlannerInfo *root) Path *subpath; /* append_rel_list contains all append rels; ignore others */ - if (appinfo->parent_relid != parentRTindex) + if (!bms_is_member(appinfo->parent_relid, parent_relids)) continue; /* + * expand_inherited_rtentry() always processes a parent before any of + * that parent's children, so the parent_root for this relation should + * already be available. + */ + parent_root = parent_roots[appinfo->parent_relid]; + Assert(parent_root != NULL); + parent_parse = parent_root->parse; + + /* * We need a working copy of the PlannerInfo so that we can control * propagation of information back to the main copy. */ subroot = makeNode(PlannerInfo); - memcpy(subroot, root, sizeof(PlannerInfo)); + memcpy(subroot, parent_root, sizeof(PlannerInfo)); /* * Generate modified query with this rel as target. We first apply @@ -1154,15 +1180,15 @@ inheritance_planner(PlannerInfo *root) * then fool around with subquery RTEs. */ subroot->parse = (Query *) - adjust_appendrel_attrs(root, - (Node *) parse, + adjust_appendrel_attrs(parent_root, + (Node *) parent_parse, 1, &appinfo); /* * If there are securityQuals attached to the parent, move them to the * child rel (they've already been transformed properly for that). */ - parent_rte = rt_fetch(parentRTindex, subroot->parse->rtable); + parent_rte = rt_fetch(appinfo->parent_relid, subroot->parse->rtable); child_rte = rt_fetch(appinfo->child_relid, subroot->parse->rtable); child_rte->securityQuals = parent_rte->securityQuals; parent_rte->securityQuals = NIL; @@ -1173,7 +1199,7 @@ inheritance_planner(PlannerInfo *root) * executor doesn't need to see the modified copies --- we can just * pass it the original rowMarks list.) */ - subroot->rowMarks = copyObject(root->rowMarks); + subroot->rowMarks = copyObject(parent_root->rowMarks); /* * The append_rel_list likewise might contain references to subquery @@ -1190,7 +1216,7 @@ inheritance_planner(PlannerInfo *root) ListCell *lc2; subroot->append_rel_list = NIL; - foreach(lc2, root->append_rel_list) + foreach(lc2, parent_root->append_rel_list) { AppendRelInfo *appinfo2 = lfirst_node(AppendRelInfo, lc2); @@ -1225,7 +1251,7 @@ inheritance_planner(PlannerInfo *root) ListCell *lr; rti = 1; - foreach(lr, parse->rtable) + foreach(lr, parent_parse->rtable) { RangeTblEntry *rte = lfirst_node(RangeTblEntry, lr); @@ -1272,6 +1298,22 @@ inheritance_planner(PlannerInfo *root) /* hack to mark target relation as an inheritance partition */ subroot->hasInheritedTarget = true; + /* + * If the child is further partitioned, remember it as a parent. Since + * a partitioned table does not have any data, we don't need to create + * a plan for it. We do, however, need to remember the PlannerInfo for + * use when processing its children. + */ + if (child_rte->inh) + { + Assert(child_rte->relkind == RELKIND_PARTITIONED_TABLE); + parent_relids = + bms_add_member(parent_relids, appinfo->child_relid); + parent_roots[appinfo->child_relid] = subroot; + + continue; + } + /* Generate Path(s) for accessing this result relation */ grouping_planner(subroot, true, 0.0 /* retrieve all tuples */ ); @@ -1370,7 +1412,7 @@ inheritance_planner(PlannerInfo *root) if (parent_rte->relkind == RELKIND_PARTITIONED_TABLE) { - partitioned_rels = get_partitioned_child_rels(root, parentRTindex); + partitioned_rels = get_partitioned_child_rels(root, top_parentRTindex); /* The root partitioned table is included as a child rel */ Assert(list_length(partitioned_rels) >= 1); } diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c index ccf21453fd..95e12a5207 100644 --- a/src/backend/optimizer/prep/prepunion.c +++ b/src/backend/optimizer/prep/prepunion.c @@ -113,7 +113,9 @@ static void expand_single_inheritance_child(PlannerInfo *root, Index parentRTindex, Relation parentrel, PlanRowMark *parentrc, Relation childrel, bool *has_child, List **appinfos, - List **partitioned_child_rels); + List **partitioned_child_rels, + RangeTblEntry **childrte_p, + Index *childRTindex_p); static void make_inh_translation_list(Relation oldrelation, Relation newrelation, Index newvarno, @@ -1348,9 +1350,9 @@ expand_inherited_tables(PlannerInfo *root) ListCell *rl; /* - * expand_inherited_rtentry may add RTEs to parse->rtable; there is no - * need to scan them since they can't have inh=true. So just scan as far - * as the original end of the rtable list. + * expand_inherited_rtentry may add RTEs to parse->rtable. The function is + * expected to recursively handle any RTEs that it creates with inh=true. + * So just scan as far as the original end of the rtable list. */ nrtes = list_length(root->parse->rtable); rl = list_head(root->parse->rtable); @@ -1479,7 +1481,8 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) expand_single_inheritance_child(root, rte, rti, oldrelation, oldrc, oldrelation, &has_child, &appinfos, - &partitioned_child_rels); + &partitioned_child_rels, + NULL, NULL); expand_partitioned_rtentry(root, rte, rti, oldrelation, oldrc, RelationGetPartitionDesc(oldrelation), lockmode, @@ -1519,7 +1522,8 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) expand_single_inheritance_child(root, rte, rti, oldrelation, oldrc, newrelation, &has_child, &appinfos, - &partitioned_child_rels); + &partitioned_child_rels, + NULL, NULL); /* Close child relations, but keep locks */ if (childOID != parentOID) @@ -1581,6 +1585,8 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte, { Oid childOID = partdesc->oids[i]; Relation childrel; + RangeTblEntry *childrte; + Index childRTindex; /* Open rel; we already have required locks */ childrel = heap_open(childOID, NoLock); @@ -1595,16 +1601,25 @@ expand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte, expand_single_inheritance_child(root, parentrte, parentRTindex, parentrel, parentrc, childrel, has_child, appinfos, - partitioned_child_rels); + partitioned_child_rels, + &childrte, &childRTindex); - /* If this child is itself partitioned, recurse */ + /* + * If this child is itself partitioned, recurse. Pass down the + * childrte as the parent of the child RTEs that will be created in + * the following call to ensure that the AppendRelInfos thus created + * for the children will be marked with the immediate parent instead + * of the root parent. + */ if (childrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) - expand_partitioned_rtentry(root, parentrte, parentRTindex, - parentrel, parentrc, + { + expand_partitioned_rtentry(root, childrte, childRTindex, + childrel, parentrc, RelationGetPartitionDesc(childrel), lockmode, has_child, appinfos, partitioned_child_rels); + } /* Close child relation, but keep locks */ heap_close(childrel, NoLock); @@ -1625,7 +1640,9 @@ expand_single_inheritance_child(PlannerInfo *root, RangeTblEntry *parentrte, Index parentRTindex, Relation parentrel, PlanRowMark *parentrc, Relation childrel, bool *has_child, List **appinfos, - List **partitioned_child_rels) + List **partitioned_child_rels, + RangeTblEntry **childrte_p, + Index *childRTindex_p) { Query *parse = root->parse; Oid parentOID = RelationGetRelid(parentrel); @@ -1649,17 +1666,25 @@ expand_single_inheritance_child(PlannerInfo *root, RangeTblEntry *parentrte, childrte = copyObject(parentrte); childrte->relid = childOID; childrte->relkind = childrel->rd_rel->relkind; - childrte->inh = false; + /* A partitioned child will need to be expanded further. */ + if (childOID != parentOID && + childrte->relkind == RELKIND_PARTITIONED_TABLE) + childrte->inh = true; + else + childrte->inh = false; childrte->requiredPerms = 0; childrte->securityQuals = NIL; parse->rtable = lappend(parse->rtable, childrte); childRTindex = list_length(parse->rtable); /* - * Build an AppendRelInfo for this parent and child, unless the child is a - * partitioned table. + * We need an AppendRelInfo if paths will be built for the child RTE. + * If childrte->inh is true, then we'll always need to generate append + * paths for it. If childrte->inh is false, we must scan it if it's + * not a partitioned table; but if it is a partitioned table, then it + * never has any data of its own and need not be scanned. */ - if (childrte->relkind != RELKIND_PARTITIONED_TABLE) + if (childrte->relkind != RELKIND_PARTITIONED_TABLE || childrte->inh) { /* Remember if we saw a real child. */ if (childOID != parentOID) @@ -1694,7 +1719,12 @@ expand_single_inheritance_child(PlannerInfo *root, RangeTblEntry *parentrte, appinfo->translated_vars); } } - else + + /* + * If this is a partitioned table, it won't be scanned; add it to the + * list of partitioned child relations so that it gets properly locked. + */ + if (childrte->relkind == RELKIND_PARTITIONED_TABLE) *partitioned_child_rels = lappend_int(*partitioned_child_rels, childRTindex); @@ -1704,7 +1734,6 @@ expand_single_inheritance_child(PlannerInfo *root, RangeTblEntry *parentrte, if (parentrc) { PlanRowMark *childrc = makeNode(PlanRowMark); - childrc->rti = childRTindex; childrc->prti = parentRTindex; childrc->rowmarkId = parentrc->rowmarkId; @@ -1726,6 +1755,11 @@ expand_single_inheritance_child(PlannerInfo *root, RangeTblEntry *parentrte, root->rowMarks = lappend(root->rowMarks, childrc); } + + if (childrte_p) + *childrte_p = childrte; + if (childRTindex_p) + *childRTindex_p = childRTindex; } /* diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h index a39e59d8ac..d50ff55681 100644 --- a/src/include/nodes/relation.h +++ b/src/include/nodes/relation.h @@ -1935,10 +1935,10 @@ typedef struct SpecialJoinInfo * * When we expand an inheritable table or a UNION-ALL subselect into an * "append relation" (essentially, a list of child RTEs), we build an - * AppendRelInfo for each non-partitioned child RTE. The list of - * AppendRelInfos indicates which child RTEs must be included when expanding - * the parent, and each node carries information needed to translate Vars - * referencing the parent into Vars referencing that child. + * AppendRelInfo for each child RTE. The list of AppendRelInfos indicates + * which child RTEs must be included when expanding the parent, and each node + * carries information needed to translate Vars referencing the parent into + * Vars referencing that child. * * These structs are kept in the PlannerInfo node's append_rel_list. * Note that we just throw all the structs into one list, and scan the diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out index 1fa9650ec9..2fb0b4d86e 100644 --- a/src/test/regress/expected/inherit.out +++ b/src/test/regress/expected/inherit.out @@ -625,6 +625,28 @@ select tableoid::regclass::text as relname, parted_tab.* from parted_tab order b (3 rows) drop table parted_tab; +-- Check UPDATE with multi-level partitioned inherited target +create table mlparted_tab (a int, b char, c text) partition by list (a); +create table mlparted_tab_part1 partition of mlparted_tab for values in (1); +create table mlparted_tab_part2 partition of mlparted_tab for values in (2) partition by list (b); +create table mlparted_tab_part3 partition of mlparted_tab for values in (3); +create table mlparted_tab_part2a partition of mlparted_tab_part2 for values in ('a'); +create table mlparted_tab_part2b partition of mlparted_tab_part2 for values in ('b'); +insert into mlparted_tab values (1, 'a'), (2, 'a'), (2, 'b'), (3, 'a'); +update mlparted_tab mlp set c = 'xxx' +from + (select a from some_tab union all select a+1 from some_tab) ss (a) +where (mlp.a = ss.a and mlp.b = 'b') or mlp.a = 3; +select tableoid::regclass::text as relname, mlparted_tab.* from mlparted_tab order by 1,2; + relname | a | b | c +---------------------+---+---+----- + mlparted_tab_part1 | 1 | a | + mlparted_tab_part2a | 2 | a | + mlparted_tab_part2b | 2 | b | xxx + mlparted_tab_part3 | 3 | a | xxx +(4 rows) + +drop table mlparted_tab; drop table some_tab cascade; NOTICE: drop cascades to table some_tab_child /* Test multiple inheritance of column defaults */ diff --git a/src/test/regress/sql/inherit.sql b/src/test/regress/sql/inherit.sql index c96580cd81..01780d4977 100644 --- a/src/test/regress/sql/inherit.sql +++ b/src/test/regress/sql/inherit.sql @@ -154,6 +154,23 @@ where parted_tab.a = ss.a; select tableoid::regclass::text as relname, parted_tab.* from parted_tab order by 1,2; drop table parted_tab; + +-- Check UPDATE with multi-level partitioned inherited target +create table mlparted_tab (a int, b char, c text) partition by list (a); +create table mlparted_tab_part1 partition of mlparted_tab for values in (1); +create table mlparted_tab_part2 partition of mlparted_tab for values in (2) partition by list (b); +create table mlparted_tab_part3 partition of mlparted_tab for values in (3); +create table mlparted_tab_part2a partition of mlparted_tab_part2 for values in ('a'); +create table mlparted_tab_part2b partition of mlparted_tab_part2 for values in ('b'); +insert into mlparted_tab values (1, 'a'), (2, 'a'), (2, 'b'), (3, 'a'); + +update mlparted_tab mlp set c = 'xxx' +from + (select a from some_tab union all select a+1 from some_tab) ss (a) +where (mlp.a = ss.a and mlp.b = 'b') or mlp.a = 3; +select tableoid::regclass::text as relname, mlparted_tab.* from mlparted_tab order by 1,2; + +drop table mlparted_tab; drop table some_tab cascade; /* Test multiple inheritance of column defaults */
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers