On Fri, May 14, 2021 at 12:22 PM David Rowley <dgrowle...@gmail.com> wrote:
> On Fri, 14 May 2021 at 11:22, Tom Lane <t...@sss.pgh.pa.us> wrote: > > I recall somebody (David Rowley, maybe? Too lazy to check archives.) > > working on this idea awhile ago, but he didn't get to the point of > > a committable patch. > > Yeah. Me. The discussion is in [1]. > > David > > [1] > https://www.postgresql.org/message-id/flat/CAKJS1f9FK_X_5HKcPcSeimy16Owe3EmPmmGsGWLcKkj_rW9s6A%40mail.gmail.com Hi: I read through that thread and summarized the current pending issue as below IIUC. a). The most challenging issue is this push down misleads the planner's rows estimation, which probably be worse than the lack of such push down. b). The new generated qual may increase the qual execution cost. c). Planning time is also increased but we can not gain much because of that. I just tried to address these issues as below based on the patch David has finished a long time ago. To address the row estimation issue, The most straightforward way to fix this is to ignore the derived clauses when figuring out the RelOptInfo->rows on base relation. To note which clause is derived from this patch, I added a new field "EquivalenceClass * derived" in RestrictInfo. and then added a included_derived option in clauselist_selectivity_ext, during the set_xxx_rel_size function, we can pass the included_derived=false. This strategy should be used in get_parameterized_baserel_size. In all the other cases, include_derived=true is used. which are finished in commit 2. (Commit 1 is Daivd's patch, I just rebased it) set enable_hashjoin to off; set enable_mergejoin to off; set enable_seqscan to on; regression=# explain analyze select * from tenk1 a, tenk1 b where a.thousand = b.thousand and a.thousand < 100; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=27.14..1090.67 rows=10740 width=488) (actual time=0.404..15.006 rows=10000 loops=1) -> Bitmap Heap Scan on tenk1 b (cost=26.84..385.26 rows=10000 width=244) (actual time=0.350..1.419 rows=1000 loops=1) Recheck Cond: (thousand < 100)\ Heap Blocks: exact=324 -> Bitmap Index Scan on tenk1_thous_tenthous (cost=0.00..24.34 rows=1074 width=0) (actual time=0.238..0.240 rows=1000 loops=1) Index Cond: (thousand < 100) -> Memoize (cost=0.30..0.47 rows=1 width=244) (actual time=0.002..0.006 rows=10 loops=1000) Cache Key: b.thousand Cache Mode: logical Hits: 900 Misses: 100 Evictions: 0 Overflows: 0 Memory Usage: 277kB -> Index Scan using tenk1_thous_tenthous on tenk1 a (cost=0.29..0.46 rows=1 width=244) (actual time=0.010..0.032 rows=10 loops=100) Index Cond: ((thousand = b.thousand) AND (thousand < 100)) Planning Time: 2.459 ms Execution Time: 15.964 ms (14 rows) As shown above, with commit 2 the JoinRel's rows estimation is correct now. but it will mislead the DBA to read the plan. See Bitmap Heap Scan on tenk1 b (...rows=10000..) (... rows=1000 loops=1) This is because RelOptInfo->rows is not just used to calculate the joinrel.rows but also be used to show the set Path.rows at many places. I can't think of a better way than adding a new filtered_rows in RelOptInfo which the semantics is used for Path.rows purpose only. That is what commit 3 does. After commit 3, we can see: regression=# explain analyze select * from tenk1 a, tenk1 b where a.thousand = b.thousand and a.thousand < 100; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=24.90..459.16 rows=10740 width=488) (actual time=0.440..16.966 rows=10000 loops=1) -> Bitmap Heap Scan on tenk1 b (cost=24.61..383.03 rows=1074 width=244) (actual time=0.383..1.546 rows=1000 loops=1) Recheck Cond: (thousand < 100) Heap Blocks: exact=324 -> Bitmap Index Scan on tenk1_thous_tenthous (cost=0.00..24.34 rows=1074 width=0) (actual time=0.270..0.272 rows=1000 loops=1) Index Cond: (thousand < 100) -> Memoize (cost=0.30..0.47 rows=1 width=244) (actual time=0.002..0.008 rows=10 loops=1000) Cache Key: b.thousand Cache Mode: logical Hits: 900 Misses: 100 Evictions: 0 Overflows: 0 Memory Usage: 277kB -> Index Scan using tenk1_thous_tenthous on tenk1 a (cost=0.29..0.46 rows=1 width=244) (actual time=0.012..0.050 rows=10 loops=100) Index Cond: ((thousand = b.thousand) AND (thousand < 100)) Planning Time: 2.578 ms Execution Time: 17.929 ms (14 rows) "Bitmap Heap Scan on tenk1 b (... rows=1074 ..) (.. rows=1000 loops=1)" shows the issue fixed. but There is something wrong below. Index Scan using tenk1_thous_tenthous on tenk1 a (cost=0.29..0.46 rows=1 width=244) (actual time=0.012..0.050 rows=10 loops=100) Index Cond: ((thousand = b.thousand) AND (thousand < 100)) Here the " (thousand < 100)" is from the user, not from this patch. and (thousand = b.thousand) AND (thousand < 100) has some correlation. I can't think of a solution for this. and fixing this issue is beyond the scope of this patch. So at this stage, I think the row estimation issue is gone. As the new generated equals increase the execution cost opinion, I think it is hard for planners to distinguish which quals deserves adding or not. Instead I just removed the quals execution during create_plan stage to remove the obviously duplicated qual executions. I only handled the case that the derived quals is executed at the same time with the restrinctInfo who's parent_ec is used to generate the derived quals. If I understand the RestrictInfo.parent_ec correctly, The cost of finding out the correlated quals in this patch are pretty low, see is_correlated_derived_clause. This is what commit 4 does. After we apply it, we can see the last demo above becomes to: regression=# explain analyze select * from tenk1 a join d_tenk2 b on a.thousand = b.thousand and a.thousand < 100; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=10000000000.30..10000002799.78 rows=20020 width=488) (actual time=0.051..26.080 rows=20000 loops=1) -> Seq Scan on tenk1 a (cost=10000000000.00..10000000470.00 rows=1001 width=244) (actual time=0.018..3.902 rows=1000 loops=1) Filter: (thousand < 100) Rows Removed by Filter: 9000 -> Memoize (cost=0.30..3.18 rows=20 width=244) (actual time=0.002..0.008 rows=20 loops=1000) Cache Key: a.thousand Cache Mode: logical Hits: 900 Misses: 100 Evictions: 0 Overflows: 0 Memory Usage: 546kB -> Index Scan using d_tenk2_thousand_idx on d_tenk2 b (cost=0.29..3.17 rows=20 width=244) (actual time=0.008..0.037 rows=20 loops=100) Index Cond: (thousand = a.thousand) Planning Time: 0.596 ms Execution Time: 27.502 ms (12 rows) The "thousand < 100" for b is removed during execution. Commit 5 reduced the requirements for this path to work. Now it supports ScalarArrayOpExpr and any perudoconstant filter to support the user case I meet. Commit 6 added some testcase and they are just used for review since there are two many runtime statistics in the output and I can't think of way to fix it. I also study David's commit 1, and the semantics of ec_filters is so accurate and I'm very excited to see it. This patch series is still in the PoC stage, so something is not handled at all. For commit 2, I didn't handle extended statistics related paths and I just handled plain rel (subquery, forign table and so on are missed). I think it is OK for a PoC. At last, I will share some performance testing for this patch. This is the real user case I met. create table p (a int, b int) partition by range(a); select 'create table p_' || i || ' partition of p for values from (' || (i-1) * 100000 || ') to (' || i * 100000 || ');' from generate_series(1, 50)i; \gexec insert into p select i, i from generate_series(1, 50 * 100000 -1) i; create index on p(a); create table q (a int, b int) partition by range(a); select 'create table q_' || i || ' partition of q for values from (' || (i-1) * 100000 || ') to (' || i * 100000 || ');' from generate_series(1, 50)i; \gexec insert into q select * from p; create index on q(a); select * from p, q where p.a = q.a and p.a in (3, 200000); Run the above query in both prepared and no prepared case, I get the following results: | workload | with this feature | w/o this feature | |--------------+-------------------+------------------| | Prepared | 0.25 ms | 0.8 ms | | Non Prepared | 0.890 ms | 4.207 ms | Any thoughts? -- Best Regards Andy Fan
From 62e161902bf5d77089af772e4f458f05df368585 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com> Date: Tue, 1 Feb 2022 15:20:10 +0800 Subject: [PATCH v1 3/6] introduce RelOptInfo.filtered_rows. --- src/backend/optimizer/path/costsize.c | 21 ++++++++++++++++----- src/include/nodes/pathnodes.h | 2 ++ 2 files changed, 18 insertions(+), 5 deletions(-) diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c index 9e303877af7..b1117257a38 100644 --- a/src/backend/optimizer/path/costsize.c +++ b/src/backend/optimizer/path/costsize.c @@ -241,7 +241,7 @@ cost_seqscan(Path *path, PlannerInfo *root, if (param_info) path->rows = param_info->ppi_rows; else - path->rows = baserel->rows; + path->rows = baserel->filtered_rows; if (!enable_seqscan) startup_cost += disable_cost; @@ -539,7 +539,7 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count, } else { - path->path.rows = baserel->rows; + path->path.rows = baserel->filtered_rows; /* qpquals come from just the rel's restriction clauses */ qpquals = extract_nonindex_conditions(path->indexinfo->indrestrictinfo, path->indexclauses); @@ -978,7 +978,7 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel, if (param_info) path->rows = param_info->ppi_rows; else - path->rows = baserel->rows; + path->rows = baserel->filtered_rows; if (!enable_bitmapscan) startup_cost += disable_cost; @@ -1209,7 +1209,7 @@ cost_tidscan(Path *path, PlannerInfo *root, if (param_info) path->rows = param_info->ppi_rows; else - path->rows = baserel->rows; + path->rows = baserel->filtered_rows; /* Count how many tuples we expect to retrieve */ ntuples = 0; @@ -1320,7 +1320,7 @@ cost_tidrangescan(Path *path, PlannerInfo *root, if (param_info) path->rows = param_info->ppi_rows; else - path->rows = baserel->rows; + path->rows = baserel->filtered_rows; /* Count how many tuples and pages we expect to scan */ selectivity = clauselist_selectivity(root, tidrangequals, baserel->relid, @@ -4938,6 +4938,17 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel) rel->rows = clamp_row_est(nrows); + nrows = rel->tuples * + clauselist_selectivity_ext(root, + rel->baserestrictinfo, + 0, + JOIN_INNER, + NULL, + true, + true /* include_derived, for filtered rows */); + + rel->filtered_rows = clamp_row_est(nrows); + cost_qual_eval(&rel->baserestrictcost, rel->baserestrictinfo, root); set_rel_width(root, rel); diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h index 42368e10b8e..dbe1775f96d 100644 --- a/src/include/nodes/pathnodes.h +++ b/src/include/nodes/pathnodes.h @@ -683,6 +683,8 @@ typedef struct RelOptInfo /* size estimates generated by planner */ Cardinality rows; /* estimated number of result tuples */ + Cardinality filtered_rows; /* filtered rows */ + /* per-relation planner control flags */ bool consider_startup; /* keep cheap-startup-cost paths? */ bool consider_param_startup; /* ditto, for parameterized paths? */ -- 2.21.0
From dee63e6e198ecfe66638910b8764dc5ba6e96e7b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com> Date: Tue, 1 Feb 2022 20:56:40 +0800 Subject: [PATCH v1 1/6] Rebaee David's patch against the latest code. --- src/backend/nodes/outfuncs.c | 14 ++ src/backend/optimizer/path/equivclass.c | 182 +++++++++++++++++++++++ src/backend/optimizer/plan/initsplan.c | 96 +++++++++--- src/backend/utils/cache/lsyscache.c | 28 ++++ src/include/nodes/nodes.h | 1 + src/include/nodes/pathnodes.h | 37 +++++ src/include/optimizer/paths.h | 1 + src/include/utils/lsyscache.h | 1 + src/test/regress/expected/equivclass.out | 45 ++++++ 9 files changed, 388 insertions(+), 17 deletions(-) diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 2b0236937aa..504b805326f 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -2504,6 +2504,17 @@ _outEquivalenceMember(StringInfo str, const EquivalenceMember *node) WRITE_OID_FIELD(em_datatype); } +static void +_outEquivalenceFilter(StringInfo str, const EquivalenceFilter *node) +{ + WRITE_NODE_TYPE("EQUIVALENCEFILTER"); + + WRITE_NODE_FIELD(ef_const); + WRITE_OID_FIELD(ef_opno); + WRITE_BOOL_FIELD(ef_const_is_left); + WRITE_UINT_FIELD(ef_source_rel); +} + static void _outPathKey(StringInfo str, const PathKey *node) { @@ -4304,6 +4315,9 @@ outNode(StringInfo str, const void *obj) case T_EquivalenceMember: _outEquivalenceMember(str, obj); break; + case T_EquivalenceFilter: + _outEquivalenceFilter(str, obj); + break; case T_PathKey: _outPathKey(str, obj); break; diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c index 8c6770de972..f9ae2785d60 100644 --- a/src/backend/optimizer/path/equivclass.c +++ b/src/backend/optimizer/path/equivclass.c @@ -19,6 +19,7 @@ #include <limits.h> #include "access/stratnum.h" +#include "catalog/pg_am.h" #include "catalog/pg_type.h" #include "nodes/makefuncs.h" #include "nodes/nodeFuncs.h" @@ -1250,6 +1251,37 @@ generate_base_implied_equalities_const(PlannerInfo *root, } } +/* + * finds the opfamily and strategy number for the specified 'opno' and 'method' + * access method. Returns True if one is found and sets 'family' and + * 'amstrategy', or returns False if none are found. + */ +static bool +find_am_family_and_stategy(Oid opno, Oid method, Oid *family, int *amstrategy) +{ + List *opfamilies; + ListCell *l; + int strategy; + + opfamilies = get_opfamilies(opno, method); + + foreach(l, opfamilies) + { + Oid opfamily = lfirst_oid(l); + + strategy = get_op_opfamily_strategy(opno, opfamily); + + if (strategy) + { + *amstrategy = strategy; + *family = opfamily; + return true; + } + } + + return false; +} + /* * generate_base_implied_equalities when EC contains no pseudoconstants */ @@ -1259,6 +1291,7 @@ generate_base_implied_equalities_no_const(PlannerInfo *root, { EquivalenceMember **prev_ems; ListCell *lc; + ListCell *lc2; /* * We scan the EC members once and track the last-seen member for each @@ -1320,6 +1353,57 @@ generate_base_implied_equalities_no_const(PlannerInfo *root, rinfo->right_em = cur_em; } } + + /* + * Also push any EquivalenceFilter clauses down into all relations + * other than the one which the filter actually originated from. + */ + foreach(lc2, ec->ec_filters) + { + EquivalenceFilter *ef = (EquivalenceFilter *) lfirst(lc2); + Expr *leftexpr; + Expr *rightexpr; + int strategy; + Oid opno; + Oid family; + + if (ef->ef_source_rel == relid) + continue; + + if (!find_am_family_and_stategy(ef->ef_opno, BTREE_AM_OID, + &family, &strategy)) + continue; + + if (ef->ef_const_is_left) + { + leftexpr = (Expr *) ef->ef_const; + rightexpr = cur_em->em_expr; + } + else + { + leftexpr = cur_em->em_expr; + rightexpr = (Expr *) ef->ef_const; + } + + opno = get_opfamily_member(family, + exprType((Node *) leftexpr), + exprType((Node *) rightexpr), + strategy); + + if (opno == InvalidOid) + continue; + + process_implied_equality(root, opno, + ec->ec_collation, + leftexpr, + rightexpr, + bms_copy(ec->ec_relids), + bms_copy(cur_em->em_nullable_relids), + ec->ec_min_security, + ec->ec_below_outer_join, + false); + } + prev_ems[relid] = cur_em; } @@ -1901,6 +1985,104 @@ create_join_clause(PlannerInfo *root, return rinfo; } +/* + * distribute_filter_quals_to_eclass + * For each OpExpr in quallist look for an eclass which has an Expr + * matching the Expr in the OpExpr. If a match is found we add a new + * EquivalenceFilter to the eclass containing the filter details. + */ +void +distribute_filter_quals_to_eclass(PlannerInfo *root, List *quallist) +{ + ListCell *l; + + /* fast path for when no eclasses have been generated */ + if (root->eq_classes == NIL) + return; + + /* + * For each qual in quallist try and find an eclass which contains the + * non-Const part of the OpExpr. We'll tag any matches that we find onto + * the correct eclass. + */ + foreach(l, quallist) + { + OpExpr *opexpr = (OpExpr *) lfirst(l); + Expr *leftexpr = (Expr *) linitial(opexpr->args); + Expr *rightexpr = (Expr *) lsecond(opexpr->args); + Const *constexpr; + Expr *varexpr; + Relids exprrels; + int relid; + bool const_isleft; + ListCell *l2; + + /* + * Determine if the the OpExpr is in the form "expr op const" or + * "const op expr". + */ + if (IsA(leftexpr, Const)) + { + constexpr = (Const *) leftexpr; + varexpr = rightexpr; + const_isleft = true; + } + else + { + constexpr = (Const *) rightexpr; + varexpr = leftexpr; + const_isleft = false; + } + + exprrels = pull_varnos(root, (Node *) varexpr); + + /* should be filtered out, but we need to determine relid anyway */ + if (!bms_get_singleton_member(exprrels, &relid)) + continue; + + /* search for a matching eclass member in all eclasses */ + foreach(l2, root->eq_classes) + { + EquivalenceClass *ec = (EquivalenceClass *) lfirst(l2); + ListCell *l3; + + if (ec->ec_broken || ec->ec_has_volatile) + continue; + + /* + * if the eclass has a const then that const will serve as the + * filter, we needn't add any others. + */ + if (ec->ec_has_const) + continue; + + /* skip this eclass no members exist which belong to this relid */ + if (!bms_is_member(relid, ec->ec_relids)) + continue; + + foreach(l3, ec->ec_members) + { + EquivalenceMember *em = (EquivalenceMember *) lfirst(l3); + + if (!bms_is_member(relid, em->em_relids)) + continue; + + if (equal(em->em_expr, varexpr)) + { + EquivalenceFilter *efilter; + efilter = makeNode(EquivalenceFilter); + efilter->ef_const = (Const *) copyObject(constexpr); + efilter->ef_const_is_left = const_isleft; + efilter->ef_opno = opexpr->opno; + efilter->ef_source_rel = relid; + + ec->ec_filters = lappend(ec->ec_filters, efilter); + break; /* Onto the next eclass */ + } + } + } + } +} /* * reconsider_outer_join_clauses diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c index 023efbaf092..b219bee8567 100644 --- a/src/backend/optimizer/plan/initsplan.c +++ b/src/backend/optimizer/plan/initsplan.c @@ -53,7 +53,7 @@ static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel, static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join, Relids *qualscope, Relids *inner_join_rels, - List **postponed_qual_list); + List **postponed_qual_list, List **filter_qual_list); static void process_security_barrier_quals(PlannerInfo *root, int rti, Relids qualscope, bool below_outer_join); @@ -70,7 +70,8 @@ static void distribute_qual_to_rels(PlannerInfo *root, Node *clause, Relids qualscope, Relids ojscope, Relids outerjoin_nonnullable, - List **postponed_qual_list); + List **postponed_qual_list, + List **filter_qual_list); static bool check_outerjoin_delay(PlannerInfo *root, Relids *relids_p, Relids *nullable_relids_p, bool is_pushed_down); static bool check_equivalence_delay(PlannerInfo *root, @@ -650,6 +651,43 @@ create_lateral_join_info(PlannerInfo *root) } } +/* + * is_simple_filter_qual + * Analyzes an OpExpr to determine if it may be useful as an + * EquivalenceFilter. Returns true if the OpExpr may be of some use, or + * false if it should not be used. + */ +static bool +is_simple_filter_qual(PlannerInfo *root, OpExpr *expr) +{ + Expr *leftexpr; + Expr *rightexpr; + + if (!IsA(expr, OpExpr)) + return false; + + if (list_length(expr->args) != 2) + return false; + + leftexpr = (Expr *) linitial(expr->args); + rightexpr = (Expr *) lsecond(expr->args); + + /* XXX should we restrict these to simple Var op Const expressions? */ + if (IsA(leftexpr, Const)) + { + if (bms_membership(pull_varnos(root, (Node *) rightexpr)) == BMS_SINGLETON && + !contain_volatile_functions((Node *) rightexpr)) + return true; + } + else if (IsA(rightexpr, Const)) + { + if (bms_membership(pull_varnos(root, (Node *) leftexpr)) == BMS_SINGLETON && + !contain_volatile_functions((Node *) leftexpr)) + return true; + } + + return false; +} /***************************************************************************** * @@ -690,6 +728,7 @@ deconstruct_jointree(PlannerInfo *root) Relids qualscope; Relids inner_join_rels; List *postponed_qual_list = NIL; + List *filter_qual_list = NIL; /* Start recursion at top of jointree */ Assert(root->parse->jointree != NULL && @@ -700,11 +739,14 @@ deconstruct_jointree(PlannerInfo *root) result = deconstruct_recurse(root, (Node *) root->parse->jointree, false, &qualscope, &inner_join_rels, - &postponed_qual_list); + &postponed_qual_list, &filter_qual_list); /* Shouldn't be any leftover quals */ Assert(postponed_qual_list == NIL); + /* try and match each filter_qual_list item up with an eclass. */ + distribute_filter_quals_to_eclass(root, filter_qual_list); + return result; } @@ -725,6 +767,8 @@ deconstruct_jointree(PlannerInfo *root) * or free this, either) * *postponed_qual_list is a list of PostponedQual structs, which we can * add quals to if they turn out to belong to a higher join level + * *filter_qual_list is appended to with a list of quals which may be useful + * include as EquivalenceFilters. * Return value is the appropriate joinlist for this jointree node * * In addition, entries will be added to root->join_info_list for outer joins. @@ -732,7 +776,7 @@ deconstruct_jointree(PlannerInfo *root) static List * deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join, Relids *qualscope, Relids *inner_join_rels, - List **postponed_qual_list) + List **postponed_qual_list, List **filter_qual_list) { List *joinlist; @@ -785,7 +829,8 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join, below_outer_join, &sub_qualscope, inner_join_rels, - &child_postponed_quals); + &child_postponed_quals, + filter_qual_list); *qualscope = bms_add_members(*qualscope, sub_qualscope); sub_members = list_length(sub_joinlist); remaining--; @@ -819,7 +864,8 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join, below_outer_join, JOIN_INNER, root->qual_security_level, *qualscope, NULL, NULL, - NULL); + NULL, + filter_qual_list); else *postponed_qual_list = lappend(*postponed_qual_list, pq); } @@ -835,7 +881,8 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join, below_outer_join, JOIN_INNER, root->qual_security_level, *qualscope, NULL, NULL, - postponed_qual_list); + postponed_qual_list, + filter_qual_list); } } else if (IsA(jtnode, JoinExpr)) @@ -873,11 +920,13 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join, leftjoinlist = deconstruct_recurse(root, j->larg, below_outer_join, &leftids, &left_inners, - &child_postponed_quals); + &child_postponed_quals, + filter_qual_list); rightjoinlist = deconstruct_recurse(root, j->rarg, below_outer_join, &rightids, &right_inners, - &child_postponed_quals); + &child_postponed_quals, + filter_qual_list); *qualscope = bms_union(leftids, rightids); *inner_join_rels = *qualscope; /* Inner join adds no restrictions for quals */ @@ -890,11 +939,13 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join, leftjoinlist = deconstruct_recurse(root, j->larg, below_outer_join, &leftids, &left_inners, - &child_postponed_quals); + &child_postponed_quals, + filter_qual_list); rightjoinlist = deconstruct_recurse(root, j->rarg, true, &rightids, &right_inners, - &child_postponed_quals); + &child_postponed_quals, + filter_qual_list); *qualscope = bms_union(leftids, rightids); *inner_join_rels = bms_union(left_inners, right_inners); nonnullable_rels = leftids; @@ -904,11 +955,13 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join, leftjoinlist = deconstruct_recurse(root, j->larg, below_outer_join, &leftids, &left_inners, - &child_postponed_quals); + &child_postponed_quals, + filter_qual_list); rightjoinlist = deconstruct_recurse(root, j->rarg, below_outer_join, &rightids, &right_inners, - &child_postponed_quals); + &child_postponed_quals, + filter_qual_list); *qualscope = bms_union(leftids, rightids); *inner_join_rels = bms_union(left_inners, right_inners); /* Semi join adds no restrictions for quals */ @@ -925,11 +978,13 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join, leftjoinlist = deconstruct_recurse(root, j->larg, true, &leftids, &left_inners, - &child_postponed_quals); + &child_postponed_quals, + filter_qual_list); rightjoinlist = deconstruct_recurse(root, j->rarg, true, &rightids, &right_inners, - &child_postponed_quals); + &child_postponed_quals, + filter_qual_list); *qualscope = bms_union(leftids, rightids); *inner_join_rels = bms_union(left_inners, right_inners); /* each side is both outer and inner */ @@ -1013,7 +1068,8 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join, root->qual_security_level, *qualscope, ojscope, nonnullable_rels, - postponed_qual_list); + postponed_qual_list, + filter_qual_list); } /* Now we can add the SpecialJoinInfo to join_info_list */ @@ -1117,6 +1173,7 @@ process_security_barrier_quals(PlannerInfo *root, qualscope, qualscope, NULL, + NULL, NULL); } security_level++; @@ -1610,7 +1667,8 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause, Relids qualscope, Relids ojscope, Relids outerjoin_nonnullable, - List **postponed_qual_list) + List **postponed_qual_list, + List **filter_qual_list) { Relids relids; bool is_pushed_down; @@ -1964,6 +2022,10 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause, /* No EC special case applies, so push it into the clause lists */ distribute_restrictinfo_to_rels(root, restrictinfo); + + /* Check if the qual looks useful to harvest as an EquivalenceFilter */ + if (filter_qual_list != NULL && is_simple_filter_qual(root, (OpExpr *) clause)) + *filter_qual_list = lappend(*filter_qual_list, clause); } /* diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c index feef9998634..add2e7176ae 100644 --- a/src/backend/utils/cache/lsyscache.c +++ b/src/backend/utils/cache/lsyscache.c @@ -341,6 +341,34 @@ get_ordering_op_for_equality_op(Oid opno, bool use_lhs_type) return result; } +/* + * get_opfamilies + * Returns a list of Oids of each opfamily which 'opno' belonging to + * 'method' access method. + */ +List * +get_opfamilies(Oid opno, Oid method) +{ + List *result = NIL; + CatCList *catlist; + int i; + + catlist = SearchSysCacheList1(AMOPOPID, ObjectIdGetDatum(opno)); + + for (i = 0; i < catlist->n_members; i++) + { + HeapTuple tuple = &catlist->members[i]->tuple; + Form_pg_amop aform = (Form_pg_amop) GETSTRUCT(tuple); + + if (aform->amopmethod == method) + result = lappend_oid(result, aform->amopfamily); + } + + ReleaseSysCacheList(catlist); + + return result; +} + /* * get_mergejoin_opfamilies * Given a putatively mergejoinable operator, return a list of the OIDs diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h index da35f2c2722..3a9a235cd0e 100644 --- a/src/include/nodes/nodes.h +++ b/src/include/nodes/nodes.h @@ -265,6 +265,7 @@ typedef enum NodeTag /* these aren't subclasses of Path: */ T_EquivalenceClass, T_EquivalenceMember, + T_EquivalenceFilter, T_PathKey, T_PathTarget, T_RestrictInfo, diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h index 1f3845b3fec..e73fef057a4 100644 --- a/src/include/nodes/pathnodes.h +++ b/src/include/nodes/pathnodes.h @@ -990,6 +990,7 @@ typedef struct EquivalenceClass List *ec_members; /* list of EquivalenceMembers */ List *ec_sources; /* list of generating RestrictInfos */ List *ec_derives; /* list of derived RestrictInfos */ + List *ec_filters; Relids ec_relids; /* all relids appearing in ec_members, except * for child members (see below) */ bool ec_has_const; /* any pseudoconstants in ec_members? */ @@ -1002,6 +1003,42 @@ typedef struct EquivalenceClass struct EquivalenceClass *ec_merged; /* set if merged into another EC */ } EquivalenceClass; +/* + * EquivalenceFilter - List of filters on Consts which belong to the + * EquivalenceClass. + * + * When building the equivalence classes we also collected a list of quals in + * the form of; "Expr op Const" and "Const op Expr". These are collected in the + * hope that we'll later generate an equivalence class which contains the + * "Expr" part. For example, if we parse a query such as; + * + * SELECT * FROM t1 INNER JOIN t2 ON t1.id = t2.id WHERE t1.id < 10; + * + * then since we'll end up with an equivalence class containing {t1.id,t2.id}, + * we'll tag the "< 10" filter onto the eclass. We are able to do this because + * the eclass proves equality between each class member, therefore all members + * must be below 10. + * + * EquivalenceFilters store the details required to allow us to push these + * filter clauses down into other relations which share an equivalence class + * containing a member which matches the expression of this EquivalenceFilter. + * + * ef_const is the Const value which this filter should filter against. + * ef_opno is the operator to filter on. + * ef_const_is_left marks if the OpExpr was in the form "Const op Expr" or + * "Expr op Const". + * ef_source_rel is the relation id of where this qual originated from. + */ +typedef struct EquivalenceFilter +{ + NodeTag type; + + Const *ef_const; /* the constant expression to filter on */ + Oid ef_opno; /* Operator Oid of filter operator */ + bool ef_const_is_left; /* Is the Const on the left of the OpExrp? */ + Index ef_source_rel; /* relid of originating relation. */ +} EquivalenceFilter; + /* * If an EC contains a const and isn't below-outer-join, any PathKey depending * on it must be redundant, since there's only one possible value of the key. diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h index 0c3a0b90c85..ce2aac7d3aa 100644 --- a/src/include/optimizer/paths.h +++ b/src/include/optimizer/paths.h @@ -126,6 +126,7 @@ extern bool process_equivalence(PlannerInfo *root, extern Expr *canonicalize_ec_expression(Expr *expr, Oid req_type, Oid req_collation); extern void reconsider_outer_join_clauses(PlannerInfo *root); +extern void distribute_filter_quals_to_eclass(PlannerInfo *root, List *quallist); extern EquivalenceClass *get_eclass_for_sort_expr(PlannerInfo *root, Expr *expr, Relids nullable_relids, diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h index b8dd27d4a96..188d65faa0d 100644 --- a/src/include/utils/lsyscache.h +++ b/src/include/utils/lsyscache.h @@ -78,6 +78,7 @@ extern bool get_ordering_op_properties(Oid opno, Oid *opfamily, Oid *opcintype, int16 *strategy); extern Oid get_equality_op_for_ordering_op(Oid opno, bool *reverse); extern Oid get_ordering_op_for_equality_op(Oid opno, bool use_lhs_type); +extern List *get_opfamilies(Oid opno, Oid method); extern List *get_mergejoin_opfamilies(Oid opno); extern bool get_compatible_hash_operators(Oid opno, Oid *lhs_opno, Oid *rhs_opno); diff --git a/src/test/regress/expected/equivclass.out b/src/test/regress/expected/equivclass.out index 126f7047fed..ce4c9b11748 100644 --- a/src/test/regress/expected/equivclass.out +++ b/src/test/regress/expected/equivclass.out @@ -451,3 +451,48 @@ explain (costs off) -- this should not require a sort Filter: (f1 = 'foo'::name) (2 rows) +-- test equivalence filters +explain (costs off) + select * from ec0 + inner join ec1 on ec0.ff = ec1.ff + where ec0.ff between 1 and 10; + QUERY PLAN +------------------------------------------------------------ + Merge Join + Merge Cond: (ec0.ff = ec1.ff) + -> Sort + Sort Key: ec0.ff + -> Bitmap Heap Scan on ec0 + Recheck Cond: ((ff >= 1) AND (ff <= 10)) + -> Bitmap Index Scan on ec0_pkey + Index Cond: ((ff >= 1) AND (ff <= 10)) + -> Sort + Sort Key: ec1.ff + -> Bitmap Heap Scan on ec1 + Recheck Cond: ((ff >= 1) AND (ff <= 10)) + -> Bitmap Index Scan on ec1_pkey + Index Cond: ((ff >= 1) AND (ff <= 10)) +(14 rows) + +explain (costs off) + select * from ec0 + inner join ec1 on ec0.ff = ec1.ff + where ec1.ff between 1 and 10; + QUERY PLAN +------------------------------------------------------------ + Merge Join + Merge Cond: (ec0.ff = ec1.ff) + -> Sort + Sort Key: ec0.ff + -> Bitmap Heap Scan on ec0 + Recheck Cond: ((ff >= 1) AND (ff <= 10)) + -> Bitmap Index Scan on ec0_pkey + Index Cond: ((ff >= 1) AND (ff <= 10)) + -> Sort + Sort Key: ec1.ff + -> Bitmap Heap Scan on ec1 + Recheck Cond: ((ff >= 1) AND (ff <= 10)) + -> Bitmap Index Scan on ec1_pkey + Index Cond: ((ff >= 1) AND (ff <= 10)) +(14 rows) + -- 2.21.0
From f87f52db53b7caf56d0bb04138cb0357d2db2223 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com> Date: Tue, 1 Feb 2022 20:58:07 +0800 Subject: [PATCH v1 5/6] Support ScalarArrayOpExpr and perudoconstant on ef_filter. --- src/backend/optimizer/path/equivclass.c | 110 +++++++++++++++++------- src/backend/optimizer/plan/initsplan.c | 59 ++++++++----- src/include/nodes/pathnodes.h | 3 +- 3 files changed, 120 insertions(+), 52 deletions(-) diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c index 9ce0249b10d..4271bacb070 100644 --- a/src/backend/optimizer/path/equivclass.c +++ b/src/backend/optimizer/path/equivclass.c @@ -1386,23 +1386,55 @@ generate_base_implied_equalities_no_const(PlannerInfo *root, rightexpr = (Expr *) ef->ef_const; } - opno = get_opfamily_member(family, - exprType((Node *) leftexpr), - exprType((Node *) rightexpr), - strategy); + if (ef->ef_expr_type == T_OpExpr) + { + opno = get_opfamily_member(family, + exprType((Node *) leftexpr), + exprType((Node *) rightexpr), + strategy); - if (opno == InvalidOid) - continue; + if (opno == InvalidOid) + continue; - rinfo = process_implied_equality(root, opno, - ec->ec_collation, - leftexpr, - rightexpr, - bms_copy(ec->ec_relids), - bms_copy(cur_em->em_nullable_relids), - ec->ec_min_security, - ec->ec_below_outer_join, - false); + rinfo = process_implied_equality(root, opno, + ec->ec_collation, + leftexpr, + rightexpr, + bms_copy(ec->ec_relids), + bms_copy(cur_em->em_nullable_relids), + ec->ec_min_security, + ec->ec_below_outer_join, + false); + } + else + { + ScalarArrayOpExpr *arr_opexpr; + Relids relids; + + Assert(ef->ef_expr_type == T_ScalarArrayOpExpr); + arr_opexpr = makeNode(ScalarArrayOpExpr); + arr_opexpr->opno = ef->ef_opno; + arr_opexpr->opfuncid = get_opcode(ef->ef_opno); + arr_opexpr->useOr = true; /* See is_simple_filter_qual */ + arr_opexpr->args = list_make2(leftexpr, ef->ef_const); + arr_opexpr->inputcollid = ec->ec_collation; + + relids = pull_varnos(root, (Node *)arr_opexpr); + + rinfo = make_restrictinfo(root, + (Expr *) arr_opexpr, + true, + false, + false, /* perudoconstant */ + ec->ec_min_security, + relids, + NULL, + bms_copy(cur_em->em_nullable_relids)); + + // check_mergejoinable(rinfo); + + distribute_restrictinfo_to_rels(root, rinfo); + } rinfo->derived = ec; } @@ -2009,29 +2041,48 @@ distribute_filter_quals_to_eclass(PlannerInfo *root, List *quallist) */ foreach(l, quallist) { - OpExpr *opexpr = (OpExpr *) lfirst(l); - Expr *leftexpr = (Expr *) linitial(opexpr->args); - Expr *rightexpr = (Expr *) lsecond(opexpr->args); - Const *constexpr; - Expr *varexpr; + Expr *expr = lfirst(l); + + List *args; + Node *leftexpr; + Node *rightexpr; + Node *constexpr; + Node *varexpr; + Oid opno; + Relids exprrels; int relid; bool const_isleft; ListCell *l2; - /* - * Determine if the the OpExpr is in the form "expr op const" or - * "const op expr". - */ - if (IsA(leftexpr, Const)) + if (nodeTag(expr) == T_OpExpr) + { + OpExpr *opexpr = (OpExpr *) lfirst(l); + args = opexpr->args; + opno = opexpr->opno; + } + else if (nodeTag(expr) == T_ScalarArrayOpExpr) + { + ScalarArrayOpExpr *arr_expr = (ScalarArrayOpExpr *) lfirst(l); + args = arr_expr->args; + opno = arr_expr->opno; + } + else + { + Assert(false); + } + + leftexpr = linitial(args); + rightexpr = lsecond(args); + if (is_pseudo_constant_clause(leftexpr)) { - constexpr = (Const *) leftexpr; + constexpr = leftexpr; varexpr = rightexpr; const_isleft = true; } else { - constexpr = (Const *) rightexpr; + constexpr = rightexpr; varexpr = leftexpr; const_isleft = false; } @@ -2073,10 +2124,11 @@ distribute_filter_quals_to_eclass(PlannerInfo *root, List *quallist) { EquivalenceFilter *efilter; efilter = makeNode(EquivalenceFilter); - efilter->ef_const = (Const *) copyObject(constexpr); + efilter->ef_const = (Node *)copyObject(constexpr); efilter->ef_const_is_left = const_isleft; - efilter->ef_opno = opexpr->opno; + efilter->ef_opno = opno; efilter->ef_source_rel = relid; + efilter->ef_expr_type = nodeTag(expr); ec->ec_filters = lappend(ec->ec_filters, efilter); break; /* Onto the next eclass */ diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c index b219bee8567..a0f12198b8c 100644 --- a/src/backend/optimizer/plan/initsplan.c +++ b/src/backend/optimizer/plan/initsplan.c @@ -656,33 +656,42 @@ create_lateral_join_info(PlannerInfo *root) * Analyzes an OpExpr to determine if it may be useful as an * EquivalenceFilter. Returns true if the OpExpr may be of some use, or * false if it should not be used. + * */ static bool -is_simple_filter_qual(PlannerInfo *root, OpExpr *expr) +is_simple_filter_qual(PlannerInfo *root, Expr *expr) { - Expr *leftexpr; - Expr *rightexpr; - - if (!IsA(expr, OpExpr)) - return false; - - if (list_length(expr->args) != 2) - return false; - - leftexpr = (Expr *) linitial(expr->args); - rightexpr = (Expr *) lsecond(expr->args); - - /* XXX should we restrict these to simple Var op Const expressions? */ - if (IsA(leftexpr, Const)) + Node *leftexpr; + Node *rightexpr; + List *args = NIL; + + if (IsA(expr, OpExpr)) + args = castNode(OpExpr, expr)->args; + else if (IsA(expr, ScalarArrayOpExpr)) + { + ScalarArrayOpExpr *arr_opexpr = castNode(ScalarArrayOpExpr, expr); + if (!arr_opexpr->useOr) + /* Just support IN for now */ + return false; + args = arr_opexpr->args; + } + + if (list_length(args) != 2) + return false; + + leftexpr = linitial(args); + rightexpr = lsecond(args); + + if (is_pseudo_constant_clause(leftexpr)) { - if (bms_membership(pull_varnos(root, (Node *) rightexpr)) == BMS_SINGLETON && - !contain_volatile_functions((Node *) rightexpr)) + if (bms_membership(pull_varnos(root, rightexpr)) == BMS_SINGLETON && + !contain_volatile_functions(rightexpr)) return true; } - else if (IsA(rightexpr, Const)) + else if (is_pseudo_constant_clause(rightexpr)) { - if (bms_membership(pull_varnos(root, (Node *) leftexpr)) == BMS_SINGLETON && - !contain_volatile_functions((Node *) leftexpr)) + if (bms_membership(pull_varnos(root, leftexpr)) == BMS_SINGLETON && + !contain_volatile_functions(leftexpr)) return true; } @@ -739,7 +748,13 @@ deconstruct_jointree(PlannerInfo *root) result = deconstruct_recurse(root, (Node *) root->parse->jointree, false, &qualscope, &inner_join_rels, - &postponed_qual_list, &filter_qual_list); + &postponed_qual_list, + /* + * geqo option here is just used for testing + * during review stage, set enable_geqo to + * false to disable this feature. + */ + enable_geqo ? &filter_qual_list : NULL); /* Shouldn't be any leftover quals */ Assert(postponed_qual_list == NIL); @@ -2024,7 +2039,7 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause, distribute_restrictinfo_to_rels(root, restrictinfo); /* Check if the qual looks useful to harvest as an EquivalenceFilter */ - if (filter_qual_list != NULL && is_simple_filter_qual(root, (OpExpr *) clause)) + if (filter_qual_list != NULL && is_simple_filter_qual(root, (Expr *) clause)) *filter_qual_list = lappend(*filter_qual_list, clause); } diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h index dbe1775f96d..c1ef0066188 100644 --- a/src/include/nodes/pathnodes.h +++ b/src/include/nodes/pathnodes.h @@ -1035,10 +1035,11 @@ typedef struct EquivalenceFilter { NodeTag type; - Const *ef_const; /* the constant expression to filter on */ + Node *ef_const; /* pseudo const */ Oid ef_opno; /* Operator Oid of filter operator */ bool ef_const_is_left; /* Is the Const on the left of the OpExrp? */ Index ef_source_rel; /* relid of originating relation. */ + NodeTag ef_expr_type; } EquivalenceFilter; /* -- 2.21.0
From 9d8e8e11253e177f24436fbdbc8aff0715b5dd39 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com> Date: Tue, 1 Feb 2022 17:37:27 +0800 Subject: [PATCH v1 4/6] remove duplicated qual executing. --- src/backend/optimizer/path/equivclass.c | 22 +++++++++++++++++++ src/backend/optimizer/plan/createplan.c | 29 +++++++++++++++++++++++-- src/include/optimizer/paths.h | 2 ++ src/test/regress/parallel_schedule | 2 ++ 4 files changed, 53 insertions(+), 2 deletions(-) diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c index 6ed9e8c9064..9ce0249b10d 100644 --- a/src/backend/optimizer/path/equivclass.c +++ b/src/backend/optimizer/path/equivclass.c @@ -3350,6 +3350,28 @@ is_redundant_derived_clause(RestrictInfo *rinfo, List *clauselist) return false; } + +bool +is_correlated_derived_clause(RestrictInfo *rinfo, List *clauselist) +{ + EquivalenceClass *derived_ec = rinfo->derived; + ListCell *lc; + + /* Fail if it's not a potentially-derived clause from some EC */ + if (derived_ec == NULL) + return false; + + foreach(lc, clauselist) + { + RestrictInfo *otherrinfo = (RestrictInfo *) lfirst(lc); + + if (otherrinfo->parent_ec == derived_ec) + return true; + } + + return false; +} + /* * is_redundant_with_indexclauses * Test whether rinfo is redundant with any clause in the IndexClause diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c index cd6d72c7633..03c5207ede0 100644 --- a/src/backend/optimizer/plan/createplan.c +++ b/src/backend/optimizer/plan/createplan.c @@ -556,11 +556,14 @@ static Plan * create_scan_plan(PlannerInfo *root, Path *best_path, int flags) { RelOptInfo *rel = best_path->parent; - List *scan_clauses; + List *scan_clauses = NIL; List *gating_clauses; List *tlist; Plan *plan; + List *ppi_clauses = best_path->param_info ? best_path->param_info->ppi_clauses : NIL; + ListCell *lc; + /* * Extract the relevant restriction clauses from the parent relation. The * executor must apply all these restrictions during the scan, except for @@ -591,8 +594,18 @@ create_scan_plan(PlannerInfo *root, Path *best_path, int flags) * For paranoia's sake, don't modify the stored baserestrictinfo list. */ if (best_path->param_info) - scan_clauses = list_concat_copy(scan_clauses, + { + List *stripped_quals = NIL; + foreach(lc, rel->baserestrictinfo) + { + RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc); + + if (!is_correlated_derived_clause(rinfo, ppi_clauses)) + stripped_quals = lappend(stripped_quals, rinfo); + } + scan_clauses = list_concat_copy(stripped_quals, best_path->param_info->ppi_clauses); + } /* * Detect whether we have any pseudoconstant quals to deal with. Then, if @@ -4912,6 +4925,7 @@ fix_indexqual_references(PlannerInfo *root, IndexPath *index_path, List *stripped_indexquals; List *fixed_indexquals; ListCell *lc; + List *ppi_clauses = index_path->path.param_info ? index_path->path.param_info->ppi_clauses : NIL; stripped_indexquals = fixed_indexquals = NIL; @@ -4921,6 +4935,17 @@ fix_indexqual_references(PlannerInfo *root, IndexPath *index_path, int indexcol = iclause->indexcol; ListCell *lc2; + if (is_correlated_derived_clause(iclause->rinfo, ppi_clauses)) + { + /* + * bitmapscan will read this indexquals as well. so we can't just igrore + * it for now. we can totally delete it. + */ + index_path->indexclauses = foreach_delete_current(index_path->indexclauses, lc); + continue; + } + + foreach(lc2, iclause->indexquals) { RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc2); diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h index ce2aac7d3aa..08d8612cdba 100644 --- a/src/include/optimizer/paths.h +++ b/src/include/optimizer/paths.h @@ -188,6 +188,8 @@ extern bool eclass_useful_for_merging(PlannerInfo *root, extern bool is_redundant_derived_clause(RestrictInfo *rinfo, List *clauselist); extern bool is_redundant_with_indexclauses(RestrictInfo *rinfo, List *indexclauses); +extern bool is_correlated_derived_clause(RestrictInfo *rinfo, List *clauselist); + /* * pathkeys.c diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule index 861c30a73ae..b071324b961 100644 --- a/src/test/regress/parallel_schedule +++ b/src/test/regress/parallel_schedule @@ -134,3 +134,5 @@ test: fast_default # run stats by itself because its delay may be insufficient under heavy load test: stats + +test: ec_filter \ No newline at end of file -- 2.21.0
From d34e73aad40e4b54ce61dce65da810abfb608b23 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com> Date: Tue, 1 Feb 2022 14:54:07 +0800 Subject: [PATCH v1 2/6] support set_xxx_size with derived_clauses ignored. --- src/backend/optimizer/path/clausesel.c | 34 +++++++++++++++++-------- src/backend/optimizer/path/costsize.c | 25 ++++++++++-------- src/backend/optimizer/path/equivclass.c | 20 ++++++++------- src/backend/optimizer/util/inherit.c | 31 ++++++++++++---------- src/backend/statistics/dependencies.c | 3 ++- src/backend/statistics/extended_stats.c | 5 ++-- src/include/nodes/pathnodes.h | 1 + src/include/optimizer/optimizer.h | 6 +++-- 8 files changed, 77 insertions(+), 48 deletions(-) diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c index 06f836308d0..8961e66ea4e 100644 --- a/src/backend/optimizer/path/clausesel.c +++ b/src/backend/optimizer/path/clausesel.c @@ -106,7 +106,7 @@ clauselist_selectivity(PlannerInfo *root, SpecialJoinInfo *sjinfo) { return clauselist_selectivity_ext(root, clauses, varRelid, - jointype, sjinfo, true); + jointype, sjinfo, true, true); } /* @@ -121,7 +121,8 @@ clauselist_selectivity_ext(PlannerInfo *root, int varRelid, JoinType jointype, SpecialJoinInfo *sjinfo, - bool use_extended_stats) + bool use_extended_stats, + bool include_derived) { Selectivity s1 = 1.0; RelOptInfo *rel; @@ -137,7 +138,8 @@ clauselist_selectivity_ext(PlannerInfo *root, if (list_length(clauses) == 1) return clause_selectivity_ext(root, (Node *) linitial(clauses), varRelid, jointype, sjinfo, - use_extended_stats); + use_extended_stats, + include_derived); /* * Determine if these clauses reference a single relation. If so, and if @@ -183,7 +185,7 @@ clauselist_selectivity_ext(PlannerInfo *root, /* Compute the selectivity of this clause in isolation */ s2 = clause_selectivity_ext(root, clause, varRelid, jointype, sjinfo, - use_extended_stats); + use_extended_stats, include_derived); /* * Check for being passed a RestrictInfo. @@ -412,7 +414,9 @@ clauselist_selectivity_or(PlannerInfo *root, continue; s2 = clause_selectivity_ext(root, (Node *) lfirst(lc), varRelid, - jointype, sjinfo, use_extended_stats); + jointype, sjinfo, use_extended_stats, + true /* we never push a derived under or clause */ + ); s1 = s1 + s2 - s1 * s2; } @@ -694,7 +698,7 @@ clause_selectivity(PlannerInfo *root, SpecialJoinInfo *sjinfo) { return clause_selectivity_ext(root, clause, varRelid, - jointype, sjinfo, true); + jointype, sjinfo, true, true); } /* @@ -709,7 +713,8 @@ clause_selectivity_ext(PlannerInfo *root, int varRelid, JoinType jointype, SpecialJoinInfo *sjinfo, - bool use_extended_stats) + bool use_extended_stats, + bool include_derived) { Selectivity s1 = 0.5; /* default for any unhandled clause type */ RestrictInfo *rinfo = NULL; @@ -742,6 +747,9 @@ clause_selectivity_ext(PlannerInfo *root, if (rinfo->norm_selec > 1) return (Selectivity) 1.0; + if (rinfo->derived && !include_derived) + return (Selectivity) 1.0; + /* * If possible, cache the result of the selectivity calculation for * the clause. We can cache if varRelid is zero or the clause @@ -830,7 +838,8 @@ clause_selectivity_ext(PlannerInfo *root, varRelid, jointype, sjinfo, - use_extended_stats); + use_extended_stats, + include_derived); } else if (is_andclause(clause)) { @@ -840,7 +849,8 @@ clause_selectivity_ext(PlannerInfo *root, varRelid, jointype, sjinfo, - use_extended_stats); + use_extended_stats, + include_derived); } else if (is_orclause(clause)) { @@ -959,7 +969,8 @@ clause_selectivity_ext(PlannerInfo *root, varRelid, jointype, sjinfo, - use_extended_stats); + use_extended_stats, + include_derived); } else if (IsA(clause, CoerceToDomain)) { @@ -969,7 +980,8 @@ clause_selectivity_ext(PlannerInfo *root, varRelid, jointype, sjinfo, - use_extended_stats); + use_extended_stats, + include_derived); } else { diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c index 8dc7dd4ca26..9e303877af7 100644 --- a/src/backend/optimizer/path/costsize.c +++ b/src/backend/optimizer/path/costsize.c @@ -4928,11 +4928,13 @@ set_baserel_size_estimates(PlannerInfo *root, RelOptInfo *rel) Assert(rel->relid > 0); nrows = rel->tuples * - clauselist_selectivity(root, - rel->baserestrictinfo, - 0, - JOIN_INNER, - NULL); + clauselist_selectivity_ext(root, + rel->baserestrictinfo, + 0, + JOIN_INNER, + NULL, + true, + false /* include_derived */); rel->rows = clamp_row_est(nrows); @@ -4964,11 +4966,14 @@ get_parameterized_baserel_size(PlannerInfo *root, RelOptInfo *rel, */ allclauses = list_concat_copy(param_clauses, rel->baserestrictinfo); nrows = rel->tuples * - clauselist_selectivity(root, - allclauses, - rel->relid, /* do not use 0! */ - JOIN_INNER, - NULL); + clauselist_selectivity_ext(root, + allclauses, + rel->relid, /* do not use 0! */ + JOIN_INNER, + NULL, + true, + false /* doesn't include the derived clause */ + ); nrows = clamp_row_est(nrows); /* For safety, make sure result is not more than the base estimate */ if (nrows > rel->rows) diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c index f9ae2785d60..6ed9e8c9064 100644 --- a/src/backend/optimizer/path/equivclass.c +++ b/src/backend/optimizer/path/equivclass.c @@ -1366,6 +1366,7 @@ generate_base_implied_equalities_no_const(PlannerInfo *root, int strategy; Oid opno; Oid family; + RestrictInfo *rinfo; if (ef->ef_source_rel == relid) continue; @@ -1393,15 +1394,16 @@ generate_base_implied_equalities_no_const(PlannerInfo *root, if (opno == InvalidOid) continue; - process_implied_equality(root, opno, - ec->ec_collation, - leftexpr, - rightexpr, - bms_copy(ec->ec_relids), - bms_copy(cur_em->em_nullable_relids), - ec->ec_min_security, - ec->ec_below_outer_join, - false); + rinfo = process_implied_equality(root, opno, + ec->ec_collation, + leftexpr, + rightexpr, + bms_copy(ec->ec_relids), + bms_copy(cur_em->em_nullable_relids), + ec->ec_min_security, + ec->ec_below_outer_join, + false); + rinfo->derived = ec; } prev_ems[relid] = cur_em; diff --git a/src/backend/optimizer/util/inherit.c b/src/backend/optimizer/util/inherit.c index 7e134822f36..8a5863c4da4 100644 --- a/src/backend/optimizer/util/inherit.c +++ b/src/backend/optimizer/util/inherit.c @@ -797,6 +797,7 @@ apply_child_basequals(PlannerInfo *root, RelOptInfo *parentrel, { Node *onecq = (Node *) lfirst(lc2); bool pseudoconstant; + RestrictInfo *child_rinfo; /* check for pseudoconstant (no Vars or volatile functions) */ pseudoconstant = @@ -807,15 +808,19 @@ apply_child_basequals(PlannerInfo *root, RelOptInfo *parentrel, /* tell createplan.c to check for gating quals */ root->hasPseudoConstantQuals = true; } + + child_rinfo = make_restrictinfo(root, + (Expr *) onecq, + rinfo->is_pushed_down, + rinfo->outerjoin_delayed, + pseudoconstant, + rinfo->security_level, + NULL, NULL, NULL); + + child_rinfo->derived = rinfo->derived; /* reconstitute RestrictInfo with appropriate properties */ - childquals = lappend(childquals, - make_restrictinfo(root, - (Expr *) onecq, - rinfo->is_pushed_down, - rinfo->outerjoin_delayed, - pseudoconstant, - rinfo->security_level, - NULL, NULL, NULL)); + childquals = lappend(childquals, child_rinfo); + /* track minimum security level among child quals */ cq_min_security = Min(cq_min_security, rinfo->security_level); } @@ -844,13 +849,13 @@ apply_child_basequals(PlannerInfo *root, RelOptInfo *parentrel, foreach(lc2, qualset) { Expr *qual = (Expr *) lfirst(lc2); + RestrictInfo *rinfo = make_restrictinfo(root, qual, + true, false, false, + security_level, + NULL, NULL, NULL); /* not likely that we'd see constants here, so no check */ - childquals = lappend(childquals, - make_restrictinfo(root, qual, - true, false, false, - security_level, - NULL, NULL, NULL)); + childquals = lappend(childquals, rinfo); cq_min_security = Min(cq_min_security, security_level); } security_level++; diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c index 34326d55619..838446a220d 100644 --- a/src/backend/statistics/dependencies.c +++ b/src/backend/statistics/dependencies.c @@ -1076,7 +1076,8 @@ clauselist_apply_dependencies(PlannerInfo *root, List *clauses, } simple_sel = clauselist_selectivity_ext(root, attr_clauses, varRelid, - jointype, sjinfo, false); + jointype, sjinfo, false, + true /* probably no reasonable */); attr_sel[attidx++] = simple_sel; } diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c index ca48395d5c5..38836f58c4e 100644 --- a/src/backend/statistics/extended_stats.c +++ b/src/backend/statistics/extended_stats.c @@ -1870,7 +1870,8 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli * columns/clauses. */ simple_sel = clause_selectivity_ext(root, clause, varRelid, - jointype, sjinfo, false); + jointype, sjinfo, false, + true); overlap_simple_sel = simple_or_sel * simple_sel; @@ -1943,7 +1944,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli */ simple_sel = clauselist_selectivity_ext(root, stat_clauses, varRelid, jointype, - sjinfo, false); + sjinfo, false, true); /* * Multi-column estimate using MCV statistics, along with base and diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h index e73fef057a4..42368e10b8e 100644 --- a/src/include/nodes/pathnodes.h +++ b/src/include/nodes/pathnodes.h @@ -2165,6 +2165,7 @@ typedef struct RestrictInfo /* hash equality operators used for memoize nodes, else InvalidOid */ Oid left_hasheqoperator; Oid right_hasheqoperator; + EquivalenceClass *derived; } RestrictInfo; /* diff --git a/src/include/optimizer/optimizer.h b/src/include/optimizer/optimizer.h index 6b8ee0c69fa..a3385ae51ff 100644 --- a/src/include/optimizer/optimizer.h +++ b/src/include/optimizer/optimizer.h @@ -68,7 +68,8 @@ extern Selectivity clause_selectivity_ext(PlannerInfo *root, int varRelid, JoinType jointype, SpecialJoinInfo *sjinfo, - bool use_extended_stats); + bool use_extended_stats, + bool include_derived); extern Selectivity clauselist_selectivity(PlannerInfo *root, List *clauses, int varRelid, @@ -79,7 +80,8 @@ extern Selectivity clauselist_selectivity_ext(PlannerInfo *root, int varRelid, JoinType jointype, SpecialJoinInfo *sjinfo, - bool use_extended_stats); + bool use_extended_stats, + bool include_derived); /* in path/costsize.c: */ -- 2.21.0
From ce6623ac58687c67e3e7576c212b4e352bc786b9 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com> Date: Tue, 1 Feb 2022 22:57:13 +0800 Subject: [PATCH v1 6/6] adding some test cases for this feature and fix the existing case because of this feature. --- src/test/regress/expected/equivclass.out | 54 ++++----- src/test/regress/expected/join.out | 33 +++--- src/test/regress/expected/partition_join.out | 106 ++++++++++-------- src/test/regress/expected/partition_prune.out | 56 ++------- src/test/regress/sql/equivclass.sql | 12 ++ 5 files changed, 118 insertions(+), 143 deletions(-) diff --git a/src/test/regress/expected/equivclass.out b/src/test/regress/expected/equivclass.out index ce4c9b11748..a5042c3cbc3 100644 --- a/src/test/regress/expected/equivclass.out +++ b/src/test/regress/expected/equivclass.out @@ -456,43 +456,29 @@ explain (costs off) select * from ec0 inner join ec1 on ec0.ff = ec1.ff where ec0.ff between 1 and 10; - QUERY PLAN ------------------------------------------------------------- - Merge Join - Merge Cond: (ec0.ff = ec1.ff) - -> Sort - Sort Key: ec0.ff - -> Bitmap Heap Scan on ec0 - Recheck Cond: ((ff >= 1) AND (ff <= 10)) - -> Bitmap Index Scan on ec0_pkey - Index Cond: ((ff >= 1) AND (ff <= 10)) - -> Sort - Sort Key: ec1.ff - -> Bitmap Heap Scan on ec1 - Recheck Cond: ((ff >= 1) AND (ff <= 10)) - -> Bitmap Index Scan on ec1_pkey - Index Cond: ((ff >= 1) AND (ff <= 10)) -(14 rows) + QUERY PLAN +------------------------------------------------------------------ + Nested Loop + -> Bitmap Heap Scan on ec1 + Recheck Cond: ((ff >= 1) AND (ff <= 10)) + -> Bitmap Index Scan on ec1_pkey + Index Cond: ((ff >= 1) AND (ff <= 10)) + -> Index Scan using ec0_pkey on ec0 + Index Cond: ((ff = ec1.ff) AND (ff >= 1) AND (ff <= 10)) +(7 rows) explain (costs off) select * from ec0 inner join ec1 on ec0.ff = ec1.ff where ec1.ff between 1 and 10; - QUERY PLAN ------------------------------------------------------------- - Merge Join - Merge Cond: (ec0.ff = ec1.ff) - -> Sort - Sort Key: ec0.ff - -> Bitmap Heap Scan on ec0 - Recheck Cond: ((ff >= 1) AND (ff <= 10)) - -> Bitmap Index Scan on ec0_pkey - Index Cond: ((ff >= 1) AND (ff <= 10)) - -> Sort - Sort Key: ec1.ff - -> Bitmap Heap Scan on ec1 - Recheck Cond: ((ff >= 1) AND (ff <= 10)) - -> Bitmap Index Scan on ec1_pkey - Index Cond: ((ff >= 1) AND (ff <= 10)) -(14 rows) + QUERY PLAN +------------------------------------------------------------------ + Nested Loop + -> Bitmap Heap Scan on ec0 + Recheck Cond: ((ff >= 1) AND (ff <= 10)) + -> Bitmap Index Scan on ec0_pkey + Index Cond: ((ff >= 1) AND (ff <= 10)) + -> Index Scan using ec1_pkey on ec1 + Index Cond: ((ff = ec0.ff) AND (ff >= 1) AND (ff <= 10)) +(7 rows) diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out index d5b5b775fdd..7df4f93856e 100644 --- a/src/test/regress/expected/join.out +++ b/src/test/regress/expected/join.out @@ -3337,7 +3337,7 @@ where t1.unique2 < 42 and t1.stringu1 > t2.stringu2; Join Filter: (t1.stringu1 > t2.stringu2) -> Nested Loop -> Seq Scan on int4_tbl i1 - Filter: (f1 = 0) + Filter: ((f1 = 0) AND (11 < 42)) -> Index Scan using tenk1_unique2 on tenk1 t1 Index Cond: ((unique2 = (11)) AND (unique2 < 42)) -> Index Scan using tenk1_unique1 on tenk1 t2 @@ -6496,9 +6496,10 @@ where j1.id1 % 1000 = 1 and j2.id1 % 1000 = 1 and j2.id1 = any (array[1]); Merge Cond: (j1.id1 = j2.id1) Join Filter: (j1.id2 = j2.id2) -> Index Scan using j1_id1_idx on j1 + Index Cond: (id1 = ANY ('{1}'::integer[])) -> Index Scan using j2_id1_idx on j2 Index Cond: (id1 = ANY ('{1}'::integer[])) -(6 rows) +(7 rows) select * from j1 inner join j2 on j1.id1 = j2.id1 and j1.id2 = j2.id2 @@ -6513,16 +6514,17 @@ where j1.id1 % 1000 = 1 and j2.id1 % 1000 = 1 and j2.id1 = any (array[1]); explain (costs off) select * from j1 inner join j2 on j1.id1 = j2.id1 and j1.id2 = j2.id2 where j1.id1 % 1000 = 1 and j2.id1 % 1000 = 1 and j2.id1 >= any (array[1,5]); - QUERY PLAN -------------------------------------------------------- + QUERY PLAN +--------------------------------------------------------- Merge Join - Merge Cond: (j1.id1 = j2.id1) - Join Filter: (j1.id2 = j2.id2) - -> Index Scan using j1_id1_idx on j1 + Merge Cond: ((j1.id1 = j2.id1) AND (j1.id2 = j2.id2)) + -> Index Only Scan using j1_pkey on j1 + Index Cond: (id1 >= ANY ('{1,5}'::integer[])) + Filter: ((id1 % 1000) = 1) -> Index Only Scan using j2_pkey on j2 Index Cond: (id1 >= ANY ('{1,5}'::integer[])) Filter: ((id1 % 1000) = 1) -(7 rows) +(8 rows) select * from j1 inner join j2 on j1.id1 = j2.id1 and j1.id2 = j2.id2 @@ -6558,6 +6560,7 @@ where exists (select 1 from tenk1 t3 Group Key: t3.thousand, t3.tenthous -> Index Only Scan using tenk1_thous_tenthous on public.tenk1 t3 Output: t3.thousand, t3.tenthous + Index Cond: (t3.thousand < 1) -> Hash Output: t1.unique1 -> Index Only Scan using onek_unique1 on public.onek t1 @@ -6566,7 +6569,7 @@ where exists (select 1 from tenk1 t3 -> Index Only Scan using tenk1_hundred on public.tenk1 t2 Output: t2.hundred Index Cond: (t2.hundred = t3.tenthous) -(18 rows) +(19 rows) -- ... unless it actually is unique create table j3 as select unique1, tenthous from onek; @@ -6578,18 +6581,18 @@ from onek t1, tenk1 t2 where exists (select 1 from j3 where j3.unique1 = t1.unique1 and j3.tenthous = t2.hundred) and t1.unique1 < 1; - QUERY PLAN ------------------------------------------------------------------------- + QUERY PLAN +---------------------------------------------------------------------------- Nested Loop Output: t1.unique1, t2.hundred -> Nested Loop Output: t1.unique1, j3.tenthous - -> Index Only Scan using onek_unique1 on public.onek t1 - Output: t1.unique1 - Index Cond: (t1.unique1 < 1) -> Index Only Scan using j3_unique1_tenthous_idx on public.j3 Output: j3.unique1, j3.tenthous - Index Cond: (j3.unique1 = t1.unique1) + Index Cond: (j3.unique1 < 1) + -> Index Only Scan using onek_unique1 on public.onek t1 + Output: t1.unique1 + Index Cond: ((t1.unique1 = j3.unique1) AND (t1.unique1 < 1)) -> Index Only Scan using tenk1_hundred on public.tenk1 t2 Output: t2.hundred Index Cond: (t2.hundred = j3.tenthous) diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out index bb5b7c47a45..c12351a3b0a 100644 --- a/src/test/regress/expected/partition_join.out +++ b/src/test/regress/expected/partition_join.out @@ -186,17 +186,17 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM (SELECT 50 phv, * FROM prt1 WHERE prt1.b = 0) -- Join with pruned partitions from joining relations EXPLAIN (COSTS OFF) SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.b AND t1.a < 450 AND t2.b > 250 AND t1.b = 0 ORDER BY t1.a, t2.b; - QUERY PLAN ------------------------------------------------------ + QUERY PLAN +------------------------------------------------------------------- Sort Sort Key: t1.a -> Hash Join Hash Cond: (t2.b = t1.a) -> Seq Scan on prt2_p2 t2 - Filter: (b > 250) + Filter: ((b > 250) AND (b < 450)) -> Hash -> Seq Scan on prt1_p2 t1 - Filter: ((a < 450) AND (b = 0)) + Filter: ((a < 450) AND (a > 250) AND (b = 0)) (9 rows) SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.b AND t1.a < 450 AND t2.b > 250 AND t1.b = 0 ORDER BY t1.a, t2.b; @@ -3100,16 +3100,18 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 INNER JOIN prt2_adv t2 ON (t1.a = -> Hash Join Hash Cond: (t2_1.b = t1_1.a) -> Seq Scan on prt2_adv_p1 t2_1 + Filter: (b < 300) -> Hash -> Seq Scan on prt1_adv_p1 t1_1 Filter: ((a < 300) AND (b = 0)) -> Hash Join Hash Cond: (t2_2.b = t1_2.a) -> Seq Scan on prt2_adv_p2 t2_2 + Filter: (b < 300) -> Hash -> Seq Scan on prt1_adv_p2 t1_2 Filter: ((a < 300) AND (b = 0)) -(15 rows) +(17 rows) SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 INNER JOIN prt2_adv t2 ON (t1.a = t2.b) WHERE t1.a < 300 AND t1.b = 0 ORDER BY t1.a, t2.b; a | c | b | c @@ -3139,16 +3141,19 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 INNER JOIN prt2_adv t2 ON (t1.a = -> Hash Join Hash Cond: (t2_1.b = t1_1.a) -> Seq Scan on prt2_adv_p1 t2_1 + Filter: ((b >= 100) AND (b < 300)) -> Hash -> Seq Scan on prt1_adv_p1 t1_1 Filter: ((a >= 100) AND (a < 300) AND (b = 0)) - -> Hash Join - Hash Cond: (t2_2.b = t1_2.a) - -> Seq Scan on prt2_adv_p2 t2_2 - -> Hash + -> Merge Join + Merge Cond: (t2_2.b = t1_2.a) + -> Index Scan using prt2_adv_p2_b_idx on prt2_adv_p2 t2_2 + Index Cond: ((b >= 100) AND (b < 300)) + -> Sort + Sort Key: t1_2.a -> Seq Scan on prt1_adv_p2 t1_2 Filter: ((a >= 100) AND (a < 300) AND (b = 0)) -(15 rows) +(18 rows) SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 INNER JOIN prt2_adv t2 ON (t1.a = t2.b) WHERE t1.a >= 100 AND t1.a < 300 AND t1.b = 0 ORDER BY t1.a, t2.b; a | c | b | c @@ -4444,16 +4449,18 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a = -> Hash Join Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c)) -> Seq Scan on plt2_adv_p3 t2_1 + Filter: (c = ANY ('{0003,0004,0005}'::text[])) -> Hash -> Seq Scan on plt1_adv_p3 t1_1 Filter: ((b < 10) AND (c = ANY ('{0003,0004,0005}'::text[]))) -> Hash Join Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c)) -> Seq Scan on plt2_adv_p4 t2_2 + Filter: (c = ANY ('{0003,0004,0005}'::text[])) -> Hash -> Seq Scan on plt1_adv_p4 t1_2 Filter: ((b < 10) AND (c = ANY ('{0003,0004,0005}'::text[]))) -(15 rows) +(17 rows) SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c = t2.c) WHERE t1.c IN ('0003', '0004', '0005') AND t1.b < 10 ORDER BY t1.a; a | c | a | c @@ -4497,16 +4504,18 @@ SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a = -> Hash Join Hash Cond: ((t2_1.a = t1_1.a) AND (t2_1.c = t1_1.c)) -> Seq Scan on plt2_adv_p3 t2_1 + Filter: (c = ANY ('{0003,0004,0005}'::text[])) -> Hash -> Seq Scan on plt1_adv_p3 t1_1 Filter: ((b < 10) AND (c = ANY ('{0003,0004,0005}'::text[]))) -> Hash Join Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.c = t1_2.c)) -> Seq Scan on plt2_adv_p4 t2_2 + Filter: (c = ANY ('{0003,0004,0005}'::text[])) -> Hash -> Seq Scan on plt1_adv_p4 t1_2 Filter: ((b < 10) AND (c = ANY ('{0003,0004,0005}'::text[]))) -(15 rows) +(17 rows) SELECT t1.a, t1.c, t2.a, t2.c FROM plt1_adv t1 INNER JOIN plt2_adv t2 ON (t1.a = t2.a AND t1.c = t2.c) WHERE t1.c IN ('0003', '0004', '0005') AND t1.b < 10 ORDER BY t1.a; a | c | a | c @@ -4692,27 +4701,32 @@ SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2 Filter: ((b >= 125) AND (b < 225)) -> Hash -> Seq Scan on beta_neg_p1 t2_1 + Filter: ((b >= 125) AND (b < 225)) -> Hash Join - Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.b = t1_2.b)) - -> Seq Scan on beta_neg_p2 t2_2 + Hash Cond: ((t1_2.a = t2_2.a) AND (t1_2.b = t2_2.b)) + -> Seq Scan on alpha_neg_p2 t1_2 + Filter: ((b >= 125) AND (b < 225)) -> Hash - -> Seq Scan on alpha_neg_p2 t1_2 + -> Seq Scan on beta_neg_p2 t2_2 Filter: ((b >= 125) AND (b < 225)) -> Hash Join - Hash Cond: ((t2_4.a = t1_4.a) AND (t2_4.b = t1_4.b)) + Hash Cond: ((t1_4.a = t2_4.a) AND (t1_4.b = t2_4.b)) -> Append - -> Seq Scan on beta_pos_p1 t2_4 - -> Seq Scan on beta_pos_p2 t2_5 - -> Seq Scan on beta_pos_p3 t2_6 + -> Seq Scan on alpha_pos_p1 t1_4 + Filter: ((b >= 125) AND (b < 225)) + -> Seq Scan on alpha_pos_p2 t1_5 + Filter: ((b >= 125) AND (b < 225)) + -> Seq Scan on alpha_pos_p3 t1_6 + Filter: ((b >= 125) AND (b < 225)) -> Hash -> Append - -> Seq Scan on alpha_pos_p1 t1_4 + -> Seq Scan on beta_pos_p1 t2_4 Filter: ((b >= 125) AND (b < 225)) - -> Seq Scan on alpha_pos_p2 t1_5 + -> Seq Scan on beta_pos_p2 t2_5 Filter: ((b >= 125) AND (b < 225)) - -> Seq Scan on alpha_pos_p3 t1_6 + -> Seq Scan on beta_pos_p3 t2_6 Filter: ((b >= 125) AND (b < 225)) -(29 rows) +(34 rows) SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2.b) WHERE t1.b >= 125 AND t1.b < 225 ORDER BY t1.a, t1.b; a | b | c | a | b | c @@ -4761,8 +4775,8 @@ SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2 EXPLAIN (COSTS OFF) SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.c = t2.c) WHERE ((t1.b >= 100 AND t1.b < 110) OR (t1.b >= 200 AND t1.b < 210)) AND ((t2.b >= 100 AND t2.b < 110) OR (t2.b >= 200 AND t2.b < 210)) AND t1.c IN ('0004', '0009') ORDER BY t1.a, t1.b, t2.b; - QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------- + QUERY PLAN +-------------------------------------------------------------------------------------------------------------------------------------------- Sort Sort Key: t1.a, t1.b, t2.b -> Append @@ -4776,21 +4790,21 @@ SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.c = t2 -> Hash -> Append -> Seq Scan on beta_neg_p1 t2_2 - Filter: (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210))) + Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))) -> Seq Scan on beta_neg_p2 t2_3 - Filter: (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210))) + Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))) -> Nested Loop Join Filter: ((t1_4.a = t2_4.a) AND (t1_4.c = t2_4.c)) -> Seq Scan on alpha_pos_p2 t1_4 Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))) -> Seq Scan on beta_pos_p2 t2_4 - Filter: (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210))) + Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))) -> Nested Loop Join Filter: ((t1_5.a = t2_5.a) AND (t1_5.c = t2_5.c)) -> Seq Scan on alpha_pos_p3 t1_5 Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))) -> Seq Scan on beta_pos_p3 t2_5 - Filter: (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210))) + Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))) (28 rows) SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.c = t2.c) WHERE ((t1.b >= 100 AND t1.b < 110) OR (t1.b >= 200 AND t1.b < 210)) AND ((t2.b >= 100 AND t2.b < 110) OR (t2.b >= 200 AND t2.b < 210)) AND t1.c IN ('0004', '0009') ORDER BY t1.a, t1.b, t2.b; @@ -4816,38 +4830,40 @@ SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.c = t2 EXPLAIN (COSTS OFF) SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c) WHERE ((t1.b >= 100 AND t1.b < 110) OR (t1.b >= 200 AND t1.b < 210)) AND ((t2.b >= 100 AND t2.b < 110) OR (t2.b >= 200 AND t2.b < 210)) AND t1.c IN ('0004', '0009') ORDER BY t1.a, t1.b; - QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------- + QUERY PLAN +-------------------------------------------------------------------------------------------------------------------------------------- Sort Sort Key: t1.a, t1.b -> Append - -> Hash Join - Hash Cond: ((t1_1.a = t2_1.a) AND (t1_1.b = t2_1.b) AND (t1_1.c = t2_1.c)) - -> Seq Scan on alpha_neg_p1 t1_1 - Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))) - -> Hash + -> Merge Join + Merge Cond: ((t1_1.a = t2_1.a) AND (t1_1.b = t2_1.b) AND (t1_1.c = t2_1.c)) + -> Sort + Sort Key: t1_1.a, t1_1.b, t1_1.c + -> Seq Scan on alpha_neg_p1 t1_1 + Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))) + -> Sort + Sort Key: t2_1.a, t2_1.b, t2_1.c -> Seq Scan on beta_neg_p1 t2_1 - Filter: (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210))) - -> Hash Join - Hash Cond: ((t1_2.a = t2_2.a) AND (t1_2.b = t2_2.b) AND (t1_2.c = t2_2.c)) + Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))) + -> Nested Loop + Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.b = t2_2.b) AND (t1_2.c = t2_2.c)) + -> Seq Scan on beta_neg_p2 t2_2 + Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))) -> Seq Scan on alpha_neg_p2 t1_2 Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))) - -> Hash - -> Seq Scan on beta_neg_p2 t2_2 - Filter: (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210))) -> Nested Loop Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.b = t2_3.b) AND (t1_3.c = t2_3.c)) -> Seq Scan on alpha_pos_p2 t1_3 Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))) -> Seq Scan on beta_pos_p2 t2_3 - Filter: (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210))) + Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))) -> Nested Loop Join Filter: ((t1_4.a = t2_4.a) AND (t1_4.b = t2_4.b) AND (t1_4.c = t2_4.c)) -> Seq Scan on alpha_pos_p3 t1_4 Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))) -> Seq Scan on beta_pos_p3 t2_4 - Filter: (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210))) -(29 rows) + Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))) +(31 rows) SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c) WHERE ((t1.b >= 100 AND t1.b < 110) OR (t1.b >= 200 AND t1.b < 210)) AND ((t2.b >= 100 AND t2.b < 110) OR (t2.b >= 200 AND t2.b < 210)) AND t1.c IN ('0004', '0009') ORDER BY t1.a, t1.b; a | b | c | a | b | c diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out index 7555764c779..3fb515cdd9d 100644 --- a/src/test/regress/expected/partition_prune.out +++ b/src/test/regress/expected/partition_prune.out @@ -2104,19 +2104,7 @@ select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on Index Cond: (a = a.a) -> Index Scan using ab_a1_b3_a_idx on ab_a1_b3 ab_3 (actual rows=N loops=N) Index Cond: (a = a.a) - -> Index Scan using ab_a2_b1_a_idx on ab_a2_b1 ab_4 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a2_b2_a_idx on ab_a2_b2 ab_5 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a2_b3_a_idx on ab_a2_b3 ab_6 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a3_b1_a_idx on ab_a3_b1 ab_7 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a3_b2_a_idx on ab_a3_b2 ab_8 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a3_b3_a_idx on ab_a3_b3 ab_9 (never executed) - Index Cond: (a = a.a) -(27 rows) +(15 rows) -- Ensure the same partitions are pruned when we make the nested loop -- parameter an Expr rather than a plain Param. @@ -2171,19 +2159,13 @@ select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on Index Cond: (a = a.a) -> Index Scan using ab_a1_b3_a_idx on ab_a1_b3 ab_3 (actual rows=N loops=N) Index Cond: (a = a.a) - -> Index Scan using ab_a2_b1_a_idx on ab_a2_b1 ab_4 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a2_b2_a_idx on ab_a2_b2 ab_5 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a2_b3_a_idx on ab_a2_b3 ab_6 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a3_b1_a_idx on ab_a3_b1 ab_7 (actual rows=N loops=N) + -> Index Scan using ab_a3_b1_a_idx on ab_a3_b1 ab_4 (actual rows=N loops=N) Index Cond: (a = a.a) - -> Index Scan using ab_a3_b2_a_idx on ab_a3_b2 ab_8 (actual rows=N loops=N) + -> Index Scan using ab_a3_b2_a_idx on ab_a3_b2 ab_5 (actual rows=N loops=N) Index Cond: (a = a.a) - -> Index Scan using ab_a3_b3_a_idx on ab_a3_b3 ab_9 (actual rows=N loops=N) + -> Index Scan using ab_a3_b3_a_idx on ab_a3_b3 ab_6 (actual rows=N loops=N) Index Cond: (a = a.a) -(27 rows) +(21 rows) select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 0)'); explain_parallel_append @@ -2204,19 +2186,7 @@ select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on Index Cond: (a = a.a) -> Index Scan using ab_a1_b3_a_idx on ab_a1_b3 ab_3 (actual rows=N loops=N) Index Cond: (a = a.a) - -> Index Scan using ab_a2_b1_a_idx on ab_a2_b1 ab_4 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a2_b2_a_idx on ab_a2_b2 ab_5 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a2_b3_a_idx on ab_a2_b3 ab_6 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a3_b1_a_idx on ab_a3_b1 ab_7 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a3_b2_a_idx on ab_a3_b2 ab_8 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a3_b3_a_idx on ab_a3_b3 ab_9 (never executed) - Index Cond: (a = a.a) -(28 rows) +(16 rows) delete from lprt_a where a = 1; select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on ab.a = a.a where a.a in(1, 0, 0)'); @@ -2238,19 +2208,7 @@ select explain_parallel_append('select avg(ab.a) from ab inner join lprt_a a on Index Cond: (a = a.a) -> Index Scan using ab_a1_b3_a_idx on ab_a1_b3 ab_3 (never executed) Index Cond: (a = a.a) - -> Index Scan using ab_a2_b1_a_idx on ab_a2_b1 ab_4 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a2_b2_a_idx on ab_a2_b2 ab_5 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a2_b3_a_idx on ab_a2_b3 ab_6 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a3_b1_a_idx on ab_a3_b1 ab_7 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a3_b2_a_idx on ab_a3_b2 ab_8 (never executed) - Index Cond: (a = a.a) - -> Index Scan using ab_a3_b3_a_idx on ab_a3_b3 ab_9 (never executed) - Index Cond: (a = a.a) -(28 rows) +(16 rows) reset enable_hashjoin; reset enable_mergejoin; diff --git a/src/test/regress/sql/equivclass.sql b/src/test/regress/sql/equivclass.sql index 247b0a31055..dae83c41965 100644 --- a/src/test/regress/sql/equivclass.sql +++ b/src/test/regress/sql/equivclass.sql @@ -269,3 +269,15 @@ create temp view overview as select f1::information_schema.sql_identifier as sqli, f2 from undername; explain (costs off) -- this should not require a sort select * from overview where sqli = 'foo' order by sqli; + + +-- test equivalence filters +explain (costs off) + select * from ec0 + inner join ec1 on ec0.ff = ec1.ff + where ec0.ff between 1 and 10; + +explain (costs off) + select * from ec0 + inner join ec1 on ec0.ff = ec1.ff + where ec1.ff between 1 and 10; -- 2.21.0