Re: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint
On Mon, Feb 1, 2021 at 12:09 PM Masahiko Sawada wrote: > > Hi, > > On Fri, Nov 13, 2020 at 5:14 AM Tom Lane wrote: > > > > I started looking through this patch. I really quite dislike solving > > this via a kluge in indxpath.c. There are multiple disadvantages > > to that: > > > > * It only helps for the very specific problem of redundant bitmap > > index scans, whereas the problem of applying redundant qual checks > > in partition scans seems pretty general. > > > > * It's not unlikely that this will end up trying to make the same > > proof multiple times (and the lack of any way to turn that off, > > through constraint_exclusion or some other knob, isn't too cool). > > > > * It does nothing to fix rowcount estimates in the light of the > > knowledge that some of the restriction clauses are no-ops. Now, > > if we have up-to-date stats we'll probably manage to come out with > > an appropriate 0 or 1 selectivity anyway, but we might not have those. > > In any case, spending significant effort to estimate a selectivity > > when some other part of the code has taken the trouble to *prove* the > > clause true or false seems very undesirable. > > > > * I'm not even convinced that the logic is correct, specifically that > > it's okay to just "continue" if we refute the OR clause. That seems > > likely to break generate_bitmap_or_paths' surrounding loop logic about > > "We must be able to match at least one index to each of the arms of > > the OR". At least, if that still works it requires more than zero > > commentary about why. > > > > > > So I like much better the idea of Konstantin's old patch, that we modify > > the rel's baserestrictinfo list by removing quals that we can prove > > true. We could extend that to solve the bitmapscan problem by removing > > OR arms that we can prove false. So I started to review that patch more > > carefully, and after awhile realized that it has a really fundamental > > problem: it is trying to use CHECK predicates to prove WHERE clauses. > > But we don't know that CHECK predicates are true, only that they are > > not-false, and there is no proof mode in predtest.c that will allow > > proving some clauses true based only on other ones being not-false. > > > > We can salvage something by restricting the input quals to be only > > partition quals, since those are built to be guaranteed-true-or-false; > > we can assume they don't yield NULL. There's a hole in that for > > hashing, as I noted elsewhere, but we'll fail to prove anything anyway > > from a satisfies_hash_partition() qual. (In principle we could also use > > attnotnull quals, which also have that property. But I'm dubious that > > that will help often enough to be worth the extra cycles for predtest.c > > to process them.) > > > > So after a bit of coding I had the attached. This follows Konstantin's > > original patch in letting relation_excluded_by_constraints() change > > the baserestrictinfo list. I read the comments in the older thread > > about people not liking that, and I can see the point. But I'm not > > convinced that the later iterations of the patch were an improvement, > > because (a) the call locations for > > remove_restrictions_implied_by_constraints() seemed pretty random, and > > (b) it seems necessary to have relation_excluded_by_constraints() and > > remove_restrictions_implied_by_constraints() pretty much in bed with > > each other if we don't want to duplicate constraint-fetching work. > > Note the comment on get_relation_constraints() that it's called at > > most once per relation; that's not something I particularly desire > > to give up, because a relcache open isn't terribly cheap. Also > > (c) I think it's important that there be a way to suppress this > > overhead when it's not useful. In the patch as attached, turning off > > constraint_exclusion does that since relation_excluded_by_constraints() > > falls out before getting to the new code. If we make > > remove_restrictions_implied_by_constraints() independent then it > > will need some possibly-quite-duplicative logic to check > > constraint_exclusion. (Of course, if we'd rather condition this > > on some other GUC then that argument falls down. But I think we > > need something.) So, I'm not dead set on this code structure, but > > I haven't seen one I like better. > > > > Anyway, this seems to work, and if the regression test changes are > > any guide then it may fire often enough in the real world to be useful. > > Nonetheless, I'm concerned about performance, because predtest.c is a > > pretty expensive thing and there will be a lot of cases where the work > > is useless. I did a quick check using pgbench's option to partition > > the tables, and observed that the -S (select only) test case seemed to > > get about 2.5% slower with the patch than without. That's not far > > outside the noise floor, so maybe it's not real, but if it is real then > > it seems pretty disastrous. Perhaps we could avoid that
Re: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint
Hi, On Fri, Nov 13, 2020 at 5:14 AM Tom Lane wrote: > > I started looking through this patch. I really quite dislike solving > this via a kluge in indxpath.c. There are multiple disadvantages > to that: > > * It only helps for the very specific problem of redundant bitmap > index scans, whereas the problem of applying redundant qual checks > in partition scans seems pretty general. > > * It's not unlikely that this will end up trying to make the same > proof multiple times (and the lack of any way to turn that off, > through constraint_exclusion or some other knob, isn't too cool). > > * It does nothing to fix rowcount estimates in the light of the > knowledge that some of the restriction clauses are no-ops. Now, > if we have up-to-date stats we'll probably manage to come out with > an appropriate 0 or 1 selectivity anyway, but we might not have those. > In any case, spending significant effort to estimate a selectivity > when some other part of the code has taken the trouble to *prove* the > clause true or false seems very undesirable. > > * I'm not even convinced that the logic is correct, specifically that > it's okay to just "continue" if we refute the OR clause. That seems > likely to break generate_bitmap_or_paths' surrounding loop logic about > "We must be able to match at least one index to each of the arms of > the OR". At least, if that still works it requires more than zero > commentary about why. > > > So I like much better the idea of Konstantin's old patch, that we modify > the rel's baserestrictinfo list by removing quals that we can prove > true. We could extend that to solve the bitmapscan problem by removing > OR arms that we can prove false. So I started to review that patch more > carefully, and after awhile realized that it has a really fundamental > problem: it is trying to use CHECK predicates to prove WHERE clauses. > But we don't know that CHECK predicates are true, only that they are > not-false, and there is no proof mode in predtest.c that will allow > proving some clauses true based only on other ones being not-false. > > We can salvage something by restricting the input quals to be only > partition quals, since those are built to be guaranteed-true-or-false; > we can assume they don't yield NULL. There's a hole in that for > hashing, as I noted elsewhere, but we'll fail to prove anything anyway > from a satisfies_hash_partition() qual. (In principle we could also use > attnotnull quals, which also have that property. But I'm dubious that > that will help often enough to be worth the extra cycles for predtest.c > to process them.) > > So after a bit of coding I had the attached. This follows Konstantin's > original patch in letting relation_excluded_by_constraints() change > the baserestrictinfo list. I read the comments in the older thread > about people not liking that, and I can see the point. But I'm not > convinced that the later iterations of the patch were an improvement, > because (a) the call locations for > remove_restrictions_implied_by_constraints() seemed pretty random, and > (b) it seems necessary to have relation_excluded_by_constraints() and > remove_restrictions_implied_by_constraints() pretty much in bed with > each other if we don't want to duplicate constraint-fetching work. > Note the comment on get_relation_constraints() that it's called at > most once per relation; that's not something I particularly desire > to give up, because a relcache open isn't terribly cheap. Also > (c) I think it's important that there be a way to suppress this > overhead when it's not useful. In the patch as attached, turning off > constraint_exclusion does that since relation_excluded_by_constraints() > falls out before getting to the new code. If we make > remove_restrictions_implied_by_constraints() independent then it > will need some possibly-quite-duplicative logic to check > constraint_exclusion. (Of course, if we'd rather condition this > on some other GUC then that argument falls down. But I think we > need something.) So, I'm not dead set on this code structure, but > I haven't seen one I like better. > > Anyway, this seems to work, and if the regression test changes are > any guide then it may fire often enough in the real world to be useful. > Nonetheless, I'm concerned about performance, because predtest.c is a > pretty expensive thing and there will be a lot of cases where the work > is useless. I did a quick check using pgbench's option to partition > the tables, and observed that the -S (select only) test case seemed to > get about 2.5% slower with the patch than without. That's not far > outside the noise floor, so maybe it's not real, but if it is real then > it seems pretty disastrous. Perhaps we could avoid that problem by > removing the "predicate_implied_by" cases and only trying the > "predicate_refuted_by" case, so that no significant time is added > unless you've got an OR restriction clause on a partitioned table. > That seems
Re: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint
I started looking through this patch. I really quite dislike solving this via a kluge in indxpath.c. There are multiple disadvantages to that: * It only helps for the very specific problem of redundant bitmap index scans, whereas the problem of applying redundant qual checks in partition scans seems pretty general. * It's not unlikely that this will end up trying to make the same proof multiple times (and the lack of any way to turn that off, through constraint_exclusion or some other knob, isn't too cool). * It does nothing to fix rowcount estimates in the light of the knowledge that some of the restriction clauses are no-ops. Now, if we have up-to-date stats we'll probably manage to come out with an appropriate 0 or 1 selectivity anyway, but we might not have those. In any case, spending significant effort to estimate a selectivity when some other part of the code has taken the trouble to *prove* the clause true or false seems very undesirable. * I'm not even convinced that the logic is correct, specifically that it's okay to just "continue" if we refute the OR clause. That seems likely to break generate_bitmap_or_paths' surrounding loop logic about "We must be able to match at least one index to each of the arms of the OR". At least, if that still works it requires more than zero commentary about why. So I like much better the idea of Konstantin's old patch, that we modify the rel's baserestrictinfo list by removing quals that we can prove true. We could extend that to solve the bitmapscan problem by removing OR arms that we can prove false. So I started to review that patch more carefully, and after awhile realized that it has a really fundamental problem: it is trying to use CHECK predicates to prove WHERE clauses. But we don't know that CHECK predicates are true, only that they are not-false, and there is no proof mode in predtest.c that will allow proving some clauses true based only on other ones being not-false. We can salvage something by restricting the input quals to be only partition quals, since those are built to be guaranteed-true-or-false; we can assume they don't yield NULL. There's a hole in that for hashing, as I noted elsewhere, but we'll fail to prove anything anyway from a satisfies_hash_partition() qual. (In principle we could also use attnotnull quals, which also have that property. But I'm dubious that that will help often enough to be worth the extra cycles for predtest.c to process them.) So after a bit of coding I had the attached. This follows Konstantin's original patch in letting relation_excluded_by_constraints() change the baserestrictinfo list. I read the comments in the older thread about people not liking that, and I can see the point. But I'm not convinced that the later iterations of the patch were an improvement, because (a) the call locations for remove_restrictions_implied_by_constraints() seemed pretty random, and (b) it seems necessary to have relation_excluded_by_constraints() and remove_restrictions_implied_by_constraints() pretty much in bed with each other if we don't want to duplicate constraint-fetching work. Note the comment on get_relation_constraints() that it's called at most once per relation; that's not something I particularly desire to give up, because a relcache open isn't terribly cheap. Also (c) I think it's important that there be a way to suppress this overhead when it's not useful. In the patch as attached, turning off constraint_exclusion does that since relation_excluded_by_constraints() falls out before getting to the new code. If we make remove_restrictions_implied_by_constraints() independent then it will need some possibly-quite-duplicative logic to check constraint_exclusion. (Of course, if we'd rather condition this on some other GUC then that argument falls down. But I think we need something.) So, I'm not dead set on this code structure, but I haven't seen one I like better. Anyway, this seems to work, and if the regression test changes are any guide then it may fire often enough in the real world to be useful. Nonetheless, I'm concerned about performance, because predtest.c is a pretty expensive thing and there will be a lot of cases where the work is useless. I did a quick check using pgbench's option to partition the tables, and observed that the -S (select only) test case seemed to get about 2.5% slower with the patch than without. That's not far outside the noise floor, so maybe it's not real, but if it is real then it seems pretty disastrous. Perhaps we could avoid that problem by removing the "predicate_implied_by" cases and only trying the "predicate_refuted_by" case, so that no significant time is added unless you've got an OR restriction clause on a partitioned table. That seems like it'd lose a lot of the benefit though :-(. So I'm not sure where to go from here. Thoughts? Anyone else care to run some performance tests? regards, tom lane diff --git
Re: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation:not tested I think that work on improving operator_predicate_proof should really be done in separate patch. And this minimal patch is doing it's work well. The new status of this patch is: Ready for Committer
Re: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint
On Wed, Sep 30, 2020 at 04:52:02PM -0700, Soumyadeep Chakraborty wrote: > Hi Justin, > > Attached is a minimal patch that is rebased and removes the > dependency on Konstantin's earlier patch[1], making it enough to remove > the extraneous index scans as Justin brought up. Is this the direction we > want to head in? Yes, thanks for doing that. I hadn't dug into it yet to figure out what to put where to separate the patches. It seems like my patch handles a different goal than Konstantin's, but they both depend on having the constraints populated. -- Justin
Re: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint
Hi Justin, Attached is a minimal patch that is rebased and removes the dependency on Konstantin's earlier patch[1], making it enough to remove the extraneous index scans as Justin brought up. Is this the direction we want to head in? Tagging Konstantin in the email to hear his input on his old patch. Since Justin's attached patch [1] does not include the work that was done on the operator_predicate_proof() and as discussed here in [2], that work is important to see real benefits? Just wanted to check before reviewing [1]. Regards, Soumyadeep (VMware) [1] https://www.postgresql.org/message-id/attachment/112074/0001-Secondary-index-access-optimizations.patch [2] https://www.postgresql.org/message-id/5A006016.1010009%40postgrespro.ru From c60d00a64c7b65d785cda1aad7e780388ab32e27 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Tue, 29 Sep 2020 17:09:45 -0700 Subject: [PATCH v3 1/1] Avoid index scan inconsistent with partition constraint --- src/backend/optimizer/path/allpaths.c | 7 ++ src/backend/optimizer/path/indxpath.c | 5 + src/backend/optimizer/util/plancat.c | 7 +- src/include/optimizer/plancat.h| 6 ++ src/test/regress/expected/create_index.out | 25 ++ src/test/regress/sql/create_index.sql | 10 + 6 files changed, 54 insertions(+), 6 deletions(-) diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index b399592ff815..a6a643b9091f 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -1045,6 +1045,13 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel, continue; } + /* + * Ensure that rel->partition_qual is set, so we can use the information + * to eliminate unnecessary index scans. + */ + if(childRTE->relid != InvalidOid) + get_relation_constraints(root, childRTE->relid, childrel, false, false, true); + /* * Constraint exclusion failed, so copy the parent's join quals and * targetlist to the child, with appropriate variable substitutions. diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c index bcb1bc6097d0..0532b3ddd0d6 100644 --- a/src/backend/optimizer/path/indxpath.c +++ b/src/backend/optimizer/path/indxpath.c @@ -1305,6 +1305,11 @@ generate_bitmap_or_paths(PlannerInfo *root, RelOptInfo *rel, Assert(!restriction_is_or_clause(rinfo)); orargs = list_make1(rinfo); +/* Avoid scanning indexes using a scan condition which is + * inconsistent with the partition constraint */ +if (predicate_refuted_by(rel->partition_qual, orargs, false)) + continue; + indlist = build_paths_for_OR(root, rel, orargs, all_clauses); diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c index f9d0d67aa75a..1e0bac471ff3 100644 --- a/src/backend/optimizer/util/plancat.c +++ b/src/backend/optimizer/util/plancat.c @@ -64,11 +64,6 @@ static void get_relation_foreign_keys(PlannerInfo *root, RelOptInfo *rel, Relation relation, bool inhparent); static bool infer_collation_opclass_match(InferenceElem *elem, Relation idxRel, List *idxExprs); -static List *get_relation_constraints(PlannerInfo *root, - Oid relationObjectId, RelOptInfo *rel, - bool include_noinherit, - bool include_notnull, - bool include_partition); static List *build_index_tlist(PlannerInfo *root, IndexOptInfo *index, Relation heapRelation); static List *get_relation_statistics(RelOptInfo *rel, Relation relation); @@ -1165,7 +1160,7 @@ get_relation_data_width(Oid relid, int32 *attr_widths) * run, and in many cases it won't be invoked at all, so there seems no * point in caching the data in RelOptInfo. */ -static List * +List * get_relation_constraints(PlannerInfo *root, Oid relationObjectId, RelOptInfo *rel, bool include_noinherit, diff --git a/src/include/optimizer/plancat.h b/src/include/optimizer/plancat.h index c29a7091ec04..cadfee15a34a 100644 --- a/src/include/optimizer/plancat.h +++ b/src/include/optimizer/plancat.h @@ -39,6 +39,12 @@ extern int32 get_relation_data_width(Oid relid, int32 *attr_widths); extern bool relation_excluded_by_constraints(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte); +extern List *get_relation_constraints(PlannerInfo *root, + Oid relationObjectId, RelOptInfo *rel, + bool include_noinherit, + bool include_notnull, + bool include_partition); + extern List *build_physical_tlist(PlannerInfo *root, RelOptInfo *rel); extern bool has_unique_index(RelOptInfo *rel, AttrNumber attno); diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out index 6ace7662ee1f..3d67978232d3 100644 --- a/src/test/regress/expected/create_index.out +++ b/src/test/regress/expected/create_index.out @@ -1843,6
Re: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint
Rebased and updated for tests added in 13838740f. -- Justin Pryzby System Administrator Telsasoft +1-952-707-8581 >From 9272dda812d2b959d0bd536e0679f8f8527da7b1 Mon Sep 17 00:00:00 2001 From: Konstantin Knizhnik Date: Fri, 12 Oct 2018 15:53:51 +0300 Subject: [PATCH v3 1/2] Secondary index access optimizations --- .../postgres_fdw/expected/postgres_fdw.out| 8 +- contrib/postgres_fdw/sql/postgres_fdw.sql | 2 +- src/backend/optimizer/path/allpaths.c | 2 + src/backend/optimizer/util/plancat.c | 45 ++ src/include/optimizer/plancat.h | 3 + src/test/regress/expected/create_table.out| 14 +- src/test/regress/expected/inherit.out | 123 ++-- .../regress/expected/partition_aggregate.out | 10 +- src/test/regress/expected/partition_join.out | 42 +- src/test/regress/expected/partition_prune.out | 587 ++ src/test/regress/expected/rowsecurity.out | 12 +- src/test/regress/expected/update.out | 4 +- 12 files changed, 322 insertions(+), 530 deletions(-) diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index 90db550b92..dbbae1820e 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -629,12 +629,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL)) (3 rows) -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest - QUERY PLAN -- +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest +QUERY PLAN +-- Foreign Scan on public.ft1 t1 Output: c1, c2, c3, c4, c5, c6, c7, c8 - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL)) + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL)) (3 rows) EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql index 83971665e3..08aef9289e 100644 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -304,7 +304,7 @@ RESET enable_nestloop; EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1; -- OpExpr(l) EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!; -- OpExpr(r) diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 6da0dcd61c..a9171c075c 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -387,6 +387,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel, switch (rel->rtekind) { case RTE_RELATION: +remove_restrictions_implied_by_constraints(root, rel, rte); if (rte->relkind == RELKIND_FOREIGN_TABLE) { /* Foreign table */ @@ -1040,6 +1041,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel, set_dummy_rel_pathlist(childrel); continue; } + remove_restrictions_implied_by_constraints(root, childrel, childRTE); /* * Constraint exclusion failed, so copy the parent's join quals and diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c index 25545029d7..45cd72a0fe 100644 --- a/src/backend/optimizer/util/plancat.c +++ b/src/backend/optimizer/util/plancat.c @@ -1557,6 +1557,51 @@ relation_excluded_by_constraints(PlannerInfo *root, return false; } +/* + * Remove from restrictions list items implied by table constraints + */ +void remove_restrictions_implied_by_constraints(PlannerInfo *root, +RelOptInfo *rel, RangeTblEntry *rte) +{ + List *constraint_pred; + List *safe_constraints = NIL; + List *safe_restrictions = NIL; + ListCell *lc; + + if (rte->rtekind != RTE_RELATION || rte->inh) + return; + + /* + *
Re: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint
Added here: https://commitfest.postgresql.org/29/2644/ And updated tests to pass following: |commit 689696c7110f148ede8004aae50d7543d05b5587 |Author: Tom Lane |Date: Tue Jul 14 18:56:49 2020 -0400 | |Fix bitmap AND/OR scans on the inside of a nestloop partition-wise join. >From 5c323ffaa8c76e91fff6c9c77019c6d6ad4c627c Mon Sep 17 00:00:00 2001 From: Konstantin Knizhnik Date: Fri, 12 Oct 2018 15:53:51 +0300 Subject: [PATCH v2 1/2] Secondary index access optimizations --- .../postgres_fdw/expected/postgres_fdw.out| 8 +- contrib/postgres_fdw/sql/postgres_fdw.sql | 2 +- src/backend/optimizer/path/allpaths.c | 2 + src/backend/optimizer/util/plancat.c | 45 ++ src/include/optimizer/plancat.h | 3 + src/test/regress/expected/create_table.out| 14 +- src/test/regress/expected/inherit.out | 123 ++-- .../regress/expected/partition_aggregate.out | 10 +- src/test/regress/expected/partition_join.out | 42 +- src/test/regress/expected/partition_prune.out | 571 ++ src/test/regress/expected/rowsecurity.out | 12 +- src/test/regress/expected/update.out | 4 +- 12 files changed, 315 insertions(+), 521 deletions(-) diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index 90db550b92..dbbae1820e 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -629,12 +629,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL)) (3 rows) -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest - QUERY PLAN -- +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest +QUERY PLAN +-- Foreign Scan on public.ft1 t1 Output: c1, c2, c3, c4, c5, c6, c7, c8 - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL)) + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL)) (3 rows) EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql index 83971665e3..08aef9289e 100644 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -304,7 +304,7 @@ RESET enable_nestloop; EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1; -- OpExpr(l) EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!; -- OpExpr(r) diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 6da0dcd61c..a9171c075c 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -387,6 +387,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel, switch (rel->rtekind) { case RTE_RELATION: +remove_restrictions_implied_by_constraints(root, rel, rte); if (rte->relkind == RELKIND_FOREIGN_TABLE) { /* Foreign table */ @@ -1040,6 +1041,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel, set_dummy_rel_pathlist(childrel); continue; } + remove_restrictions_implied_by_constraints(root, childrel, childRTE); /* * Constraint exclusion failed, so copy the parent's join quals and diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c index 25545029d7..45cd72a0fe 100644 --- a/src/backend/optimizer/util/plancat.c +++ b/src/backend/optimizer/util/plancat.c @@ -1557,6 +1557,51 @@ relation_excluded_by_constraints(PlannerInfo *root, return false; } +/* + * Remove from restrictions list items implied by table constraints + */ +void remove_restrictions_implied_by_constraints(PlannerInfo *root, +RelOptInfo *rel, RangeTblEntry *rte) +{ + List
avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint
(resending to the list) Hi All I started looking into Konstantin's 30 month old thread/patch: |Re: [HACKERS] Secondary index access optimizations https://www.postgresql.org/message-id/27516421-5afa-203c-e22a-8407e9187327%40postgrespro.ru ..to which David directed me 12 months ago: |Subject: Re: scans on table fail to be excluded by partition bounds https://www.postgresql.org/message-id/CAKJS1f_iOmCW11dFzifpDGUgSLAoSTDOjw2tcec%3D7Cgq%2BsR80Q%40mail.gmail.com My complaint at the time was for a query plan like: CREATE TABLE p (i int, j int) PARTITION BY RANGE(i); SELECT format('CREATE TABLE p%s PARTITION OF p FOR VALUES FROM(%s)TO(%s)', i, 10*(i-1), 10*i) FROM generate_series(1,10)i; \gexec INSERT INTO p SELECT i%99, i%9 FROM generate_series(1,9)i; VACUUM ANALYZE p; CREATE INDEX ON p(i); CREATE INDEX ON p(j); postgres=# explain analyze SELECT * FROM p WHERE (i=10 OR i=20 OR i=30) AND j<2; Append (cost=28.51..283.25 rows=546 width=12) (actual time=0.100..1.364 rows=546 loops=1) -> Bitmap Heap Scan on p2 (cost=28.51..93.51 rows=182 width=12) (actual time=0.099..0.452 rows=182 loops=1) Recheck Cond: ((i = 10) OR (i = 20) OR (i = 30)) Filter: (j < 2) Rows Removed by Filter: 818 Heap Blocks: exact=45 -> BitmapOr (cost=28.51..28.51 rows=1000 width=0) (actual time=0.083..0.083 rows=0 loops=1) -> Bitmap Index Scan on p2_i_idx (cost=0.00..19.79 rows=1000 width=0) (actual time=0.074..0.074 rows=1000 loops=1) Index Cond: (i = 10) -> Bitmap Index Scan on p2_i_idx (cost=0.00..4.29 rows=1 width=0) (actual time=0.008..0.008 rows=0 loops=1) Index Cond: (i = 20) -> Bitmap Index Scan on p2_i_idx (cost=0.00..4.29 rows=1 width=0) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (i = 30) ... This 2nd and 3rd index scan on p2_i_idx are useless, and benign here, but harmful if we have a large OR list. I tried rebasing Konstantin's patch, but it didn't handle the case of "refuting" inconsistent arms of an "OR" list, so I came up with this. This currently depends on the earlier patch only to call RelationGetPartitionQual, so appears to be mostly a separate issue. I believe the current behavior of "OR" lists is also causing another issue I reported, which a customer hit again last week: https://www.postgresql.org/message-id/20191216184906.ga2...@telsasoft.com |ERROR: could not resize shared memory segment...No space left on device When I looked into it, their explain(format text) was 50MB, due to a list of ~500 "OR" conditions, *each* of which was causing an index scan for each of ~500 partitions, where only one index scan per partition was needed or useful, all the others being inconsistent with the partition constraint. Thus the query ultimately errors when it exceeds a resource limit (maybe no surprise with 8500 index scans). Here, I was trying to create a test case reproducing that error to see if this resolves it, but so far hasn't failed. Tomas has a reproducer with a different (much simpler) plan, though. CREATE TABLE p (i int, j int) PARTITION BY RANGE(i); \pset pager off SELECT format('CREATE TABLE p%s PARTITION OF p FOR VALUES FROM(%s)TO(%s)', i, 10*(i-1), 10*i) FROM generate_series(1,500)i; \timing off \set quiet \set echo always \gexec \timing on INSERT INTO p SELECT i%5000, i%500 FROM generate_series(1,999)i; VACUUM ANALYZE p; CREATE INDEX ON p(i); CREATE INDEX ON p(j); SELECT format('explain analyze SELECT * FROM p WHERE %s', array_to_string(array_agg('i='||(i*10)::text),' OR ')) FROM generate_series(1,500)i; -- Justin >From bd1f9f93f9bec0585a79bcdf932d1a5e2c0001a0 Mon Sep 17 00:00:00 2001 From: Konstantin Knizhnik Date: Fri, 12 Oct 2018 15:53:51 +0300 Subject: [PATCH 1/2] Secondary index access optimizations --- .../postgres_fdw/expected/postgres_fdw.out| 8 +- contrib/postgres_fdw/sql/postgres_fdw.sql | 2 +- src/backend/optimizer/path/allpaths.c | 2 + src/backend/optimizer/util/plancat.c | 45 ++ src/include/optimizer/plancat.h | 3 + src/test/regress/expected/create_table.out| 14 +- src/test/regress/expected/inherit.out | 123 ++-- .../regress/expected/partition_aggregate.out | 10 +- src/test/regress/expected/partition_join.out | 38 +- src/test/regress/expected/partition_prune.out | 571 ++ src/test/regress/expected/rowsecurity.out | 12 +- src/test/regress/expected/update.out | 4 +- 12 files changed, 313 insertions(+), 519 deletions(-) diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index 82fc1290ef..7ab8639dc7 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -629,12 +629,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu Remote SQL: SELECT "C 1", c2, c3,