Re: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint

2021-02-01 Thread Masahiko Sawada
On Mon, Feb 1, 2021 at 12:09 PM Masahiko Sawada  wrote:
>
> Hi,
>
> On Fri, Nov 13, 2020 at 5:14 AM Tom Lane  wrote:
> >
> > I started looking through this patch.  I really quite dislike solving
> > this via a kluge in indxpath.c.  There are multiple disadvantages
> > to that:
> >
> > * It only helps for the very specific problem of redundant bitmap
> > index scans, whereas the problem of applying redundant qual checks
> > in partition scans seems pretty general.
> >
> > * It's not unlikely that this will end up trying to make the same
> > proof multiple times (and the lack of any way to turn that off,
> > through constraint_exclusion or some other knob, isn't too cool).
> >
> > * It does nothing to fix rowcount estimates in the light of the
> > knowledge that some of the restriction clauses are no-ops.  Now,
> > if we have up-to-date stats we'll probably manage to come out with
> > an appropriate 0 or 1 selectivity anyway, but we might not have those.
> > In any case, spending significant effort to estimate a selectivity
> > when some other part of the code has taken the trouble to *prove* the
> > clause true or false seems very undesirable.
> >
> > * I'm not even convinced that the logic is correct, specifically that
> > it's okay to just "continue" if we refute the OR clause.  That seems
> > likely to break generate_bitmap_or_paths' surrounding loop logic about
> > "We must be able to match at least one index to each of the arms of
> > the OR".  At least, if that still works it requires more than zero
> > commentary about why.
> >
> >
> > So I like much better the idea of Konstantin's old patch, that we modify
> > the rel's baserestrictinfo list by removing quals that we can prove
> > true.  We could extend that to solve the bitmapscan problem by removing
> > OR arms that we can prove false.  So I started to review that patch more
> > carefully, and after awhile realized that it has a really fundamental
> > problem: it is trying to use CHECK predicates to prove WHERE clauses.
> > But we don't know that CHECK predicates are true, only that they are
> > not-false, and there is no proof mode in predtest.c that will allow
> > proving some clauses true based only on other ones being not-false.
> >
> > We can salvage something by restricting the input quals to be only
> > partition quals, since those are built to be guaranteed-true-or-false;
> > we can assume they don't yield NULL.  There's a hole in that for
> > hashing, as I noted elsewhere, but we'll fail to prove anything anyway
> > from a satisfies_hash_partition() qual.  (In principle we could also use
> > attnotnull quals, which also have that property.  But I'm dubious that
> > that will help often enough to be worth the extra cycles for predtest.c
> > to process them.)
> >
> > So after a bit of coding I had the attached.  This follows Konstantin's
> > original patch in letting relation_excluded_by_constraints() change
> > the baserestrictinfo list.  I read the comments in the older thread
> > about people not liking that, and I can see the point.  But I'm not
> > convinced that the later iterations of the patch were an improvement,
> > because (a) the call locations for
> > remove_restrictions_implied_by_constraints() seemed pretty random, and
> > (b) it seems necessary to have relation_excluded_by_constraints() and
> > remove_restrictions_implied_by_constraints() pretty much in bed with
> > each other if we don't want to duplicate constraint-fetching work.
> > Note the comment on get_relation_constraints() that it's called at
> > most once per relation; that's not something I particularly desire
> > to give up, because a relcache open isn't terribly cheap.  Also
> > (c) I think it's important that there be a way to suppress this
> > overhead when it's not useful.  In the patch as attached, turning off
> > constraint_exclusion does that since relation_excluded_by_constraints()
> > falls out before getting to the new code.  If we make
> > remove_restrictions_implied_by_constraints() independent then it
> > will need some possibly-quite-duplicative logic to check
> > constraint_exclusion.  (Of course, if we'd rather condition this
> > on some other GUC then that argument falls down.  But I think we
> > need something.)  So, I'm not dead set on this code structure, but
> > I haven't seen one I like better.
> >
> > Anyway, this seems to work, and if the regression test changes are
> > any guide then it may fire often enough in the real world to be useful.
> > Nonetheless, I'm concerned about performance, because predtest.c is a
> > pretty expensive thing and there will be a lot of cases where the work
> > is useless.  I did a quick check using pgbench's option to partition
> > the tables, and observed that the -S (select only) test case seemed to
> > get about 2.5% slower with the patch than without.  That's not far
> > outside the noise floor, so maybe it's not real, but if it is real then
> > it seems pretty disastrous.  Perhaps we could avoid that 

Re: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint

2021-01-31 Thread Masahiko Sawada
Hi,

On Fri, Nov 13, 2020 at 5:14 AM Tom Lane  wrote:
>
> I started looking through this patch.  I really quite dislike solving
> this via a kluge in indxpath.c.  There are multiple disadvantages
> to that:
>
> * It only helps for the very specific problem of redundant bitmap
> index scans, whereas the problem of applying redundant qual checks
> in partition scans seems pretty general.
>
> * It's not unlikely that this will end up trying to make the same
> proof multiple times (and the lack of any way to turn that off,
> through constraint_exclusion or some other knob, isn't too cool).
>
> * It does nothing to fix rowcount estimates in the light of the
> knowledge that some of the restriction clauses are no-ops.  Now,
> if we have up-to-date stats we'll probably manage to come out with
> an appropriate 0 or 1 selectivity anyway, but we might not have those.
> In any case, spending significant effort to estimate a selectivity
> when some other part of the code has taken the trouble to *prove* the
> clause true or false seems very undesirable.
>
> * I'm not even convinced that the logic is correct, specifically that
> it's okay to just "continue" if we refute the OR clause.  That seems
> likely to break generate_bitmap_or_paths' surrounding loop logic about
> "We must be able to match at least one index to each of the arms of
> the OR".  At least, if that still works it requires more than zero
> commentary about why.
>
>
> So I like much better the idea of Konstantin's old patch, that we modify
> the rel's baserestrictinfo list by removing quals that we can prove
> true.  We could extend that to solve the bitmapscan problem by removing
> OR arms that we can prove false.  So I started to review that patch more
> carefully, and after awhile realized that it has a really fundamental
> problem: it is trying to use CHECK predicates to prove WHERE clauses.
> But we don't know that CHECK predicates are true, only that they are
> not-false, and there is no proof mode in predtest.c that will allow
> proving some clauses true based only on other ones being not-false.
>
> We can salvage something by restricting the input quals to be only
> partition quals, since those are built to be guaranteed-true-or-false;
> we can assume they don't yield NULL.  There's a hole in that for
> hashing, as I noted elsewhere, but we'll fail to prove anything anyway
> from a satisfies_hash_partition() qual.  (In principle we could also use
> attnotnull quals, which also have that property.  But I'm dubious that
> that will help often enough to be worth the extra cycles for predtest.c
> to process them.)
>
> So after a bit of coding I had the attached.  This follows Konstantin's
> original patch in letting relation_excluded_by_constraints() change
> the baserestrictinfo list.  I read the comments in the older thread
> about people not liking that, and I can see the point.  But I'm not
> convinced that the later iterations of the patch were an improvement,
> because (a) the call locations for
> remove_restrictions_implied_by_constraints() seemed pretty random, and
> (b) it seems necessary to have relation_excluded_by_constraints() and
> remove_restrictions_implied_by_constraints() pretty much in bed with
> each other if we don't want to duplicate constraint-fetching work.
> Note the comment on get_relation_constraints() that it's called at
> most once per relation; that's not something I particularly desire
> to give up, because a relcache open isn't terribly cheap.  Also
> (c) I think it's important that there be a way to suppress this
> overhead when it's not useful.  In the patch as attached, turning off
> constraint_exclusion does that since relation_excluded_by_constraints()
> falls out before getting to the new code.  If we make
> remove_restrictions_implied_by_constraints() independent then it
> will need some possibly-quite-duplicative logic to check
> constraint_exclusion.  (Of course, if we'd rather condition this
> on some other GUC then that argument falls down.  But I think we
> need something.)  So, I'm not dead set on this code structure, but
> I haven't seen one I like better.
>
> Anyway, this seems to work, and if the regression test changes are
> any guide then it may fire often enough in the real world to be useful.
> Nonetheless, I'm concerned about performance, because predtest.c is a
> pretty expensive thing and there will be a lot of cases where the work
> is useless.  I did a quick check using pgbench's option to partition
> the tables, and observed that the -S (select only) test case seemed to
> get about 2.5% slower with the patch than without.  That's not far
> outside the noise floor, so maybe it's not real, but if it is real then
> it seems pretty disastrous.  Perhaps we could avoid that problem by
> removing the "predicate_implied_by" cases and only trying the
> "predicate_refuted_by" case, so that no significant time is added
> unless you've got an OR restriction clause on a partitioned table.
> That seems 

Re: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint

2020-11-12 Thread Tom Lane
I started looking through this patch.  I really quite dislike solving
this via a kluge in indxpath.c.  There are multiple disadvantages
to that:

* It only helps for the very specific problem of redundant bitmap
index scans, whereas the problem of applying redundant qual checks
in partition scans seems pretty general.

* It's not unlikely that this will end up trying to make the same
proof multiple times (and the lack of any way to turn that off,
through constraint_exclusion or some other knob, isn't too cool).

* It does nothing to fix rowcount estimates in the light of the
knowledge that some of the restriction clauses are no-ops.  Now,
if we have up-to-date stats we'll probably manage to come out with
an appropriate 0 or 1 selectivity anyway, but we might not have those.
In any case, spending significant effort to estimate a selectivity
when some other part of the code has taken the trouble to *prove* the
clause true or false seems very undesirable.

* I'm not even convinced that the logic is correct, specifically that
it's okay to just "continue" if we refute the OR clause.  That seems
likely to break generate_bitmap_or_paths' surrounding loop logic about
"We must be able to match at least one index to each of the arms of
the OR".  At least, if that still works it requires more than zero
commentary about why.


So I like much better the idea of Konstantin's old patch, that we modify
the rel's baserestrictinfo list by removing quals that we can prove
true.  We could extend that to solve the bitmapscan problem by removing
OR arms that we can prove false.  So I started to review that patch more
carefully, and after awhile realized that it has a really fundamental
problem: it is trying to use CHECK predicates to prove WHERE clauses.
But we don't know that CHECK predicates are true, only that they are
not-false, and there is no proof mode in predtest.c that will allow
proving some clauses true based only on other ones being not-false.

We can salvage something by restricting the input quals to be only
partition quals, since those are built to be guaranteed-true-or-false;
we can assume they don't yield NULL.  There's a hole in that for
hashing, as I noted elsewhere, but we'll fail to prove anything anyway
from a satisfies_hash_partition() qual.  (In principle we could also use
attnotnull quals, which also have that property.  But I'm dubious that
that will help often enough to be worth the extra cycles for predtest.c
to process them.)

So after a bit of coding I had the attached.  This follows Konstantin's
original patch in letting relation_excluded_by_constraints() change
the baserestrictinfo list.  I read the comments in the older thread
about people not liking that, and I can see the point.  But I'm not
convinced that the later iterations of the patch were an improvement,
because (a) the call locations for
remove_restrictions_implied_by_constraints() seemed pretty random, and
(b) it seems necessary to have relation_excluded_by_constraints() and
remove_restrictions_implied_by_constraints() pretty much in bed with
each other if we don't want to duplicate constraint-fetching work.
Note the comment on get_relation_constraints() that it's called at
most once per relation; that's not something I particularly desire
to give up, because a relcache open isn't terribly cheap.  Also
(c) I think it's important that there be a way to suppress this
overhead when it's not useful.  In the patch as attached, turning off
constraint_exclusion does that since relation_excluded_by_constraints()
falls out before getting to the new code.  If we make
remove_restrictions_implied_by_constraints() independent then it
will need some possibly-quite-duplicative logic to check
constraint_exclusion.  (Of course, if we'd rather condition this
on some other GUC then that argument falls down.  But I think we
need something.)  So, I'm not dead set on this code structure, but
I haven't seen one I like better.

Anyway, this seems to work, and if the regression test changes are
any guide then it may fire often enough in the real world to be useful.
Nonetheless, I'm concerned about performance, because predtest.c is a
pretty expensive thing and there will be a lot of cases where the work
is useless.  I did a quick check using pgbench's option to partition
the tables, and observed that the -S (select only) test case seemed to
get about 2.5% slower with the patch than without.  That's not far
outside the noise floor, so maybe it's not real, but if it is real then
it seems pretty disastrous.  Perhaps we could avoid that problem by
removing the "predicate_implied_by" cases and only trying the
"predicate_refuted_by" case, so that no significant time is added
unless you've got an OR restriction clause on a partitioned table.
That seems like it'd lose a lot of the benefit though :-(.

So I'm not sure where to go from here.  Thoughts?  Anyone else
care to run some performance tests?

regards, tom lane

diff --git 

Re: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint

2020-11-11 Thread Konstantin Knizhnik
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:not tested

I think that work on improving operator_predicate_proof should really be done 
in separate patch.
And this minimal patch is doing it's work well.

The new status of this patch is: Ready for Committer


Re: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint

2020-10-13 Thread Justin Pryzby
On Wed, Sep 30, 2020 at 04:52:02PM -0700, Soumyadeep Chakraborty wrote:
> Hi Justin,
> 
> Attached is a minimal patch that is rebased and removes the
> dependency on Konstantin's earlier patch[1], making it enough to remove
> the extraneous index scans as Justin brought up. Is this the direction we
> want to head in?

Yes, thanks for doing that.  I hadn't dug into it yet to figure out what to put
where to separate the patches.  It seems like my patch handles a different goal
than Konstantin's, but they both depend on having the constraints populated.

-- 
Justin




Re: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint

2020-09-30 Thread Soumyadeep Chakraborty
Hi Justin,

Attached is a minimal patch that is rebased and removes the
dependency on Konstantin's earlier patch[1], making it enough to remove
the extraneous index scans as Justin brought up. Is this the direction we
want to head in?

Tagging Konstantin in the email to hear his input on his old patch.
Since Justin's attached patch [1] does not include the work that was done
on the operator_predicate_proof() and as discussed here in [2], that
work is important to see real benefits? Just wanted to check before
reviewing [1].

Regards,
Soumyadeep (VMware)

[1] 
https://www.postgresql.org/message-id/attachment/112074/0001-Secondary-index-access-optimizations.patch
[2] https://www.postgresql.org/message-id/5A006016.1010009%40postgrespro.ru
From c60d00a64c7b65d785cda1aad7e780388ab32e27 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 29 Sep 2020 17:09:45 -0700
Subject: [PATCH v3 1/1] Avoid index scan inconsistent with partition
 constraint

---
 src/backend/optimizer/path/allpaths.c  |  7 ++
 src/backend/optimizer/path/indxpath.c  |  5 +
 src/backend/optimizer/util/plancat.c   |  7 +-
 src/include/optimizer/plancat.h|  6 ++
 src/test/regress/expected/create_index.out | 25 ++
 src/test/regress/sql/create_index.sql  | 10 +
 6 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index b399592ff815..a6a643b9091f 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -1045,6 +1045,13 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 			continue;
 		}
 
+		/*
+		 * Ensure that rel->partition_qual is set, so we can use the information
+		 * to eliminate unnecessary index scans.
+		 */
+		if(childRTE->relid != InvalidOid)
+			get_relation_constraints(root, childRTE->relid, childrel, false, false, true);
+
 		/*
 		 * Constraint exclusion failed, so copy the parent's join quals and
 		 * targetlist to the child, with appropriate variable substitutions.
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index bcb1bc6097d0..0532b3ddd0d6 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -1305,6 +1305,11 @@ generate_bitmap_or_paths(PlannerInfo *root, RelOptInfo *rel,
 Assert(!restriction_is_or_clause(rinfo));
 orargs = list_make1(rinfo);
 
+/* Avoid scanning indexes using a scan condition which is
+ * inconsistent with the partition constraint */
+if (predicate_refuted_by(rel->partition_qual, orargs, false))
+	continue;
+
 indlist = build_paths_for_OR(root, rel,
 			 orargs,
 			 all_clauses);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index f9d0d67aa75a..1e0bac471ff3 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -64,11 +64,6 @@ static void get_relation_foreign_keys(PlannerInfo *root, RelOptInfo *rel,
 	  Relation relation, bool inhparent);
 static bool infer_collation_opclass_match(InferenceElem *elem, Relation idxRel,
 		  List *idxExprs);
-static List *get_relation_constraints(PlannerInfo *root,
-	  Oid relationObjectId, RelOptInfo *rel,
-	  bool include_noinherit,
-	  bool include_notnull,
-	  bool include_partition);
 static List *build_index_tlist(PlannerInfo *root, IndexOptInfo *index,
 			   Relation heapRelation);
 static List *get_relation_statistics(RelOptInfo *rel, Relation relation);
@@ -1165,7 +1160,7 @@ get_relation_data_width(Oid relid, int32 *attr_widths)
  * run, and in many cases it won't be invoked at all, so there seems no
  * point in caching the data in RelOptInfo.
  */
-static List *
+List *
 get_relation_constraints(PlannerInfo *root,
 		 Oid relationObjectId, RelOptInfo *rel,
 		 bool include_noinherit,
diff --git a/src/include/optimizer/plancat.h b/src/include/optimizer/plancat.h
index c29a7091ec04..cadfee15a34a 100644
--- a/src/include/optimizer/plancat.h
+++ b/src/include/optimizer/plancat.h
@@ -39,6 +39,12 @@ extern int32 get_relation_data_width(Oid relid, int32 *attr_widths);
 extern bool relation_excluded_by_constraints(PlannerInfo *root,
 			 RelOptInfo *rel, RangeTblEntry *rte);
 
+extern List *get_relation_constraints(PlannerInfo *root,
+	  Oid relationObjectId, RelOptInfo *rel,
+	  bool include_noinherit,
+	  bool include_notnull,
+	  bool include_partition);
+
 extern List *build_physical_tlist(PlannerInfo *root, RelOptInfo *rel);
 
 extern bool has_unique_index(RelOptInfo *rel, AttrNumber attno);
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 6ace7662ee1f..3d67978232d3 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -1843,6 

Re: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint

2020-08-03 Thread Justin Pryzby
Rebased and updated for tests added in 13838740f.

-- 
Justin Pryzby
System Administrator
Telsasoft
+1-952-707-8581
>From 9272dda812d2b959d0bd536e0679f8f8527da7b1 Mon Sep 17 00:00:00 2001
From: Konstantin Knizhnik 
Date: Fri, 12 Oct 2018 15:53:51 +0300
Subject: [PATCH v3 1/2] Secondary index access optimizations

---
 .../postgres_fdw/expected/postgres_fdw.out|   8 +-
 contrib/postgres_fdw/sql/postgres_fdw.sql |   2 +-
 src/backend/optimizer/path/allpaths.c |   2 +
 src/backend/optimizer/util/plancat.c  |  45 ++
 src/include/optimizer/plancat.h   |   3 +
 src/test/regress/expected/create_table.out|  14 +-
 src/test/regress/expected/inherit.out | 123 ++--
 .../regress/expected/partition_aggregate.out  |  10 +-
 src/test/regress/expected/partition_join.out  |  42 +-
 src/test/regress/expected/partition_prune.out | 587 ++
 src/test/regress/expected/rowsecurity.out |  12 +-
 src/test/regress/expected/update.out  |   4 +-
 12 files changed, 322 insertions(+), 530 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 90db550b92..dbbae1820e 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -629,12 +629,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
- QUERY PLAN  
--
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
+QUERY PLAN
+--
  Foreign Scan on public.ft1 t1
Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 83971665e3..08aef9289e 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -304,7 +304,7 @@ RESET enable_nestloop;
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1;  -- OpExpr(l)
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!;   -- OpExpr(r)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 6da0dcd61c..a9171c075c 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -387,6 +387,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 		switch (rel->rtekind)
 		{
 			case RTE_RELATION:
+remove_restrictions_implied_by_constraints(root, rel, rte);
 if (rte->relkind == RELKIND_FOREIGN_TABLE)
 {
 	/* Foreign table */
@@ -1040,6 +1041,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 			set_dummy_rel_pathlist(childrel);
 			continue;
 		}
+		remove_restrictions_implied_by_constraints(root, childrel, childRTE);
 
 		/*
 		 * Constraint exclusion failed, so copy the parent's join quals and
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 25545029d7..45cd72a0fe 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -1557,6 +1557,51 @@ relation_excluded_by_constraints(PlannerInfo *root,
 	return false;
 }
 
+/*
+ * Remove from restrictions list items implied by table constraints
+ */
+void remove_restrictions_implied_by_constraints(PlannerInfo *root,
+RelOptInfo *rel, RangeTblEntry *rte)
+{
+	List	   *constraint_pred;
+	List	   *safe_constraints = NIL;
+	List	   *safe_restrictions = NIL;
+	ListCell   *lc;
+
+	if (rte->rtekind != RTE_RELATION || rte->inh)
+		return;
+
+	/*
+	 * 

Re: avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint

2020-07-14 Thread Justin Pryzby
Added here:
https://commitfest.postgresql.org/29/2644/

And updated tests to pass following:
|commit 689696c7110f148ede8004aae50d7543d05b5587
|Author: Tom Lane 
|Date:   Tue Jul 14 18:56:49 2020 -0400
|
|Fix bitmap AND/OR scans on the inside of a nestloop partition-wise join.
>From 5c323ffaa8c76e91fff6c9c77019c6d6ad4c627c Mon Sep 17 00:00:00 2001
From: Konstantin Knizhnik 
Date: Fri, 12 Oct 2018 15:53:51 +0300
Subject: [PATCH v2 1/2] Secondary index access optimizations

---
 .../postgres_fdw/expected/postgres_fdw.out|   8 +-
 contrib/postgres_fdw/sql/postgres_fdw.sql |   2 +-
 src/backend/optimizer/path/allpaths.c |   2 +
 src/backend/optimizer/util/plancat.c  |  45 ++
 src/include/optimizer/plancat.h   |   3 +
 src/test/regress/expected/create_table.out|  14 +-
 src/test/regress/expected/inherit.out | 123 ++--
 .../regress/expected/partition_aggregate.out  |  10 +-
 src/test/regress/expected/partition_join.out  |  42 +-
 src/test/regress/expected/partition_prune.out | 571 ++
 src/test/regress/expected/rowsecurity.out |  12 +-
 src/test/regress/expected/update.out  |   4 +-
 12 files changed, 315 insertions(+), 521 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 90db550b92..dbbae1820e 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -629,12 +629,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
- QUERY PLAN  
--
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
+QUERY PLAN
+--
  Foreign Scan on public.ft1 t1
Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 83971665e3..08aef9289e 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -304,7 +304,7 @@ RESET enable_nestloop;
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1;  -- OpExpr(l)
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!;   -- OpExpr(r)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 6da0dcd61c..a9171c075c 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -387,6 +387,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 		switch (rel->rtekind)
 		{
 			case RTE_RELATION:
+remove_restrictions_implied_by_constraints(root, rel, rte);
 if (rte->relkind == RELKIND_FOREIGN_TABLE)
 {
 	/* Foreign table */
@@ -1040,6 +1041,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 			set_dummy_rel_pathlist(childrel);
 			continue;
 		}
+		remove_restrictions_implied_by_constraints(root, childrel, childRTE);
 
 		/*
 		 * Constraint exclusion failed, so copy the parent's join quals and
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 25545029d7..45cd72a0fe 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -1557,6 +1557,51 @@ relation_excluded_by_constraints(PlannerInfo *root,
 	return false;
 }
 
+/*
+ * Remove from restrictions list items implied by table constraints
+ */
+void remove_restrictions_implied_by_constraints(PlannerInfo *root,
+RelOptInfo *rel, RangeTblEntry *rte)
+{
+	List	   

avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint

2020-07-03 Thread Justin Pryzby
(resending to the list)

Hi All

I started looking into Konstantin's 30 month old thread/patch:
|Re: [HACKERS] Secondary index access optimizations
https://www.postgresql.org/message-id/27516421-5afa-203c-e22a-8407e9187327%40postgrespro.ru

..to which David directed me 12 months ago:
|Subject: Re: scans on table fail to be excluded by partition bounds
https://www.postgresql.org/message-id/CAKJS1f_iOmCW11dFzifpDGUgSLAoSTDOjw2tcec%3D7Cgq%2BsR80Q%40mail.gmail.com

My complaint at the time was for a query plan like:

CREATE TABLE p (i int, j int) PARTITION BY RANGE(i);
SELECT format('CREATE TABLE p%s PARTITION OF p FOR VALUES FROM(%s)TO(%s)', i, 
10*(i-1), 10*i) FROM generate_series(1,10)i; \gexec
INSERT INTO p SELECT i%99, i%9 FROM generate_series(1,9)i;
VACUUM ANALYZE p;
CREATE INDEX ON p(i);
CREATE INDEX ON p(j);

postgres=# explain analyze SELECT * FROM p WHERE (i=10 OR i=20 OR i=30) AND j<2;
Append  (cost=28.51..283.25 rows=546 width=12) (actual time=0.100..1.364 
rows=546 loops=1)
  ->  Bitmap Heap Scan on p2  (cost=28.51..93.51 rows=182 width=12) (actual 
time=0.099..0.452 rows=182 loops=1)
Recheck Cond: ((i = 10) OR (i = 20) OR (i = 30))
Filter: (j < 2)
Rows Removed by Filter: 818
Heap Blocks: exact=45
->  BitmapOr  (cost=28.51..28.51 rows=1000 width=0) (actual 
time=0.083..0.083 rows=0 loops=1)
  ->  Bitmap Index Scan on p2_i_idx  (cost=0.00..19.79 rows=1000 
width=0) (actual time=0.074..0.074 rows=1000 loops=1)
Index Cond: (i = 10)
  ->  Bitmap Index Scan on p2_i_idx  (cost=0.00..4.29 rows=1 
width=0) (actual time=0.008..0.008 rows=0 loops=1)
Index Cond: (i = 20)
  ->  Bitmap Index Scan on p2_i_idx  (cost=0.00..4.29 rows=1 
width=0) (actual time=0.001..0.001 rows=0 loops=1)
Index Cond: (i = 30)
...

This 2nd and 3rd index scan on p2_i_idx are useless, and benign here, but
harmful if we have a large OR list.

I tried rebasing Konstantin's patch, but it didn't handle the case of
"refuting" inconsistent arms of an "OR" list, so I came up with this.  This
currently depends on the earlier patch only to call RelationGetPartitionQual,
so appears to be mostly a separate issue.

I believe the current behavior of "OR" lists is also causing another issue I
reported, which a customer hit again last week:
https://www.postgresql.org/message-id/20191216184906.ga2...@telsasoft.com
|ERROR: could not resize shared memory segment...No space left on device

When I looked into it, their explain(format text) was 50MB, due to a list of
~500 "OR" conditions, *each* of which was causing an index scan for each of
~500 partitions, where only one index scan per partition was needed or useful,
all the others being inconsistent with the partition constraint.  Thus the
query ultimately errors when it exceeds a resource limit (maybe no surprise
with 8500 index scans).

Here, I was trying to create a test case reproducing that error to see if this
resolves it, but so far hasn't failed. Tomas has a reproducer with a different
(much simpler) plan, though.

CREATE TABLE p (i int, j int) PARTITION BY RANGE(i);
\pset pager off
SELECT format('CREATE TABLE p%s PARTITION OF p FOR VALUES FROM(%s)TO(%s)', i, 
10*(i-1), 10*i) FROM generate_series(1,500)i;
\timing off
\set quiet
\set echo always
\gexec
\timing on
INSERT INTO p SELECT i%5000, i%500 FROM generate_series(1,999)i;
VACUUM ANALYZE p;
CREATE INDEX ON p(i);
CREATE INDEX ON p(j);
SELECT format('explain analyze SELECT * FROM p WHERE %s', 
array_to_string(array_agg('i='||(i*10)::text),' OR ')) FROM 
generate_series(1,500)i;

-- 
Justin
>From bd1f9f93f9bec0585a79bcdf932d1a5e2c0001a0 Mon Sep 17 00:00:00 2001
From: Konstantin Knizhnik 
Date: Fri, 12 Oct 2018 15:53:51 +0300
Subject: [PATCH 1/2] Secondary index access optimizations

---
 .../postgres_fdw/expected/postgres_fdw.out|   8 +-
 contrib/postgres_fdw/sql/postgres_fdw.sql |   2 +-
 src/backend/optimizer/path/allpaths.c |   2 +
 src/backend/optimizer/util/plancat.c  |  45 ++
 src/include/optimizer/plancat.h   |   3 +
 src/test/regress/expected/create_table.out|  14 +-
 src/test/regress/expected/inherit.out | 123 ++--
 .../regress/expected/partition_aggregate.out  |  10 +-
 src/test/regress/expected/partition_join.out  |  38 +-
 src/test/regress/expected/partition_prune.out | 571 ++
 src/test/regress/expected/rowsecurity.out |  12 +-
 src/test/regress/expected/update.out  |   4 +-
 12 files changed, 313 insertions(+), 519 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 82fc1290ef..7ab8639dc7 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -629,12 +629,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu
Remote SQL: SELECT "C 1", c2, c3,