Re: missing indexes in indexlist with partitioned tables

Arne Roland Tue, 02 Aug 2022 16:07:33 -0700

Hi!


Attached a rebased version of the patch.


Regards
Arne


________________________________
From: Arne Roland
Sent: Monday, January 31, 2022 19:14
To: Amit Langote
Cc: Zhihong Yu; Alvaro Herrera; Julien Rouhaud; pgsql-hackers
Subject: Re: missing indexes in indexlist with partitioned tables

Hi!

From: Amit Langote <amitlangot...@gmail.com>
Sent: Tuesday, January 25, 2022 09:04
Subject: Re: missing indexes in indexlist with partitioned tables
>  [...]
> "partindexlist" really made me think about a list of "partial indexes"
> for some reason.  I think maybe "partedindexlist" is what you are
> looking for; "parted" is commonly used as short for "partitioned" when
> naming variables.
>
> The comment only mentions "further pruning" as to what partitioned
> indexes are to be remembered in RelOptInfo, but it's not clear what
> that means.  It may help to be more specific.

Thanks for the feedback! I've changed that. The current version is attached.

> Finally, I don't understand why we need a separate field to store
> indexes found in partitioned base relations.  AFAICS, nothing but the
> sites you are interested in (relation_has_unique_index_for() and
> rel_supports_distinctness()) would ever bother to look at a
> partitioned base relation's indexlist.  Do you think putting them into
> in indexlist might break something?

I have thought about that before. AFAICT there is nothing in core, which 
breaks. However I am not sure, I want to mix those two kinds of index nodes. 
First of all the structure is different, partedIndexes don't have physical 
attributes after all. This is technical implementation detail relating to the 
current promise, that entries of the indexlist are indexes we can use. And by 
use, I mean use for statistics or the executor.
I'm more concerned about future changes regarding the order and optimization of 
processing harder here. The order in which we do things in the planner is a bit 
messy, and I wouldn't mind seeing details about that change. Looking at the 
current wacky order in the optimizer, I'm not convinced, that nothing will want 
to have a look at the indexlist, before partitioned tables are unpacked.

Since it would be easy to introduce this new variable later, wouldn't mind 
adding it to the indexlist directly for now. But changing the underlying 
promise of what it contains, seems noteworthy and more intrusive to me.

> > Side note: I personally think the name inhparent is mildly confusing, since 
> > it's not really about inheritance. I don't have a significantly better idea 
> > though.
>
> Partitioned tables are "inheritance parent", so share the same code as
> what traditional inheritance parents have always used for planning.

I recall that manual partitioning via inheritance, that was cumbersome. Though 
that minor historical detail was not, what I was referring to. There are a lot 
of other cases, that cause us to set inhparent. IIRC we use this flag in some 
ddl commands, which have nothing to do with inheritance. It essentially is used 
as a variant to skip the indexlist creation. If such hacks weren't there, we 
could simply check for the relkind and indisunique.

Regards
Arne

diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 7d176e7b00..31cdc3a7f7 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -3502,7 +3502,7 @@ relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
 	Assert(list_length(exprlist) == list_length(oprlist));
 
 	/* Short-circuit if no indexes... */
-	if (rel->indexlist == NIL)
+	if (rel->indexlist == NIL && rel->partedIndexlist == NIL)
 		return false;
 
 	/*
@@ -3547,7 +3547,7 @@ relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
 		return false;
 
 	/* Examine each index of the relation ... */
-	foreach(ic, rel->indexlist)
+	foreach(ic, list_concat(rel->indexlist, rel->partedIndexlist))
 	{
 		IndexOptInfo *ind = (IndexOptInfo *) lfirst(ic);
 		int			c;
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 337f470d58..16ce443ec9 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -23,6 +23,7 @@
 #include "postgres.h"
 
 #include "nodes/nodeFuncs.h"
+#include "nodes/nodes.h"
 #include "optimizer/clauses.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/optimizer.h"
@@ -598,7 +599,7 @@ rel_supports_distinctness(PlannerInfo *root, RelOptInfo *rel)
 		 */
 		ListCell   *lc;
 
-		foreach(lc, rel->indexlist)
+		foreach(lc, list_concat(rel->indexlist, rel->partedIndexlist))
 		{
 			IndexOptInfo *ind = (IndexOptInfo *) lfirst(lc);
 
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 5012bfe142..7cf85875d5 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -116,8 +116,8 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 {
 	Index		varno = rel->relid;
 	Relation	relation;
-	bool		hasindex;
 	List	   *indexinfos = NIL;
+	List	   *partedIndexinfos = NIL;
 
 	/*
 	 * We need not lock the relation since it was already locked, either by
@@ -154,17 +154,8 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 	/* Retrieve the parallel_workers reloption, or -1 if not set. */
 	rel->rel_parallel_workers = RelationGetParallelWorkers(relation, -1);
 
-	/*
-	 * Make list of indexes.  Ignore indexes on system catalogs if told to.
-	 * Don't bother with indexes for an inheritance parent, either.
-	 */
-	if (inhparent ||
-		(IgnoreSystemIndexes && IsSystemRelation(relation)))
-		hasindex = false;
-	else
-		hasindex = relation->rd_rel->relhasindex;
-
-	if (hasindex)
+	/* Make list of indexes. Ignore indexes on system catalogs if told to. */
+	if (!(IgnoreSystemIndexes && IsSystemRelation(relation)) && relation->rd_rel->relhasindex)
 	{
 		List	   *indexoidlist;
 		LOCKMODE	lmode;
@@ -213,10 +204,13 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 			}
 
 			/*
-			 * Ignore partitioned indexes, since they are not usable for
-			 * queries.
+			 * Don't add partitioned indexes to the indexlist, since they are
+			 * not usable by the executor. If they are unique add them to the
+			 * partedIndexlist instead, to use for further pruning. That is
+			 * relevant for the join pruning, if the outer relation is partitioned.
+			 * If they aren't that either, simply skip them.
 			 */
-			if (indexRelation->rd_rel->relkind == RELKIND_PARTITIONED_INDEX)
+			if (inhparent && (!index->indisunique || indexRelation->rd_rel->relkind != RELKIND_PARTITIONED_INDEX))
 			{
 				index_close(indexRelation, NoLock);
 				continue;
@@ -264,7 +258,40 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 				info->indexcollations[i] = indexRelation->rd_indcollation[i];
 			}
 
-			info->relam = indexRelation->rd_rel->relam;
+			/*
+			 * Fetch the index expressions and predicate, if any.  We must
+			 * modify the copies we obtain from the relcache to have the
+			 * correct varno for the parent relation, so that they match up
+			 * correctly against qual clauses.
+			 */
+			info->indexprs = RelationGetIndexExpressions(indexRelation);
+			info->indpred = RelationGetIndexPredicate(indexRelation);
+			if (info->indexprs && varno != 1)
+				ChangeVarNodes((Node *) info->indexprs, 1, varno, 0);
+			if (info->indpred && varno != 1)
+				ChangeVarNodes((Node *) info->indpred, 1, varno, 0);
+
+			info->unique = index->indisunique;
+			info->immediate = index->indimmediate;
+
+			/*
+			 * Don't add partitioned indexes to the indexlist, add them to the
+			 * partedIndexlist instead, since they are not usable by the
+			 * executor.
+			 */
+			if (indexRelation->rd_rel->relkind == RELKIND_PARTITIONED_INDEX)
+			{
+				index_close(indexRelation, NoLock);
+				partedIndexinfos = lappend(partedIndexinfos, info);
+				continue;
+			}
+
+			info->hypothetical = false;
+			info->indrestrictinfo = NIL;	/* set later, in indxpath.c */
+			info->predOK = false;	/* set later, in indxpath.c */
+
+			/* Build targetlist using the completed indexprs data */
+			info->indextlist = build_index_tlist(root, info, relation);
 
 			/* We copy just the fields we need, not all of rd_indam */
 			amroutine = indexRelation->rd_indam;
@@ -284,6 +311,8 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 			/* Fetch index opclass options */
 			info->opclassoptions = RelationGetIndexAttOptions(indexRelation, true);
 
+			info->relam = indexRelation->rd_rel->relam;
+
 			/*
 			 * Fetch the ordering information for the index, if any.
 			 */
@@ -370,28 +399,6 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 				info->nulls_first = NULL;
 			}
 
-			/*
-			 * Fetch the index expressions and predicate, if any.  We must
-			 * modify the copies we obtain from the relcache to have the
-			 * correct varno for the parent relation, so that they match up
-			 * correctly against qual clauses.
-			 */
-			info->indexprs = RelationGetIndexExpressions(indexRelation);
-			info->indpred = RelationGetIndexPredicate(indexRelation);
-			if (info->indexprs && varno != 1)
-				ChangeVarNodes((Node *) info->indexprs, 1, varno, 0);
-			if (info->indpred && varno != 1)
-				ChangeVarNodes((Node *) info->indpred, 1, varno, 0);
-
-			/* Build targetlist using the completed indexprs data */
-			info->indextlist = build_index_tlist(root, info, relation);
-
-			info->indrestrictinfo = NIL;	/* set later, in indxpath.c */
-			info->predOK = false;	/* set later, in indxpath.c */
-			info->unique = index->indisunique;
-			info->immediate = index->indimmediate;
-			info->hypothetical = false;
-
 			/*
 			 * Estimate the index size.  If it's not a partial index, we lock
 			 * the number-of-tuples estimate to equal the parent table; if it
@@ -441,6 +448,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 	}
 
 	rel->indexlist = indexinfos;
+	rel->partedIndexlist = partedIndexinfos;
 
 	rel->statlist = get_relation_statistics(rel, relation);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 3b065139e6..b23bbe2097 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -871,6 +871,8 @@ typedef struct RelOptInfo
 	Relids		lateral_referencers;
 	/* list of IndexOptInfo */
 	List	   *indexlist;
+	/* list of IndexOptInfo for partioned indexes */
+	List	   *partedIndexlist;
 	/* list of StatisticExtInfo */
 	List	   *statlist;
 	/* size estimates derived from pg_class */
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index 03926a8413..522cf3ec94 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -4855,14 +4855,42 @@ SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2
 CREATE TABLE fract_t (id BIGINT, PRIMARY KEY (id)) PARTITION BY RANGE (id);
 CREATE TABLE fract_t0 PARTITION OF fract_t FOR VALUES FROM ('0') TO ('1000');
 CREATE TABLE fract_t1 PARTITION OF fract_t FOR VALUES FROM ('1000') TO ('2000');
+CREATE TABLE fract_x (id BIGINT, PRIMARY KEY (id)) PARTITION BY RANGE (id);
+CREATE TABLE fract_x0 PARTITION OF fract_x FOR VALUES FROM ('0') TO ('1000');
+CREATE TABLE fract_x1 PARTITION OF fract_x FOR VALUES FROM ('1000') TO ('2000');
 -- insert data
 INSERT INTO fract_t (id) (SELECT generate_series(0, 1999));
 ANALYZE fract_t;
--- verify plan; nested index only scans
+INSERT INTO fract_x (id) (SELECT generate_series(0, 1999));
+ANALYZE fract_x;
+SET max_parallel_workers_per_gather = 0;
+SET enable_partitionwise_join = on;
+-- verify partition pruning
 SET max_parallel_workers_per_gather = 0;
 SET enable_partitionwise_join = on;
 EXPLAIN (COSTS OFF)
-SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id ASC LIMIT 10;
+SELECT x.id FROM fract_x x LEFT JOIN fract_t y USING (id) ORDER BY id ASC LIMIT 10;
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Limit
+   ->  Append
+         ->  Index Only Scan using fract_x0_pkey on fract_x0 x_1
+         ->  Index Only Scan using fract_x1_pkey on fract_x1 x_2
+(4 rows)
+
+EXPLAIN (COSTS OFF)
+SELECT x.id FROM fract_x x LEFT JOIN fract_t y USING (id) ORDER BY id ASC LIMIT 10;
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Limit
+   ->  Append
+         ->  Index Only Scan using fract_x0_pkey on fract_x0 x_1
+         ->  Index Only Scan using fract_x1_pkey on fract_x1 x_2
+(4 rows)
+
+-- verify plan; nested index only scans
+EXPLAIN (COSTS OFF)
+SELECT x.id, y.id FROM fract_x x LEFT JOIN fract_t y USING (id) ORDER BY x.id ASC LIMIT 10;
                               QUERY PLAN                               
 -----------------------------------------------------------------------
  Limit
@@ -4870,32 +4898,33 @@ SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id ASC LIMIT 10;
          Sort Key: x.id
          ->  Merge Left Join
                Merge Cond: (x_1.id = y_1.id)
-               ->  Index Only Scan using fract_t0_pkey on fract_t0 x_1
+               ->  Index Only Scan using fract_x0_pkey on fract_x0 x_1
                ->  Index Only Scan using fract_t0_pkey on fract_t0 y_1
          ->  Merge Left Join
                Merge Cond: (x_2.id = y_2.id)
-               ->  Index Only Scan using fract_t1_pkey on fract_t1 x_2
+               ->  Index Only Scan using fract_x1_pkey on fract_x1 x_2
                ->  Index Only Scan using fract_t1_pkey on fract_t1 y_2
 (11 rows)
 
 EXPLAIN (COSTS OFF)
-SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id DESC LIMIT 10;
+SELECT x.id, y.id FROM fract_x x LEFT JOIN fract_t y USING (id) ORDER BY x.id DESC LIMIT 10;
                                    QUERY PLAN                                   
 --------------------------------------------------------------------------------
  Limit
    ->  Merge Append
          Sort Key: x.id DESC
          ->  Nested Loop Left Join
-               ->  Index Only Scan Backward using fract_t0_pkey on fract_t0 x_1
+               ->  Index Only Scan Backward using fract_x0_pkey on fract_x0 x_1
                ->  Index Only Scan using fract_t0_pkey on fract_t0 y_1
                      Index Cond: (id = x_1.id)
          ->  Nested Loop Left Join
-               ->  Index Only Scan Backward using fract_t1_pkey on fract_t1 x_2
+               ->  Index Only Scan Backward using fract_x1_pkey on fract_x1 x_2
                ->  Index Only Scan using fract_t1_pkey on fract_t1 y_2
                      Index Cond: (id = x_2.id)
 (11 rows)
 
 -- cleanup
 DROP TABLE fract_t;
+DROP TABLE fract_x;
 RESET max_parallel_workers_per_gather;
 RESET enable_partitionwise_join;
diff --git a/src/test/regress/sql/partition_join.sql b/src/test/regress/sql/partition_join.sql
index 67f506361f..39e23911d0 100644
--- a/src/test/regress/sql/partition_join.sql
+++ b/src/test/regress/sql/partition_join.sql
@@ -1148,22 +1148,38 @@ CREATE TABLE fract_t (id BIGINT, PRIMARY KEY (id)) PARTITION BY RANGE (id);
 CREATE TABLE fract_t0 PARTITION OF fract_t FOR VALUES FROM ('0') TO ('1000');
 CREATE TABLE fract_t1 PARTITION OF fract_t FOR VALUES FROM ('1000') TO ('2000');
 
+CREATE TABLE fract_x (id BIGINT, PRIMARY KEY (id)) PARTITION BY RANGE (id);
+CREATE TABLE fract_x0 PARTITION OF fract_x FOR VALUES FROM ('0') TO ('1000');
+CREATE TABLE fract_x1 PARTITION OF fract_x FOR VALUES FROM ('1000') TO ('2000');
 -- insert data
 INSERT INTO fract_t (id) (SELECT generate_series(0, 1999));
 ANALYZE fract_t;
+INSERT INTO fract_x (id) (SELECT generate_series(0, 1999));
+ANALYZE fract_x;
 
--- verify plan; nested index only scans
 SET max_parallel_workers_per_gather = 0;
 SET enable_partitionwise_join = on;
 
+-- verify partition pruning
+SET max_parallel_workers_per_gather = 0;
+SET enable_partitionwise_join = on;
+
+EXPLAIN (COSTS OFF)
+SELECT x.id FROM fract_x x LEFT JOIN fract_t y USING (id) ORDER BY id ASC LIMIT 10;
+
+EXPLAIN (COSTS OFF)
+SELECT x.id FROM fract_x x LEFT JOIN fract_t y USING (id) ORDER BY id ASC LIMIT 10;
+
+-- verify plan; nested index only scans
 EXPLAIN (COSTS OFF)
-SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id ASC LIMIT 10;
+SELECT x.id, y.id FROM fract_x x LEFT JOIN fract_t y USING (id) ORDER BY x.id ASC LIMIT 10;
 
 EXPLAIN (COSTS OFF)
-SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id) ORDER BY id DESC LIMIT 10;
+SELECT x.id, y.id FROM fract_x x LEFT JOIN fract_t y USING (id) ORDER BY x.id DESC LIMIT 10;
 
 -- cleanup
 DROP TABLE fract_t;
+DROP TABLE fract_x;
 
 RESET max_parallel_workers_per_gather;
 RESET enable_partitionwise_join;

Re: missing indexes in indexlist with partitioned tables

Reply via email to