[PoC] Reducing planning time on tables with many indexes

David Geier Wed, 27 Jul 2022 05:43:02 -0700

Hi hackers,

We came across a slowdown in planning, where queries use tables with many
indexes. In setups with wide tables it is not uncommon to have easily
10-100 indexes on a single table. The slowdown is already visible in serial
workloads with just a handful of indexes, but gets drastically amplified
when running queries with more indexes in parallel at high throughput.


We measured the TPS and planning time of running parallel streams of simple
point look-up queries on a single empty table with 60 columns and 60
indexes. The query used is 'SELECT * FROM synth_table WHERE col5 = 42'. No
rows are returned because the table is empty. We used a machine with 64
physical CPU cores. The schema and sysbench script to reproduce these
numbers are attached. We used the TPS as reported by sysbench and obtained
planning time by running 'EXPLAIN ANALYZE' on the same query in a
separately opened connection. We averaged the planning time of 3 successive
'EXPLAIN ANALYZE' runs. sysbench ran on the same machine with varying
numbers of threads using the following command line:

sysbench repro.lua --db-driver=pgsql --pgsql-host=localhost
--pgsql-db=postgres --pgsql-port=? --pgsql-user=? --pgsql-password=?
--report-interval=1 --threads=64 run

The following table shows the results. It is clearly visible that the TPS
flatten out already at 8 parallel streams, while the planning time is
increasing drastically.

Parallel streams | TPS (before) | Planning time (before)
-----------------|--------------|-----------------------
1                |  5,486       | 0.13 ms
2                |  8,098       | 0.22 ms
4                | 15,013       | 0.19 ms
8                | 27,107       | 0.29 ms
16               | 30,938       | 0.43 ms
32               | 26,330       | 1.68 ms
64               | 24,314       | 2.48 ms

We tracked down the root cause of this slowdown to lock contention in
'get_relation_info()'. The index lock of every single index of every single
table used in that query is acquired. We attempted a fix by pre-filtering
out all indexes that anyways cannot be used with a certain query, without
taking the index locks (credits to Luc Vlaming for idea and
implementation). The patch does so by caching the columns present in every
index, inside 'struct Relation', similarly to 'rd_indexlist'. Then, before
opening (= locking) the indexes in 'get_relation_info()', we check if the
index can actually contribute to the query and if not it is discarded right
away. Caching the index info saves considerable work for every query run
subsequently, because less indexes must be inspected and thereby locked.
This way we also save cycles in any code that later on goes over all
relation indexes.

The work-in-progress version of the patch is attached. It is still fairly
rough (e.g. uses a global variable, selects the best index in scans without
restrictions by column count instead of physical column size, is missing
some renaming, etc.), but shows the principle.

The following table shows the TPS, planning time and speed-ups after
applying the patch and rerunning above described benchmark. Now, the
planning time remains roughly constant and TPS roughly doubles each time
the number of parallel streams is doubled. The higher the stream count the
more severe the lock contention is and the more pronounced the gained
speed-up gets. Interestingly, even for a single query stream the speed-up
in planning time is already very significant. This applies also for lower
index counts. For example just with 10 indexes the TPS for a single query
stream goes from 9,159 to 12,558. We can do more measurements if there is
interest in details for a lower number of indexes.

Parallel streams | TPS (after) | Planning time (after) | Speed-up TPS |
Speed-up planning
-----------------|-------------|-----------------------|--------------|------------------
1                |  10,344     | 0.046                 |  1.9x        |
 2.8x
2                |  20,140     | 0.045 ms              |  2.5x        |
 4.9x
4                |  40,349     | 0.047 ms              |  2.7x        |
 4.0x
8                |  80,121     | 0.046 ms              |  3.0x        |
 6.3x
16               | 152,632     | 0.051 ms              |  4.9x        |
 8.4x
32               | 301,359     | 0.052 ms              | 11.4x        |
32.3x
64               | 525,115     | 0.062 ms              | 21.6x        |
40.0x

We are happy to receive your feedback and polish up the patch.

--
David Geier
(ServiceNow)

From 60e300b5cac8f527e61483296df81ab783670ac9 Mon Sep 17 00:00:00 2001
From: David Geier <david.ge...@servicenow.com>
Date: Fri, 15 Jul 2022 11:53:27 +0200
Subject: [PATCH] Index filtering

---
 src/backend/optimizer/path/Makefile           |   3 +-
 src/backend/optimizer/path/index_filtering.c  | 220 ++++++++++++++++++
 src/backend/optimizer/plan/planmain.c         |   7 +
 src/backend/optimizer/util/plancat.c          | 124 ++++++----
 src/backend/utils/cache/relcache.c            |  17 +-
 src/include/optimizer/planner_index_locking.h |  40 ++++
 6 files changed, 361 insertions(+), 50 deletions(-)
 create mode 100644 src/backend/optimizer/path/index_filtering.c
 create mode 100644 src/include/optimizer/planner_index_locking.h

diff --git a/src/backend/optimizer/path/Makefile b/src/backend/optimizer/path/Makefile
index 1e199ff66f..933c4df8f5 100644
--- a/src/backend/optimizer/path/Makefile
+++ b/src/backend/optimizer/path/Makefile
@@ -21,6 +21,7 @@ OBJS = \
 	joinpath.o \
 	joinrels.o \
 	pathkeys.o \
-	tidpath.o
+	tidpath.o \
+	index_filtering.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/optimizer/path/index_filtering.c b/src/backend/optimizer/path/index_filtering.c
new file mode 100644
index 0000000000..b9d05f85f2
--- /dev/null
+++ b/src/backend/optimizer/path/index_filtering.c
@@ -0,0 +1,220 @@
+#include <postgres.h> // NOTE: this include MUST come first
+
+#include <access/htup_details.h>
+#include <access/sysattr.h>
+#include <catalog/pg_am.h>
+#include <catalog/pg_class.h>
+#include <nodes/pathnodes.h>
+#include <optimizer/optimizer.h>
+#include <optimizer/planner_index_locking.h>
+#include <utils/builtins.h>
+#include <utils/catcache.h>
+#include <utils/rel.h>
+#include <utils/syscache.h>
+
+bool FilterIndexes = false;
+
+// similar to e.g. RelationGetPrimaryKeyIndex
+List *s64_RelationGetIndexBitmapList(Relation relation)
+{
+    if (!relation->rd_indexvalid)
+        list_free(RelationGetIndexList(relation));
+
+    return list_copy(relation->rd_indexprs);
+}
+
+IndexBitmapset *s64_RelationBuildIndexBitmapset(HeapTuple htup)
+{
+    Form_pg_index   index         = (Form_pg_index)GETSTRUCT(htup);
+    Bitmapset      *keys          = NULL;
+    IndexBitmapset *bitmap        = NULL;
+    MemoryContext   oldcxt;
+    int             i;
+
+    for (i = 0; i < index->indnatts; ++i)
+        if (index->indkey.values[i] != 0)
+            keys = bms_add_member(keys,
+                                  index->indkey.values[i] - FirstLowInvalidHeapAttributeNumber);
+
+    // Find also all columns referenced by any expression
+    // From RelationGetIndexExpressions and inlined so that we don't need extra locks
+    if (!heap_attisnull(htup, Anum_pg_index_indexprs, NULL))
+    {
+        bool  isNull;
+        Datum exprsDatum =
+                heap_getattr(htup, Anum_pg_index_indexprs, GetPgIndexDescriptor(), &isNull);
+        char *exprsString = TextDatumGetCString(exprsDatum);
+        Node *exprs       = (Node *)stringToNode(exprsString);
+        pull_varattnos(exprs, 1, &keys);
+        pfree(exprsString);
+        pfree(exprs);
+    }
+
+    // Also find any column referenced by any predicate
+    if (!heap_attisnull(htup, Anum_pg_index_indpred, NULL))
+    {
+        bool  isNull;
+        Datum predDatum =
+                heap_getattr(htup, Anum_pg_index_indpred, GetPgIndexDescriptor(), &isNull);
+        char *predString = TextDatumGetCString(predDatum);
+        Node *pred       = (Node *)stringToNode(predString);
+        pull_varattnos(pred, 1, &keys);
+        pfree(predString);
+        pfree(pred);
+    }
+
+    oldcxt                = MemoryContextSwitchTo(CacheMemoryContext);
+    bitmap                = palloc0(sizeof(IndexBitmapset));
+    bitmap->Index         = index->indexrelid;
+    bitmap->Keys          = bms_copy(keys);
+    MemoryContextSwitchTo(oldcxt);
+    bms_free(keys);
+    return bitmap;
+}
+
+// The best index for an index-only scan is the index that:
+// - is the smallest
+// - has all the fields so that it can actually be an index-only scan.
+Oid s64_RelationGetBestIndexForIndexOnly(Relation relation, Bitmapset *required)
+{
+    int       smallest  = INDEX_MAX_KEYS + 1;
+    Oid       bestIndex = InvalidOid;
+    ListCell *lc;
+
+    if (!relation->rd_indexvalid)
+        list_free(RelationGetIndexList(relation));
+
+    foreach (lc, relation->rd_indexprs)
+    {
+        IndexBitmapset *bitmap = (IndexBitmapset *)lfirst(lc);
+
+        if (required == NULL || bms_is_subset(required, bitmap->Keys))
+        {
+            if (bms_num_members(bitmap->Keys) < smallest)
+            {
+                bestIndex = bitmap->Index;
+                smallest  = bms_num_members(bitmap->Keys);
+            }
+        }
+    }
+
+    return bestIndex;
+}
+
+int list_bitmapset_qcmp(const void *p1, const void *p2)
+{
+    return list_bitmapset_cmp(*(ListCell **)p1, *(ListCell **)p2);
+}
+
+int list_bitmapset_cmp(const ListCell *p1, const ListCell *p2)
+{
+    const IndexBitmapset *v1 = (IndexBitmapset *)(lfirst(p1));
+    const IndexBitmapset *v2 = (IndexBitmapset *)(lfirst(p2));
+    if (v1->Index < v2->Index)
+        return -1;
+    if (v1->Index > v2->Index)
+        return 1;
+    return 0;
+}
+
+Bitmapset *s64_RelationUsedClauses(PlannerInfo *root, RelOptInfo *rel)
+{
+    ListCell  *lc1  = NULL;
+    ListCell  *lc2  = NULL;
+    Bitmapset *used = NULL;
+
+    // All columns we filter on
+    foreach (lc1, rel->baserestrictinfo)
+    {
+        RestrictInfo *rinfo = (RestrictInfo *)lfirst(lc1);
+        pull_varattnos((Node *)(rinfo->clause), rel->relid, &used);
+    }
+
+    // All columns explicitly used in a join
+    foreach (lc1, rel->joininfo)
+    {
+        RestrictInfo *rinfo = (RestrictInfo *)lfirst(lc1);
+        pull_varattnos((Node *)(rinfo->clause), rel->relid, &used);
+    }
+
+    // All columns used in any ordering.
+    // Query pathkeys contains any useful ordering for query_planner, so we don't have to figure
+    // out which ordering would be most useful (grouping, window, distinct, sort, etc).
+    foreach (lc1, root->query_pathkeys)
+    {
+        PathKey *pathKey = (PathKey *)lfirst(lc1);
+        foreach (lc2, pathKey->pk_eclass->ec_members)
+        {
+            EquivalenceMember *ecMember = (EquivalenceMember *)lfirst(lc2);
+            if (bms_is_member(rel->relid, ecMember->em_relids))
+                pull_varattnos((Node *)(ecMember->em_expr), rel->relid, &used);
+        }
+    }
+
+    // All useful equivalences there might be, can be used e.g. when you have an index
+    // which can return ordered data for a join indirectly (because the data is somehow
+    // ordered the same way, e.g. because of a PK)
+    foreach (lc1, root->eq_classes)
+    {
+        EquivalenceClass *ec = (EquivalenceClass *)lfirst(lc1);
+        foreach (lc2, ec->ec_members)
+        {
+            EquivalenceMember *ecMember = (EquivalenceMember *)lfirst(lc2);
+            if (bms_is_member(rel->relid, ecMember->em_relids))
+                pull_varattnos((Node *)(ecMember->em_expr), rel->relid, &used);
+        }
+    }
+
+    return used;
+}
+
+Bitmapset *s64_RelationUsedTList(PlannerInfo *root, RelOptInfo *rel)
+{
+    Bitmapset *used = NULL;
+
+    pull_varattnos((Node *)(rel->reltarget->exprs), rel->relid, &used);
+    return used;
+}
+
+// Any index that cannot provide a reduction of the data read is unnecessary.
+// An index can filter out (enough) data when:
+// 1. the columns used in the clauses that the index covers can filter
+//    out enough rows to not hit all pages of the table
+// 2. the index has all fields required to allow an index-only scan,
+//    the data is all-visible, and we want actual fields.
+// 3. we don't want any fields and this is the smallest index.
+//    this is used in a `SELECT COUNT(*) FROM`
+// 4. the index can provide the data ordered on a column that is useful
+bool s64_IsUnnecessaryIndex(PlannerInfo    *root,
+                            IndexBitmapset *bitmap,
+                            Bitmapset      *clauses,
+                            Bitmapset      *all,
+                            Oid             smallestIndex)
+{
+    if (!FilterIndexes)
+        return false;
+
+    // for safety only do this on SELECT statements
+    if (root->parse->commandType != CMD_SELECT)
+        return false;
+
+    // approximation of case 1 and 4. doesn't (yet) take into account visibility fraction
+    if (bms_overlap(clauses, bitmap->Keys))
+        return false;
+
+    if (!bms_is_empty(all))
+    {
+        // case 2.
+        if (bms_is_subset(all, bitmap->Keys))
+            return false;
+    }
+    else
+    {
+        // case 3
+        if (bitmap->Index == smallestIndex)
+            return false;
+    }
+
+    // TestCountIndexesFiltered++;
+    return true;
+}
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 273ac0acf7..370711ca50 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -29,6 +29,7 @@
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/planmain.h"
+#include "optimizer/planner_index_locking.h"
 
 
 /*
@@ -163,6 +164,7 @@ query_planner(PlannerInfo *root,
 	 * for rels not actively part of the query, for example views.  We don't
 	 * want to make RelOptInfos for them.
 	 */
+	FilterIndexes = true;
 	add_base_rels_to_query(root, (Node *) parse->jointree);
 
 	/*
@@ -213,6 +215,11 @@ query_planner(PlannerInfo *root,
 	 */
 	fix_placeholder_input_needed_levels(root);
 
+	/* add all required indexes for the baserels, so that the following functions have the required indexes:
+	 * reduce_unique_semijoins, remove_useless_joins. This is therefore the latest we can do this. */
+	s64_add_indexes(root);
+	FilterIndexes = false;
+
 	/*
 	 * Remove any useless outer joins.  Ideally this would be done during
 	 * jointree preprocessing, but the necessary information isn't available
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index c5194fdbbf..225b9f9a8e 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -53,6 +53,8 @@
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
 
+#include "optimizer/planner_index_locking.h"
+
 /* GUC parameter */
 int			constraint_exclusion = CONSTRAINT_EXCLUSION_PARTITION;
 
@@ -113,10 +115,7 @@ void
 get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 				  RelOptInfo *rel)
 {
-	Index		varno = rel->relid;
 	Relation	relation;
-	bool		hasindex;
-	List	   *indexinfos = NIL;
 
 	/*
 	 * We need not lock the relation since it was already locked, either by
@@ -150,9 +149,60 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 		estimate_rel_size(relation, rel->attr_widths - rel->min_attr,
 						  &rel->pages, &rel->tuples, &rel->allvisfrac);
 
+	if (!FilterIndexes)
+		s64_add_indexes_for_rel(root, relation->rd_id, inhparent, rel);
+
 	/* Retrieve the parallel_workers reloption, or -1 if not set. */
 	rel->rel_parallel_workers = RelationGetParallelWorkers(relation, -1);
 
+	rel->statlist = get_relation_statistics(rel, relation);
+
+	/* Grab foreign-table info using the relcache, while we have it */
+	if (relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
+	{
+		rel->serverid = GetForeignServerIdByRelId(RelationGetRelid(relation));
+		rel->fdwroutine = GetFdwRoutineForRelation(relation, true);
+	}
+	else
+	{
+		rel->serverid = InvalidOid;
+		rel->fdwroutine = NULL;
+	}
+
+	/* Collect info about relation's foreign keys, if relevant */
+	get_relation_foreign_keys(root, rel, relation, inhparent);
+
+	/* Collect info about functions implemented by the rel's table AM. */
+	if (relation->rd_tableam &&
+		relation->rd_tableam->scan_set_tidrange != NULL &&
+		relation->rd_tableam->scan_getnextslot_tidrange != NULL)
+		rel->amflags |= AMFLAG_HAS_TID_RANGE;
+
+	/*
+	 * Collect info about relation's partitioning scheme, if any. Only
+	 * inheritance parents may be partitioned.
+	 */
+	if (inhparent && relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
+		set_relation_partition_info(root, rel, relation);
+
+	table_close(relation, NoLock);
+
+	/*
+	 * Allow a plugin to editorialize on the info we obtained from the
+	 * catalogs.  Actions might include altering the assumed relation size,
+	 * removing an index, or adding a hypothetical index to the indexlist.
+	 */
+	if (get_relation_info_hook)
+		(*get_relation_info_hook) (root, relationObjectId, inhparent, rel);
+}
+
+void s64_add_indexes_for_rel(PlannerInfo* root, Oid relationObjectId, bool inhparent, RelOptInfo *rel)
+{
+	Relation relation = table_open(relationObjectId, NoLock);
+	List	   *indexinfos = NIL;
+	Index		varno = rel->relid;
+	bool		hasindex;
+
 	/*
 	 * Make list of indexes.  Ignore indexes on system catalogs if told to.
 	 * Don't bother with indexes for an inheritance parent, either.
@@ -165,11 +215,15 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 
 	if (hasindex)
 	{
-		List	   *indexoidlist;
+		List	   *index_bitmaps;
 		LOCKMODE	lmode;
 		ListCell   *l;
+		Bitmapset* requiredTList = s64_RelationUsedTList(root, rel);
+		Bitmapset* requiredClauses = s64_RelationUsedClauses(root, rel);
+		Bitmapset* required = bms_union(requiredTList, requiredClauses);
+		Oid smallestIndex = s64_RelationGetBestIndexForIndexOnly(relation, required);
 
-		indexoidlist = RelationGetIndexList(relation);
+		index_bitmaps = s64_RelationGetIndexBitmapList(relation);
 
 		/*
 		 * For each index, we get the same type of lock that the executor will
@@ -181,9 +235,10 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 		 */
 		lmode = root->simple_rte_array[varno]->rellockmode;
 
-		foreach(l, indexoidlist)
+		foreach(l, index_bitmaps)
 		{
-			Oid			indexoid = lfirst_oid(l);
+			IndexBitmapset* bitmap = (IndexBitmapset*) lfirst(l);
+			Oid		indexoid = bitmap->Index;
 			Relation	indexRelation;
 			Form_pg_index index;
 			IndexAmRoutine *amroutine;
@@ -192,6 +247,9 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 						nkeycolumns;
 			int			i;
 
+			if (s64_IsUnnecessaryIndex(root, bitmap, requiredClauses, required, smallestIndex))
+				continue;
+
 			/*
 			 * Extract info from the relation descriptor for the index.
 			 */
@@ -436,51 +494,27 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 			indexinfos = lcons(info, indexinfos);
 		}
 
-		list_free(indexoidlist);
+		list_free(index_bitmaps);
+		bms_free(requiredTList);
+		bms_free(requiredClauses);
+		bms_free(required);
 	}
 
 	rel->indexlist = indexinfos;
 
-	rel->statlist = get_relation_statistics(rel, relation);
+	table_close(relation, NoLock);
+}
 
-	/* Grab foreign-table info using the relcache, while we have it */
-	if (relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
-	{
-		rel->serverid = GetForeignServerIdByRelId(RelationGetRelid(relation));
-		rel->fdwroutine = GetFdwRoutineForRelation(relation, true);
-	}
-	else
+void s64_add_indexes(PlannerInfo *root)
+{
+	for (int i = 0; i < root->simple_rel_array_size; ++i)
 	{
-		rel->serverid = InvalidOid;
-		rel->fdwroutine = NULL;
+		RangeTblEntry* rte = root->simple_rte_array[i];
+		RelOptInfo* rel = root->simple_rel_array[i];
+		if (rel != NULL && rte->rtekind == RTE_RELATION && rel->indexlist == NULL)
+			s64_add_indexes_for_rel(root, rte->relid, rte->inh, rel);
 	}
-
-	/* Collect info about relation's foreign keys, if relevant */
-	get_relation_foreign_keys(root, rel, relation, inhparent);
-
-	/* Collect info about functions implemented by the rel's table AM. */
-	if (relation->rd_tableam &&
-		relation->rd_tableam->scan_set_tidrange != NULL &&
-		relation->rd_tableam->scan_getnextslot_tidrange != NULL)
-		rel->amflags |= AMFLAG_HAS_TID_RANGE;
-
-	/*
-	 * Collect info about relation's partitioning scheme, if any. Only
-	 * inheritance parents may be partitioned.
-	 */
-	if (inhparent && relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
-		set_relation_partition_info(root, rel, relation);
-
-	table_close(relation, NoLock);
-
-	/*
-	 * Allow a plugin to editorialize on the info we obtained from the
-	 * catalogs.  Actions might include altering the assumed relation size,
-	 * removing an index, or adding a hypothetical index to the indexlist.
-	 */
-	if (get_relation_info_hook)
-		(*get_relation_info_hook) (root, relationObjectId, inhparent, rel);
-}
+ }
 
 /*
  * get_relation_foreign_keys -
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index dcf56d4790..cc8b1c3cab 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -86,6 +86,7 @@
 #include "utils/resowner_private.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
+#include "optimizer/planner_index_locking.h"
 
 #define RELCACHE_INIT_FILEMAGIC		0x573266	/* version ID value */
 
@@ -290,7 +291,6 @@ static void write_item(const void *data, Size len, FILE *fp);
 static void formrdesc(const char *relationName, Oid relationReltype,
 					  bool isshared, int natts, const FormData_pg_attribute *attrs);
 
-static HeapTuple ScanPgRelation(Oid targetRelId, bool indexOK, bool force_non_historic);
 static Relation AllocateRelationDesc(Form_pg_class relp);
 static void RelationParseRelOptions(Relation relation, HeapTuple tuple);
 static void RelationBuildTupleDesc(Relation relation);
@@ -298,7 +298,6 @@ static Relation RelationBuildDesc(Oid targetRelId, bool insertIt);
 static void RelationInitPhysicalAddr(Relation relation);
 static void load_critical_index(Oid indexoid, Oid heapoid);
 static TupleDesc GetPgClassDescriptor(void);
-static TupleDesc GetPgIndexDescriptor(void);
 static void AttrDefaultFetch(Relation relation, int ndef);
 static int	AttrDefaultCmp(const void *a, const void *b);
 static void CheckConstraintFetch(Relation relation);
@@ -330,7 +329,7 @@ static void unlink_initfile(const char *initfilename, int elevel);
  *		NB: the returned tuple has been copied into palloc'd storage
  *		and must eventually be freed with heap_freetuple.
  */
-static HeapTuple
+HeapTuple
 ScanPgRelation(Oid targetRelId, bool indexOK, bool force_non_historic)
 {
 	HeapTuple	pg_class_tuple;
@@ -4334,7 +4333,7 @@ GetPgClassDescriptor(void)
 	return pgclassdesc;
 }
 
-static TupleDesc
+TupleDesc
 GetPgIndexDescriptor(void)
 {
 	static TupleDesc pgindexdesc = NULL;
@@ -4680,6 +4679,7 @@ RelationGetIndexList(Relation relation)
 	ScanKeyData skey;
 	HeapTuple	htup;
 	List	   *result;
+	List	   *bitmaps;
 	List	   *oldlist;
 	char		replident = relation->rd_rel->relreplident;
 	Oid			pkeyIndex = InvalidOid;
@@ -4697,6 +4697,7 @@ RelationGetIndexList(Relation relation)
 	 * if we get some sort of error partway through.
 	 */
 	result = NIL;
+	bitmaps = NIL;
 
 	/* Prepare to scan pg_index for entries having indrelid = this rel. */
 	ScanKeyInit(&skey,
@@ -4724,6 +4725,10 @@ RelationGetIndexList(Relation relation)
 		/* add index's OID to result list */
 		result = lappend_oid(result, index->indexrelid);
 
+		/* build bitmap and add to bitmaps list, but only for valid indexes */
+		if (index->indisvalid)
+			bitmaps = lappend(bitmaps, s64_RelationBuildIndexBitmapset(htup));
+
 		/*
 		 * Invalid, non-unique, non-immediate or predicate indexes aren't
 		 * interesting for either oid indexes or replication identity indexes,
@@ -4750,10 +4755,14 @@ RelationGetIndexList(Relation relation)
 	/* Sort the result list into OID order, per API spec. */
 	list_sort(result, list_oid_cmp);
 
+	/* sort the bitmaps into OID order, per API spec. */
+	list_sort(bitmaps, list_bitmapset_cmp);
+
 	/* Now save a copy of the completed list in the relcache entry. */
 	oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
 	oldlist = relation->rd_indexlist;
 	relation->rd_indexlist = list_copy(result);
+	relation->rd_indexprs = list_copy(bitmaps);
 	relation->rd_pkindex = pkeyIndex;
 	if (replident == REPLICA_IDENTITY_DEFAULT && OidIsValid(pkeyIndex))
 		relation->rd_replidindex = pkeyIndex;
diff --git a/src/include/optimizer/planner_index_locking.h b/src/include/optimizer/planner_index_locking.h
new file mode 100644
index 0000000000..27adec56d7
--- /dev/null
+++ b/src/include/optimizer/planner_index_locking.h
@@ -0,0 +1,40 @@
+#pragma once
+
+#include <postgres.h>
+
+#include <nodes/plannodes.h>
+#include <optimizer/paths.h>
+#include <utils/rel.h>
+
+extern bool FilterIndexes;
+
+// introduces a bitmap that caches which index covers which columns, so that we can filter the
+// possible indexes we should add to any RelOptInfo* to only be the ones that have any overlap with
+// the requested columns for a relation.
+
+extern TupleDesc GetPgIndexDescriptor(void);
+extern HeapTuple ScanPgRelation(Oid targetRelId, bool indexOK, bool force_non_historic);
+extern List     *s64_RelationGetIndexBitmapList(Relation relation);
+extern Oid       s64_RelationGetBestIndexForIndexOnly(Relation relation, Bitmapset *required);
+extern void      s64_add_indexes(PlannerInfo *root);
+extern void      s64_add_indexes_for_rel(PlannerInfo *root,
+                                         Oid          relationObjectId,
+                                         bool         inhparent,
+                                         RelOptInfo  *rel);
+
+typedef struct IndexBitmapset
+{
+    Oid        Index;
+    Bitmapset *Keys;
+} IndexBitmapset;
+extern IndexBitmapset *s64_RelationBuildIndexBitmapset(HeapTuple htup);
+extern bool            s64_IsUnnecessaryIndex(PlannerInfo    *root,
+                                              IndexBitmapset *bitmap,
+                                              Bitmapset      *clauses,
+                                              Bitmapset      *all,
+                                              Oid             smallestIndex);
+
+extern int        list_bitmapset_qcmp(const void *p1, const void *p2);
+extern int        list_bitmapset_cmp(const ListCell *p1, const ListCell *p2);
+extern Bitmapset *s64_RelationUsedTList(PlannerInfo *root, RelOptInfo *rel);
+extern Bitmapset *s64_RelationUsedClauses(PlannerInfo *root, RelOptInfo *rel);
-- 
2.34.1

schema.sql
Description: application/sql

function thread_init()
  drv = sysbench.sql.driver()
  con = drv:connect()
end

function thread_done()
  con:disconnect()
end

function event()
  con:query('SELECT * FROM synth_table WHERE col05 = 52')
end

[PoC] Reducing planning time on tables with many indexes

Reply via email to