Hi,
I would like to discuss the introduction of a hook for evaluating the
selectivity of an expression when searching for an optimal query plan.
This topic has been brought up in various discussions, for example, in [1].
Currently, extensions that interact with the optimiser can only add
their paths without the ability to influence the optimiser's decisions.
As a result, when developing an extension that implements a new type of
statistics (such as a histogram for composite types), utilises knowledge
from previously executed queries, or implements some system of
selectivity hints, we find ourselves writing a considerable amount of
code. To ensure the reliable operation of the extension, this may end up
in developing a separate optimiser or, at the very least, creating a
custom join search (refer to core.c in the pg_hint_plan extension for an
estimation of the amount of code required).
A hook for evaluating selectivity could streamline the development of
methods to improve selectivity evaluation, making it easier to create
new types of statistics and estimation methods (I would like to deal
with join clauses estimation). Considering the limited amount of code
involved and the upcoming code freeze, I propose adding such a hook to
PostgreSQL 18 to assess how it simplifies extension development.
This proposed hook would complement the existing path hooks without
overlapping in functionality. In my experience with implementing
adaptive features in enterprise solutions, I believe that additional
hooks could also be beneficial for estimating the number of groups and
the amount of memory allocated, which is currently based solely on
work_mem. However, these suggestions do not interfere with the current
proposal and could be considered later.
Critique:
In general, a hook for evaluating the number of rows appears to be a
more promising approach. It would allow the extension to access specific
RelOptInfo data, thus providing insights into where the evaluation takes
place within the plan. Consequently, this would enable a deeper
influence on the query plan choice. However, implementing such a hook
might be more invasive, requiring modifications to each cost function.
Additionally, it addresses a slightly different issue and can be
considered separately.
Attached is a patch containing the proposed hook code.
--
regards, Andrei Lepikhov
From 7113a4d0b45a4cf00b78076d55570ade60ff9841 Mon Sep 17 00:00:00 2001
From: "Andrei V. Lepikhov" <lepi...@gmail.com>
Date: Wed, 5 Mar 2025 09:38:30 +0100
Subject: [PATCH] Introduce selectivity hook.
---
src/backend/optimizer/path/clausesel.c | 5 +++++
src/backend/utils/adt/selfuncs.c | 1 +
src/include/utils/selfuncs.h | 11 +++++++++++
3 files changed, 17 insertions(+)
diff --git a/src/backend/optimizer/path/clausesel.c
b/src/backend/optimizer/path/clausesel.c
index 5d51f97f219..6b5d49d0786 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -128,6 +128,11 @@ clauselist_selectivity_ext(PlannerInfo *root,
ListCell *l;
int listidx;
+ if (clauselist_selectivity_hook)
+ return clauselist_selectivity_hook(root, clauses, varRelid,
jointype,
+
sjinfo, &estimatedclauses,
+
use_extended_stats);
+
/*
* If there's exactly one clause, just go directly to
* clause_selectivity_ext(). None of what we might do below is relevant.
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index c2918c9c831..e66346ef6b4 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -147,6 +147,7 @@
/* Hooks for plugins to get control when we ask for stats */
get_relation_stats_hook_type get_relation_stats_hook = NULL;
get_index_stats_hook_type get_index_stats_hook = NULL;
+clauselist_selectivity_hook_type clauselist_selectivity_hook = NULL;
static double eqsel_internal(PG_FUNCTION_ARGS, bool negate);
static double eqjoinsel_inner(Oid opfuncoid, Oid collation,
diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h
index d35939651f3..ce4ba27d183 100644
--- a/src/include/utils/selfuncs.h
+++ b/src/include/utils/selfuncs.h
@@ -149,6 +149,17 @@ typedef bool (*get_index_stats_hook_type) (PlannerInfo
*root,
VariableStatData *vardata);
extern PGDLLIMPORT get_index_stats_hook_type get_index_stats_hook;
+/* Hooks for plugins to get control when we ask for selectivity estimation */
+typedef Selectivity (*clauselist_selectivity_hook_type) (
+
PlannerInfo *root,
+
List *clauses,
+
int varRelid,
+
JoinType jointype,
+
SpecialJoinInfo *sjinfo,
+
Bitmapset **estimatedclauses,
+
bool use_extended_stats);
+extern PGDLLIMPORT clauselist_selectivity_hook_type
clauselist_selectivity_hook;
+
/* Functions in selfuncs.c */
extern void examine_variable(PlannerInfo *root, Node *node, int varRelid,
--
2.48.1