Hi,

I would like to discuss the introduction of a hook for evaluating the selectivity of an expression when searching for an optimal query plan. This topic has been brought up in various discussions, for example, in [1].

Currently, extensions that interact with the optimiser can only add their paths without the ability to influence the optimiser's decisions. As a result, when developing an extension that implements a new type of statistics (such as a histogram for composite types), utilises knowledge from previously executed queries, or implements some system of selectivity hints, we find ourselves writing a considerable amount of code. To ensure the reliable operation of the extension, this may end up in developing a separate optimiser or, at the very least, creating a custom join search (refer to core.c in the pg_hint_plan extension for an estimation of the amount of code required).

A hook for evaluating selectivity could streamline the development of methods to improve selectivity evaluation, making it easier to create new types of statistics and estimation methods (I would like to deal with join clauses estimation). Considering the limited amount of code involved and the upcoming code freeze, I propose adding such a hook to PostgreSQL 18 to assess how it simplifies extension development.

This proposed hook would complement the existing path hooks without overlapping in functionality. In my experience with implementing adaptive features in enterprise solutions, I believe that additional hooks could also be beneficial for estimating the number of groups and the amount of memory allocated, which is currently based solely on work_mem. However, these suggestions do not interfere with the current proposal and could be considered later.

Critique:
In general, a hook for evaluating the number of rows appears to be a more promising approach. It would allow the extension to access specific RelOptInfo data, thus providing insights into where the evaluation takes place within the plan. Consequently, this would enable a deeper influence on the query plan choice. However, implementing such a hook might be more invasive, requiring modifications to each cost function. Additionally, it addresses a slightly different issue and can be considered separately.

Attached is a patch containing the proposed hook code.

--
regards, Andrei Lepikhov
From 7113a4d0b45a4cf00b78076d55570ade60ff9841 Mon Sep 17 00:00:00 2001
From: "Andrei V. Lepikhov" <lepi...@gmail.com>
Date: Wed, 5 Mar 2025 09:38:30 +0100
Subject: [PATCH] Introduce selectivity hook.

---
 src/backend/optimizer/path/clausesel.c |  5 +++++
 src/backend/utils/adt/selfuncs.c       |  1 +
 src/include/utils/selfuncs.h           | 11 +++++++++++
 3 files changed, 17 insertions(+)

diff --git a/src/backend/optimizer/path/clausesel.c 
b/src/backend/optimizer/path/clausesel.c
index 5d51f97f219..6b5d49d0786 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -128,6 +128,11 @@ clauselist_selectivity_ext(PlannerInfo *root,
        ListCell   *l;
        int                     listidx;
 
+       if (clauselist_selectivity_hook)
+               return clauselist_selectivity_hook(root, clauses, varRelid, 
jointype,
+                                                                               
   sjinfo, &estimatedclauses,
+                                                                               
   use_extended_stats);
+
        /*
         * If there's exactly one clause, just go directly to
         * clause_selectivity_ext(). None of what we might do below is relevant.
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index c2918c9c831..e66346ef6b4 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -147,6 +147,7 @@
 /* Hooks for plugins to get control when we ask for stats */
 get_relation_stats_hook_type get_relation_stats_hook = NULL;
 get_index_stats_hook_type get_index_stats_hook = NULL;
+clauselist_selectivity_hook_type clauselist_selectivity_hook = NULL;
 
 static double eqsel_internal(PG_FUNCTION_ARGS, bool negate);
 static double eqjoinsel_inner(Oid opfuncoid, Oid collation,
diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h
index d35939651f3..ce4ba27d183 100644
--- a/src/include/utils/selfuncs.h
+++ b/src/include/utils/selfuncs.h
@@ -149,6 +149,17 @@ typedef bool (*get_index_stats_hook_type) (PlannerInfo 
*root,
                                                                                
   VariableStatData *vardata);
 extern PGDLLIMPORT get_index_stats_hook_type get_index_stats_hook;
 
+/* Hooks for plugins to get control when we ask for selectivity estimation */
+typedef Selectivity (*clauselist_selectivity_hook_type) (
+                                                                               
                PlannerInfo *root,
+                                                                               
                List *clauses,
+                                                                               
                int varRelid,
+                                                                               
                JoinType jointype,
+                                                                               
                SpecialJoinInfo *sjinfo,
+                                                                               
                Bitmapset **estimatedclauses,
+                                                                               
                bool use_extended_stats);
+extern PGDLLIMPORT clauselist_selectivity_hook_type 
clauselist_selectivity_hook;
+
 /* Functions in selfuncs.c */
 
 extern void examine_variable(PlannerInfo *root, Node *node, int varRelid,
-- 
2.48.1

Reply via email to