Hi Hackers,
Here are some documentation patches about inlining SQL-language functions. Postgres has been able to
inline both scalar and set-returning functions since the 9.x days (as long as they are LANGUAGE SQL
and meet a bunch of other conditions). But this was never documented outside of a wiki page[1]. This
is a very significant optimization, especially for SRFs with qual pushdown, but it is largely
unknown. I recently checked five books on advanced Postgres usage, some specifically on query
optimization, and none of them mentioned this feature. I think we should talk about it.
Putting this in our docs also gives us a useful reference point for some support requests that do
similar inlining, but with more user control. We already have a support request to let users inline
single-result SQL functions[2], and I have a patch to do something similar for set-returning SQL
functions[3]. I gave a talk at Postgres Extensions Day in Montreal that ties all this together,[4]
and our docs could do something similar.
The first patch just adds <sect2> elements to break up the Function Optimization section into two
sub-sections: one covering declarative annotations and another covering support functions. No
rewriting is needed.
The second patch adds a new <sect2> explaining how we inline SQL functions: both single-result and
set-returning. Since this happens automatically, it makes a nice progression with the (easy)
declarative annotations and the (hard) support functions.
I wonder if we should have some tests about this behavior also? I'm happy to add those, either as
part of this commitfest entry or something separate.
[1] https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions
[2]
https://github.com/postgres/postgres/blob/bd3f59fdb71721921bb0aca7e16d483f72e95779/src/include/nodes/supportnodes.h#L64
[3] https://commitfest.postgresql.org/patch/5083/
[4] https://github.com/pjungwir/inlining-postgres-functions
Yours,
--
Paul ~{:-)
p...@illuminatedcomputing.com
From c32a09320a6320bcc8a681829c03690556760ffe Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth" <p...@illuminatedcomputing.com>
Date: Tue, 1 Jul 2025 20:10:18 -0700
Subject: [PATCH v1 1/2] Split Function Optimization section into parts
We already separate declarative function annotations from dynamic
information provided by support functions. We can make these
sub-sections for better clarity and linkability. And this structure will
help keep things accessible when we introduce a new section about
function inlining.
Author: Paul A. Jungwirth <p...@illuminatedcomputing.com>
---
doc/src/sgml/xfunc.sgml | 200 ++++++++++++++++++++++------------------
1 file changed, 108 insertions(+), 92 deletions(-)
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 2d81afce8cb..74740b4e345 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -4097,107 +4097,123 @@ extern PgStat_Kind pgstat_register_kind(PgStat_Kind kind,
knowledge that helps the planner optimize function calls.
</para>
- <para>
- Some basic facts can be supplied by declarative annotations provided in
- the <link linkend="sql-createfunction"><command>CREATE FUNCTION</command></link> command. Most important of
- these is the function's <link linkend="xfunc-volatility">volatility
- category</link> (<literal>IMMUTABLE</literal>, <literal>STABLE</literal>,
- or <literal>VOLATILE</literal>); one should always be careful to
- specify this correctly when defining a function.
- The parallel safety property (<literal>PARALLEL
- UNSAFE</literal>, <literal>PARALLEL RESTRICTED</literal>, or
- <literal>PARALLEL SAFE</literal>) must also be specified if you hope
- to use the function in parallelized queries.
- It can also be useful to specify the function's estimated execution
- cost, and/or the number of rows a set-returning function is estimated
- to return. However, the declarative way of specifying those two
- facts only allows specifying a constant value, which is often
- inadequate.
- </para>
+ <sect2 id="xfunc-annotations">
+ <title>Function Annotations</title>
+ <indexterm>
+ <primary>function</primary>
+ <secondary>annotations</secondary>
+ </indexterm>
- <para>
- It is also possible to attach a <firstterm>planner support
- function</firstterm> to an SQL-callable function (called
- its <firstterm>target function</firstterm>), and thereby provide
- knowledge about the target function that is too complex to be
- represented declaratively. Planner support functions have to be
- written in C (although their target functions might not be), so this is
- an advanced feature that relatively few people will use.
- </para>
+ <para>
+ Some basic facts can be supplied by declarative annotations provided in
+ the <link linkend="sql-createfunction"><command>CREATE FUNCTION</command></link> command. Most important of
+ these is the function's <link linkend="xfunc-volatility">volatility
+ category</link> (<literal>IMMUTABLE</literal>, <literal>STABLE</literal>,
+ or <literal>VOLATILE</literal>); one should always be careful to
+ specify this correctly when defining a function.
+ The parallel safety property (<literal>PARALLEL
+ UNSAFE</literal>, <literal>PARALLEL RESTRICTED</literal>, or
+ <literal>PARALLEL SAFE</literal>) must also be specified if you hope
+ to use the function in parallelized queries.
+ It can also be useful to specify the function's estimated execution
+ cost, and/or the number of rows a set-returning function is estimated
+ to return. However, the declarative way of specifying those two
+ facts only allows specifying a constant value, which is often
+ inadequate.
+ </para>
+ </sect2>
- <para>
- A planner support function must have the SQL signature
+ <sect2 id="xfunc-support-functions">
+ <title>Function Support Functions</title>
+ <indexterm>
+ <primary>function</primary>
+ <secondary>support functions</secondary>
+ </indexterm>
+
+ <para>
+ It is also possible to attach a <firstterm>planner support
+ function</firstterm> to an SQL-callable function (called
+ its <firstterm>target function</firstterm>), and thereby provide
+ knowledge about the target function that is too complex to be
+ represented declaratively. Planner support functions have to be
+ written in C (although their target functions might not be), so this is
+ an advanced feature that relatively few people will use.
+ </para>
+
+ <para>
+ A planner support function must have the SQL signature
<programlisting>
supportfn(internal) returns internal
</programlisting>
- It is attached to its target function by specifying
- the <literal>SUPPORT</literal> clause when creating the target function.
- </para>
+ It is attached to its target function by specifying
+ the <literal>SUPPORT</literal> clause when creating the target function.
+ </para>
- <para>
- The details of the API for planner support functions can be found in
- file <filename>src/include/nodes/supportnodes.h</filename> in the
- <productname>PostgreSQL</productname> source code. Here we provide
- just an overview of what planner support functions can do.
- The set of possible requests to a support function is extensible,
- so more things might be possible in future versions.
- </para>
+ <para>
+ The details of the API for planner support functions can be found in
+ file <filename>src/include/nodes/supportnodes.h</filename> in the
+ <productname>PostgreSQL</productname> source code. Here we provide
+ just an overview of what planner support functions can do.
+ The set of possible requests to a support function is extensible,
+ so more things might be possible in future versions.
+ </para>
- <para>
- Some function calls can be simplified during planning based on
- properties specific to the function. For example,
- <literal>int4mul(n, 1)</literal> could be simplified to
- just <literal>n</literal>. This type of transformation can be
- performed by a planner support function, by having it implement
- the <literal>SupportRequestSimplify</literal> request type.
- The support function will be called for each instance of its target
- function found in a query parse tree. If it finds that the particular
- call can be simplified into some other form, it can build and return a
- parse tree representing that expression. This will automatically work
- for operators based on the function, too — in the example just
- given, <literal>n * 1</literal> would also be simplified to
- <literal>n</literal>.
- (But note that this is just an example; this particular
- optimization is not actually performed by
- standard <productname>PostgreSQL</productname>.)
- We make no guarantee that <productname>PostgreSQL</productname> will
- never call the target function in cases that the support function could
- simplify. Ensure rigorous equivalence between the simplified
- expression and an actual execution of the target function.
- </para>
+ <para>
+ Some function calls can be simplified during planning based on
+ properties specific to the function. For example,
+ <literal>int4mul(n, 1)</literal> could be simplified to
+ just <literal>n</literal>. This type of transformation can be
+ performed by a planner support function, by having it implement
+ the <literal>SupportRequestSimplify</literal> request type.
+ The support function will be called for each instance of its target
+ function found in a query parse tree. If it finds that the particular
+ call can be simplified into some other form, it can build and return a
+ parse tree representing that expression. This will automatically work
+ for operators based on the function, too — in the example just
+ given, <literal>n * 1</literal> would also be simplified to
+ <literal>n</literal>.
+ (But note that this is just an example; this particular
+ optimization is not actually performed by
+ standard <productname>PostgreSQL</productname>.)
+ We make no guarantee that <productname>PostgreSQL</productname> will
+ never call the target function in cases that the support function could
+ simplify. Ensure rigorous equivalence between the simplified
+ expression and an actual execution of the target function.
+ </para>
- <para>
- For target functions that return <type>boolean</type>, it is often useful to estimate
- the fraction of rows that will be selected by a <literal>WHERE</literal> clause using that
- function. This can be done by a support function that implements
- the <literal>SupportRequestSelectivity</literal> request type.
- </para>
+ <para>
+ For target functions that return <type>boolean</type>, it is often useful to estimate
+ the fraction of rows that will be selected by a <literal>WHERE</literal> clause using that
+ function. This can be done by a support function that implements
+ the <literal>SupportRequestSelectivity</literal> request type.
+ </para>
- <para>
- If the target function's run time is highly dependent on its inputs,
- it may be useful to provide a non-constant cost estimate for it.
- This can be done by a support function that implements
- the <literal>SupportRequestCost</literal> request type.
- </para>
+ <para>
+ If the target function's run time is highly dependent on its inputs,
+ it may be useful to provide a non-constant cost estimate for it.
+ This can be done by a support function that implements
+ the <literal>SupportRequestCost</literal> request type.
+ </para>
- <para>
- For target functions that return sets, it is often useful to provide
- a non-constant estimate for the number of rows that will be returned.
- This can be done by a support function that implements
- the <literal>SupportRequestRows</literal> request type.
- </para>
+ <para>
+ For target functions that return sets, it is often useful to provide
+ a non-constant estimate for the number of rows that will be returned.
+ This can be done by a support function that implements
+ the <literal>SupportRequestRows</literal> request type.
+ </para>
- <para>
- For target functions that return <type>boolean</type>, it may be possible to
- convert a function call appearing in <literal>WHERE</literal> into an indexable operator
- clause or clauses. The converted clauses might be exactly equivalent
- to the function's condition, or they could be somewhat weaker (that is,
- they might accept some values that the function condition does not).
- In the latter case the index condition is said to
- be <firstterm>lossy</firstterm>; it can still be used to scan an index,
- but the function call will have to be executed for each row returned by
- the index to see if it really passes the <literal>WHERE</literal> condition or not.
- To create such conditions, the support function must implement
- the <literal>SupportRequestIndexCondition</literal> request type.
- </para>
+ <para>
+ For target functions that return <type>boolean</type>, it may be possible to
+ convert a function call appearing in <literal>WHERE</literal> into an indexable operator
+ clause or clauses. The converted clauses might be exactly equivalent
+ to the function's condition, or they could be somewhat weaker (that is,
+ they might accept some values that the function condition does not).
+ In the latter case the index condition is said to
+ be <firstterm>lossy</firstterm>; it can still be used to scan an index,
+ but the function call will have to be executed for each row returned by
+ the index to see if it really passes the <literal>WHERE</literal> condition or not.
+ To create such conditions, the support function must implement
+ the <literal>SupportRequestIndexCondition</literal> request type.
+ </para>
+ </sect2>
</sect1>
--
2.45.0
From 79d95ce36d9712110c7c84d2e964e1a7b3360d64 Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth" <p...@illuminatedcomputing.com>
Date: Tue, 1 Jul 2025 20:15:55 -0700
Subject: [PATCH v1 2/2] Document inlining SQL-language functions
Both single-result functions and set-returning functions can be inlined
(since the 9.x days), but this has never been documented outside of a
wiki page: https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions
This useful optimization seems largely unknown, even to many books about
Postgres query optimization, so we should include it in our
documentation.
Author: Paul A. Jungwirth <p...@illuminatedcomputing.com>
---
doc/src/sgml/xfunc.sgml | 52 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 51 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 74740b4e345..14a632b72d7 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -4097,6 +4097,56 @@ extern PgStat_Kind pgstat_register_kind(PgStat_Kind kind,
knowledge that helps the planner optimize function calls.
</para>
+ <sect2 id="xfunc-inlining">
+ <title>Function Inlining</title>
+ <indexterm>
+ <primary>function</primary>
+ <secondary>inlining</secondary>
+ </indexterm>
+
+ <para>
+ Even with no extra information, the planner may be able to inline the function
+ into the calling query. The rules vary depending on whether the function returns
+ a single result or is a <link linkend="queries-tablefunctions">set-returning function</link>.
+ But in all cases the function must be implemented in SQL (not PL/pgSQL).
+ It must not be <literal>SECURITY DEFINER</literal>.
+ And if an extension has hooked function entry/exit,
+ then inlining must be skipped.
+ </para>
+
+ <para>
+ For single-result functions, the function body must be a single
+ <literal>SELECT <replaceable>expression</replaceable></literal> statement
+ returning a single column.
+ It must not return a <literal>RECORD</literal>.
+ It must return a type that matches the function declaration.
+ It cannot recurse. It must not include CTEs, a <literal>FROM</literal> clause,
+ references to tables or table-like objects, <literal>DISTINCT</literal>,
+ <literal>GROUP BY</literal>, <literal>HAVING</literal>,
+ aggregate functions, window functions,
+ <literal>ORDER BY</literal>, <literal>LIMIT</literal>, <literal>OFFSET</literal>,
+ <literal>UNION</literal>, <literal>INTERSECT</literal>, or <literal>EXCEPT</literal>.
+ Its arguments, if used more than once in its body, cannot include <literal>VOLATILE</literal> functions.
+ The hypothetical inlined expression must be no more volatile than the original function
+ (so an <literal>IMMUTABLE</literal> function must inline to an <literal>IMMUTABLE</literal>
+ expression, and a <literal>STABLE</literal> function must inline to <literal>STABLE</literal> or <literal>IMMUTABLE</literal>).
+ If the original function was <literal>STRICT</literal>, then any called functions must be <literal>STRICT</literal>.
+ For more control, see <link linkend="xfunc-support-request-simplify"><literal>SupportRequestSimplify</literal></link>.
+ </para>
+
+ <para>
+ For set-returning functions, inlining lets the planner merge the query into the
+ outer query, enabling optimizations like qual pushdown, constant folding, etc.
+ The function body must be a single <literal>SELECT</literal> statement.
+ It must be declared <literal>STABLE</literal> or <literal>IMMUTABLE</literal>.
+ It must not be <literal>STRICT</literal>.
+ In addition its arguments may not include volatile function calls or
+ sub-queries. The function must be called from the <literal>FROM</literal> clause,
+ not the <literal>SELECT</literal> clause, nor with <literal>ORDINALITY</literal> or
+ <literal>ROWS FROM</literal>.
+ </para>
+ </sect2>
+
<sect2 id="xfunc-annotations">
<title>Function Annotations</title>
<indexterm>
@@ -4158,7 +4208,7 @@ supportfn(internal) returns internal
so more things might be possible in future versions.
</para>
- <para>
+ <para id="xfunc-support-request-simplify">
Some function calls can be simplified during planning based on
properties specific to the function. For example,
<literal>int4mul(n, 1)</literal> could be simplified to
--
2.45.0