On Fri, Feb 27, 2026 at 3:46 PM Robert Haas <[email protected]> wrote:
> On Thu, Feb 26, 2026 at 8:55 AM Robert Haas <[email protected]> wrote:
> > Thanks, Alex, for the review.
>
> Here's v18. In addition to fixing the problems pointed out by Alex,
> there are a couple of significant changes in this version.
>
>
I have a mind to walk through the readmes and sgmls but its going to be in
chunks. Here's one for the readme for pg_plan_advice with a couple of
preliminary sgml changes.
David J.
diff --git a/contrib/pg_plan_advice/README b/contrib/pg_plan_advice/README
index 0b888fd82f2..9178e2e8c3b 100644
--- a/contrib/pg_plan_advice/README
+++ b/contrib/pg_plan_advice/README
@@ -26,6 +26,8 @@ Advice tags can also be applied to groups of relations; for example,
relation identifier "baz" as well as to the 2-item list containing
"bletch" and "quux".
+XXX: probably should remove "you", the document seems to prefer "the user"
+
Critically, this module knows both how to generate plan advice from an
already-existing plan, and also how to enforce it during future planning
cycles. Everything it does is intended to be "round-trip safe": if you
@@ -37,16 +39,17 @@ led to it being generated in the first place. Note that there is no
intention that these guarantees hold in the presence of intervening DDL;
e.g. if you change the properties of a function so that a subquery is no
longer inlined, or if you drop an index named in the plan advice, the advice
-isn't going to work any more. That's expected.
+isn't going to work any more. That's expected. It should be resilient to
+changes in the statistics, including any CREATE STATISTICS related changes.
This module aims to force the planner to follow any provided advice without
-regard to whether it is appears to be good advice or bad advice. If the
-user provides bad advice, whether derived from a previously-generated plan
-or manually written, they may get a bad plan. We regard this as user error,
+regard to whether it appears to be good advice or bad advice. If you
+provide bad advice, whether derived from a previously-generated plan
+or manually written, you may get a bad plan. We regard this as user error,
not a defect in this module. It seems likely that applying advice
-judiciously and only when truly required to avoid problems will be a more
-successful strategy than applying it with a broad brush, but users are free
-to experiment with whatever strategies they think best.
+conservatively, only when truly required to avoid problems, will be a more
+successful strategy than applying it with a broad brush, but you are free
+to experiment with whatever strategies you think best.
Relation Identifiers
====================
@@ -54,7 +57,7 @@ Relation Identifiers
Uniquely identifying the part of a query to which a certain piece of
advice applies is harder than it sounds. Our basic approach is to use
relation aliases as a starting point, and then disambiguate. There are
-three ways that same relation alias can occur multiple times:
+three ways that the same relation alias can occur multiple times:
1. It can appear in more than one subquery.
@@ -75,13 +78,13 @@ partitioned tables. When the generated occurrence number is 1, we omit
the occurrence number. The partition schema and partition name are included
only for children of partitioned tables. In generated advice, the
partition_schema is always included whenever there is a partition_name,
-but user-written advice may mention the name and omit the schema. The
-plan_name is omitted for the top-level PlannerInfo.
+but user-written advice may omit the schema. The plan_name is omitted
+for the top-level PlannerInfo.
Scan Advice
===========
-For many types of scan, no advice is generated or possible; for instance,
+For many scan types, no advice is generated or possible; for instance,
a subquery is always scanned using a subquery scan. While that scan may be
elided via setrefs processing, this doesn't change the fact that only one
basic approach exists. Hence, scan advice applies mostly to relations, which
@@ -94,8 +97,8 @@ which, in effect, is a degenerate scan of every relation in the collapsed
portion of the join tree. Similarly, it's possible to inject a custom scan
in such a way that it replaces an entire join. If we ever emit advice
for these cases, it would target sets of relation identifiers surrounded
-by parentheses, e.g. SOME_SORT_OF_SCAN(foo (bar baz)) would mean that the
-the given scan type would be used for foo as a single relation and also the
+by parentheses, e.g., SOME_SORT_OF_SCAN(foo (bar baz)) would mean that the
+given scan type would be used for foo as a single relation and also the
combination of bar and baz as a join product. We have no such cases at
present.
@@ -115,14 +118,14 @@ Join Order Advice
The JOIN_ORDER tag specifies the order in which several tables that are
part of the same join problem should be joined. Each subquery (except for
those that are inlined) is a separate join problem. Within a subquery,
-partitionwise joins can create additional, separate join problems. Hence,
+partitionwise joins can create independent join problems. Hence,
queries involving partitionwise joins may use JOIN_ORDER() many times.
We take the canonical join structure to be an outer-deep tree, so
JOIN_ORDER(t1 t2 t3) says that t1 is the driving table and should be joined
first to t2 and then to t3. If the join problem involves additional tables,
they can be joined in any order after the join between t1, t2, and t3 has
-been constructured. Generated join advice always mentions all tables
+been constructed. Generated join advice always mentions all tables
in the join problem, but manually written join advice need not do so.
For trees which are not outer-deep, parentheses can be used. For example,
@@ -135,17 +138,18 @@ For example, if t2 and t3 are being scanned by a single custom scan or foreign
scan, or if a partitionwise join is being performed between those tables, then
it's impossible to say that t2 is the outer table and t3 is the inner table,
or the other way around; it's just undefined. In such cases, we generate
-join advice that uses curly braces, intending to indicate a lack of ordering:
+join advice that uses curly braces, indicating an absence of ordering:
JOIN_ORDER(t1 {t2 t3}) says that the uppermost join should have t1 on the outer
side and some kind of join between t2 and t3 on the inner side, but without
-saying how that join must be performed or anything about which relation should
-appear on which side of the join, or even whether this kind of join has sides.
+saying how that join must be performed (including which relation should
+appear on which side of the join, or even whether this kind of join has sides).
Join Strategy Advice
====================
Tags such as NESTED_LOOP_PLAIN specify the method that should be used to
-perform a certain join. More specifically, NESTED_LOOP_PLAIN(x (y z)) says
+perform a certain join - with the target appearing directly on the inner side
+of the join list first. Thus, NESTED_LOOP_PLAIN(x (y z)) says
that the plan should put the relation whose identifier is "x" on the inner
side of a plain nested loop (one without materialization or memoization)
and that it should also put a join between the relation whose identifier is
@@ -157,22 +161,21 @@ table in the join problem.
Considering that we have both join order advice and join strategy advice,
it might seem natural to say that NESTED_LOOP_PLAIN(x) should be redefined
to mean that x should appear by itself on one side or the other of a nested
-loop, rather than specifically on the inner side, but this definition appears
-useless in practice. It gives the planner too much freedom to do things that
+loop, rather than specifically on the inner side, but this definition is
+problematic in practice. It gives the planner too much freedom to do things that
bear little resemblance to what the user probably had in mind. This makes
only a limited amount of practical difference in the case of a merge join or
unparameterized nested loop, but for a parameterized nested loop or a hash
-join, the two sides are treated very differently and saying that a certain
+join, the two sides are treated very differently, and saying that a certain
relation should be involved in one of those operations without saying which
role it should take isn't saying much.
-This choice of definition implies that join strategy advice also imposes some
+This definition compels join strategy advice to also impose
join order constraints. For example, given a join between foo and bar,
HASH_JOIN(bar) implies that foo is the driving table. Otherwise, it would
-be impossible to put bar beneath the inner side of a Hash Join.
-
-Note that, given this definition, it's reasonable to consider deleting the
-join order advice but applying the join strategy advice. For example,
+be impossible to put bar beneath the inner side of a Hash Join. Thus it's
+reasonable to consider deleting the
+join order advice and only apply the join strategy advice. For example,
consider a star schema with tables fact, dim1, dim2, dim3, dim4, and dim5.
The automatically generated advice might specify JOIN_ORDER(fact dim1 dim3
dim4 dim2 dim5) HASH_JOIN(dim2 dim4) NESTED_LOOP_PLAIN(dim1 dim3 dim5).
@@ -188,11 +191,9 @@ An essential guiding principle is that no inference may made on the basis
of the absence of advice. The user is entitled to remove any portion of the
generated advice which they deem unsuitable or counterproductive and the
result should only be to increase the flexibility afforded to the planner.
-This means that if advice can say that a certain optimization or technique
-should be used, it should also be able to say that the optimization or
-technique should not be used. We should never assume that the absence of an
-instruction to do a certain thing means that it should not be done; all
-instructions must be explicit.
+In other words, advice tags must define whether they encourage or discourage
+certain optimizations or techniques. (NO_GATHER is an example of the latter.
+There is no generic "NOT" syntax, e.g., NOT(HASH_JOIN(dim2 dim4).))
Semijoin Uniqueness
===================
@@ -236,7 +237,7 @@ For example, a user who specifies HASH_JOIN((foo bar)) is explicitly saying
that there should be a hash join with exactly foo and bar on the outer
side of it, but that also implies that foo and bar must be joined to
each other before either of them is joined to anything else. Otherwise,
-the join the user is attempting to constraint won't actually occur in the
+the join the user is attempting to constrain won't actually occur in the
query, which ends up looking like the system has just decided to ignore
the advice altogether.
@@ -253,7 +254,7 @@ rather than later.
We don't offer any control over estimates, only outcomes. It seems like a
good idea to incorporate that ability at some future point, as pg_hint_plan
-does. However, since primary goal of the initial development work is to be
+does. However, since the primary goal of the initial development work is to be
able to induce the planner to recreate a desired plan that worked well in
the past, this has not been included in the initial development effort.
diff --git a/doc/src/sgml/pgplanadvice.sgml b/doc/src/sgml/pgplanadvice.sgml
index 817bafa21c6..c962e565a3a 100644
--- a/doc/src/sgml/pgplanadvice.sgml
+++ b/doc/src/sgml/pgplanadvice.sgml
@@ -10,7 +10,7 @@
<para>
The <filename>pg_plan_advice</filename> module allows key planner decisions
to be described, reproduced, and altered using a special-purpose "plan
- advice" mini-language. It is intended to allow stabilization of plan choices
+ advice" domain specific language (DSL). It is intended to allow stabilization of plan choices
that the user believes to be good, as well as experimentation with plans that
the planner believes to be non-optimal.
</para>
@@ -18,9 +18,9 @@
<para>
Note that, since the planner often makes good decisions, overriding its
judgement can easily backfire. For example, if the distribution of the
- underlying data changes, the planner normally has the option to adjust the
- plan in an attempt to preserve good performance. If the plan advice prevents
- this, a very poor plan may be chosen. It is important to use plan advice
+ underlying data changes, the planner will make different decisions in order to
+ preserve good performance with the new data. If the plan advice overrides
+ these new decisions, a very poor plan may be chosen. It is important to use plan advice
only when the risks of constraining the planner's choices are outweighed by
the benefits.
</para>