Re: pg_plan_advice

David G. Johnston Fri, 27 Feb 2026 17:17:05 -0800

On Fri, Feb 27, 2026 at 3:46 PM Robert Haas <[email protected]> wrote:


> On Thu, Feb 26, 2026 at 8:55 AM Robert Haas <[email protected]> wrote:
> > Thanks, Alex, for the review.
>
> Here's v18. In addition to fixing the problems pointed out by Alex,
> there are a couple of significant changes in this version.
>
>
I have a mind to walk through the readmes and sgmls but its going to be in
chunks.  Here's one for the readme for pg_plan_advice with a couple of
preliminary sgml changes.

David J.

diff --git a/contrib/pg_plan_advice/README b/contrib/pg_plan_advice/README
index 0b888fd82f2..9178e2e8c3b 100644
--- a/contrib/pg_plan_advice/README
+++ b/contrib/pg_plan_advice/README
@@ -26,6 +26,8 @@ Advice tags can also be applied to groups of relations; for example,
 relation identifier "baz" as well as to the 2-item list containing
 "bletch" and "quux".
 
+XXX: probably should remove "you", the document seems to prefer "the user"
+
 Critically, this module knows both how to generate plan advice from an
 already-existing plan, and also how to enforce it during future planning
 cycles. Everything it does is intended to be "round-trip safe": if you
@@ -37,16 +39,17 @@ led to it being generated in the first place. Note that there is no
 intention that these guarantees hold in the presence of intervening DDL;
 e.g. if you change the properties of a function so that a subquery is no
 longer inlined, or if you drop an index named in the plan advice, the advice
-isn't going to work any more. That's expected.
+isn't going to work any more. That's expected. It should be resilient to
+changes in the statistics, including any CREATE STATISTICS related changes.
 
 This module aims to force the planner to follow any provided advice without
-regard to whether it is appears to be good advice or bad advice.  If the
-user provides bad advice, whether derived from a previously-generated plan
-or manually written, they may get a bad plan. We regard this as user error,
+regard to whether it appears to be good advice or bad advice. If you
+provide bad advice, whether derived from a previously-generated plan
+or manually written, you may get a bad plan. We regard this as user error,
 not a defect in this module. It seems likely that applying advice
-judiciously and only when truly required to avoid problems will be a more
-successful strategy than applying it with a broad brush, but users are free
-to experiment with whatever strategies they think best.
+conservatively, only when truly required to avoid problems, will be a more
+successful strategy than applying it with a broad brush, but you are free
+to experiment with whatever strategies you think best.
 
 Relation Identifiers
 ====================
@@ -54,7 +57,7 @@ Relation Identifiers
 Uniquely identifying the part of a query to which a certain piece of
 advice applies is harder than it sounds. Our basic approach is to use
 relation aliases as a starting point, and then disambiguate. There are
-three ways that same relation alias can occur multiple times:
+three ways that the same relation alias can occur multiple times:
 
 1. It can appear in more than one subquery.
 
@@ -75,13 +78,13 @@ partitioned tables.  When the generated occurrence number is 1, we omit
 the occurrence number. The partition schema and partition name are included
 only for children of partitioned tables. In generated advice, the
 partition_schema is always included whenever there is a partition_name,
-but user-written advice may mention the name and omit the schema. The
-plan_name is omitted for the top-level PlannerInfo.
+but user-written advice may omit the schema. The plan_name is omitted
+for the top-level PlannerInfo.
 
 Scan Advice
 ===========
 
-For many types of scan, no advice is generated or possible; for instance,
+For many scan types, no advice is generated or possible; for instance,
 a subquery is always scanned using a subquery scan. While that scan may be
 elided via setrefs processing, this doesn't change the fact that only one
 basic approach exists. Hence, scan advice applies mostly to relations, which
@@ -94,8 +97,8 @@ which, in effect, is a degenerate scan of every relation in the collapsed
 portion of the join tree. Similarly, it's possible to inject a custom scan
 in such a way that it replaces an entire join. If we ever emit advice
 for these cases, it would target sets of relation identifiers surrounded
-by parentheses, e.g. SOME_SORT_OF_SCAN(foo (bar baz)) would mean that the
-the given scan type would be used for foo as a single relation and also the
+by parentheses, e.g., SOME_SORT_OF_SCAN(foo (bar baz)) would mean that the
+given scan type would be used for foo as a single relation and also the
 combination of bar and baz as a join product. We have no such cases at
 present.
 
@@ -115,14 +118,14 @@ Join Order Advice
 The JOIN_ORDER tag specifies the order in which several tables that are
 part of the same join problem should be joined. Each subquery (except for
 those that are inlined) is a separate join problem. Within a subquery,
-partitionwise joins can create additional, separate join problems. Hence,
+partitionwise joins can create independent join problems. Hence,
 queries involving partitionwise joins may use JOIN_ORDER() many times.
 
 We take the canonical join structure to be an outer-deep tree, so
 JOIN_ORDER(t1 t2 t3) says that t1 is the driving table and should be joined
 first to t2 and then to t3. If the join problem involves additional tables,
 they can be joined in any order after the join between t1, t2, and t3 has
-been constructured. Generated join advice always mentions all tables
+been constructed. Generated join advice always mentions all tables
 in the join problem, but manually written join advice need not do so.
 
 For trees which are not outer-deep, parentheses can be used. For example,
@@ -135,17 +138,18 @@ For example, if t2 and t3 are being scanned by a single custom scan or foreign
 scan, or if a partitionwise join is being performed between those tables, then
 it's impossible to say that t2 is the outer table and t3 is the inner table,
 or the other way around; it's just undefined. In such cases, we generate
-join advice that uses curly braces, intending to indicate a lack of ordering:
+join advice that uses curly braces, indicating an absence of ordering:
 JOIN_ORDER(t1 {t2 t3}) says that the uppermost join should have t1 on the outer
 side and some kind of join between t2 and t3 on the inner side, but without
-saying how that join must be performed or anything about which relation should
-appear on which side of the join, or even whether this kind of join has sides.
+saying how that join must be performed (including which relation should
+appear on which side of the join, or even whether this kind of join has sides).
 
 Join Strategy Advice
 ====================
 
 Tags such as NESTED_LOOP_PLAIN specify the method that should be used to
-perform a certain join. More specifically, NESTED_LOOP_PLAIN(x (y z)) says
+perform a certain join - with the target appearing directly on the inner side
+of the join list first. Thus, NESTED_LOOP_PLAIN(x (y z)) says
 that the plan should put the relation whose identifier is "x" on the inner
 side of a plain nested loop (one without materialization or memoization)
 and that it should also put a join between the relation whose identifier is
@@ -157,22 +161,21 @@ table in the join problem.
 Considering that we have both join order advice and join strategy advice,
 it might seem natural to say that NESTED_LOOP_PLAIN(x) should be redefined
 to mean that x should appear by itself on one side or the other of a nested
-loop, rather than specifically on the inner side, but this definition appears
-useless in practice. It gives the planner too much freedom to do things that
+loop, rather than specifically on the inner side, but this definition is
+problematic in practice. It gives the planner too much freedom to do things that
 bear little resemblance to what the user probably had in mind. This makes
 only a limited amount of practical difference in the case of a merge join or
 unparameterized nested loop, but for a parameterized nested loop or a hash
-join, the two sides are treated very differently and saying that a certain
+join, the two sides are treated very differently, and saying that a certain
 relation should be involved in one of those operations without saying which
 role it should take isn't saying much.
 
-This choice of definition implies that join strategy advice also imposes some
+This definition compels join strategy advice to also impose
 join order constraints. For example, given a join between foo and bar,
 HASH_JOIN(bar) implies that foo is the driving table. Otherwise, it would
-be impossible to put bar beneath the inner side of a Hash Join.
-
-Note that, given this definition, it's reasonable to consider deleting the
-join order advice but applying the join strategy advice. For example,
+be impossible to put bar beneath the inner side of a Hash Join. Thus it's
+reasonable to consider deleting the
+join order advice and only apply the join strategy advice. For example,
 consider a star schema with tables fact, dim1, dim2, dim3, dim4, and dim5.
 The automatically generated advice might specify JOIN_ORDER(fact dim1 dim3
 dim4 dim2 dim5) HASH_JOIN(dim2 dim4) NESTED_LOOP_PLAIN(dim1 dim3 dim5).
@@ -188,11 +191,9 @@ An essential guiding principle is that no inference may made on the basis
 of the absence of advice. The user is entitled to remove any portion of the
 generated advice which they deem unsuitable or counterproductive and the
 result should only be to increase the flexibility afforded to the planner.
-This means that if advice can say that a certain optimization or technique
-should be used, it should also be able to say that the optimization or
-technique should not be used. We should never assume that the absence of an
-instruction to do a certain thing means that it should not be done; all
-instructions must be explicit.
+In other words, advice tags must define whether they encourage or discourage
+certain optimizations or techniques. (NO_GATHER is an example of the latter.
+There is no generic "NOT" syntax, e.g., NOT(HASH_JOIN(dim2 dim4).))
 
 Semijoin Uniqueness
 ===================
@@ -236,7 +237,7 @@ For example, a user who specifies HASH_JOIN((foo bar)) is explicitly saying
 that there should be a hash join with exactly foo and bar on the outer
 side of it, but that also implies that foo and bar must be joined to
 each other before either of them is joined to anything else. Otherwise,
-the join the user is attempting to constraint won't actually occur in the
+the join the user is attempting to constrain won't actually occur in the
 query, which ends up looking like the system has just decided to ignore
 the advice altogether.
 
@@ -253,7 +254,7 @@ rather than later.
 
 We don't offer any control over estimates, only outcomes. It seems like a
 good idea to incorporate that ability at some future point, as pg_hint_plan
-does. However, since primary goal of the initial development work is to be
+does. However, since the primary goal of the initial development work is to be
 able to induce the planner to recreate a desired plan that worked well in
 the past, this has not been included in the initial development effort.
 
diff --git a/doc/src/sgml/pgplanadvice.sgml b/doc/src/sgml/pgplanadvice.sgml
index 817bafa21c6..c962e565a3a 100644
--- a/doc/src/sgml/pgplanadvice.sgml
+++ b/doc/src/sgml/pgplanadvice.sgml
@@ -10,7 +10,7 @@
  <para>
   The <filename>pg_plan_advice</filename> module allows key planner decisions
   to be described, reproduced, and altered using a special-purpose "plan
-  advice" mini-language. It is intended to allow stabilization of plan choices
+  advice" domain specific language (DSL). It is intended to allow stabilization of plan choices
   that the user believes to be good, as well as experimentation with plans that
   the planner believes to be non-optimal.
  </para>
@@ -18,9 +18,9 @@
  <para>
   Note that, since the planner often makes good decisions, overriding its
   judgement can easily backfire. For example, if the distribution of the
-  underlying data changes, the planner normally has the option to adjust the
-  plan in an attempt to preserve good performance. If the plan advice prevents
-  this, a very poor plan may be chosen. It is important to use plan advice
+  underlying data changes, the planner will make different decisions in order to
+  preserve good performance with the new data. If the plan advice overrides
+  these new decisions, a very poor plan may be chosen. It is important to use plan advice
   only when the risks of constraining the planner's choices are outweighed by
   the benefits.
  </para>

Re: pg_plan_advice

Reply via email to