On 09/05/2026 12:51, Andrei Lepikhov wrote:
> On 09/02/2026 23:34, Andrei Lepikhov wrote:
>> On 9/2/26 21:16, Tom Lane wrote:
>>> What I'm wondering about is that join_collapse_limit and
>>> from_collapse_limit were invented more than two decades ago, but
>>> we've not touched their default values since then. Machines are a
>>> lot faster since 2004, and we've probably achieved some net speedups
>>> in the planner logic as well. Could we alleviate this concern by
>>> raising those defaults, and if so, what are reasonable values in 2026?
>>
>> As I see, people never use the default settings now. The case that triggered
>> this topic could work well with a join collapse limit around 40 joins (GEQO
>> started at 14). But a specific setting always depends on how much time people
>> want to spend on planning. So, I don't think a change of default settings is
>> needed.
> After looking into more cases, I realized the main issue is actually
> something else.
> Looking at pull_up_sublinks_qual_recurse, I see that we have information to
> skip
> extra work if the join_collapse_limit is high enough.
>
And here is the patch.
The main idea is that after a transformation (EXISTS or ANY), we should not just
initialise the larg of the JoinExpr with the incoming join tree. Instead, we
need to look for minimal subtree. Since we already have relids referenced in the
subselect, we can traverse this subtree to find the smallest safe join tree that
contains these relids. Then, we initialise the larg of our SEMI JOIN JoinExpr
with this subtree and update the larg of the upper JoinExpr to point to our
SemiJoin.
Such behaviour is quite close to the not-pulled-up variant when a subselect is
used as a filter and pushed down to the minimum level needed to evaluate this
SubLink. So, delay the influence of join_collapse_limit as much as possible.
The patch provided contains implementation of the idea - the key function is
insert_pulled_up_sublink_join. It is a little polished with Claude AI - it added
tests to detect any issues during rebase onto master and documentation.
I'll CC Richard, since he often challenges the transformation machinery and may
be interested in helping to stabilise the user experience after the upgrade.
--
regards, Andrei Lepikhov,
pgEdge
From 402ded29a96861bc03bae06a3c2b5db832eea3db Mon Sep 17 00:00:00 2001
From: Claude <[email protected]>
Date: Wed, 6 May 2026 11:42:58 +0000
Subject: [PATCH v0] Push pulled-up SEMI/ANTI joins next to their referenced
rels
When pull_up_sublinks converts a WHERE-level EXISTS/IN sublink into a
SEMI/ANTI JoinExpr, today's code puts the new join in-place. Once
join_collapse_limit prevents
deconstruct_jointree from flattening that subtree, the SEMI/ANTI is
stranded above many siblings, forcing the existence-test rel to be
joined last regardless of selectivity.
Splice the pulled-up join at the deepest slot whose subtree already
contains every relid referenced by the new join's outer-side quals.
The descent only enters non-nullable sides of enclosing outer joins,
matching the available_rels invariant that pull_up_sublinks already
relies on, so it never pushes a SEMI filter into the nullable side
of a LEFT/RIGHT/FULL join.
Adjusted regression expected outputs and added a focused subselect.sql
case exercising the proposal's archetype (T1..Tn + EXISTS(Tx
referencing T1) under a low join_collapse_limit), plus a negative
case verifying we do not push past a LEFT JOIN.
---
.../postgres_fdw/expected/postgres_fdw.out | 2 +-
doc/src/sgml/config.sgml | 33 ++
src/backend/optimizer/README | 48 +++
src/backend/optimizer/prep/prepjointree.c | 353 +++++++++++++++++-
src/backend/utils/misc/guc_parameters.dat | 7 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/optimizer/planmain.h | 1 +
src/test/regress/expected/partition_join.out | 36 +-
src/test/regress/expected/subselect.out | 135 +++++++
src/test/regress/sql/subselect.sql | 76 ++++
10 files changed, 653 insertions(+), 39 deletions(-)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out
b/contrib/postgres_fdw/expected/postgres_fdw.out
index aaffcf31271..183aabd5b99 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -4814,7 +4814,7 @@ SELECT ft2.*, ft4.* FROM ft2 INNER JOIN ft4 ON ft2.c2 =
ft4.c1
Foreign Scan
Output: ft2.c1, ft2.c2, ft2.c3, ft2.c4, ft2.c5, ft2.c6, ft2.c7, ft2.c8,
ft4.c1, ft4.c2, ft4.c3
Relations: ((public.ft2) INNER JOIN (public.ft4)) SEMI JOIN (public.ft5)
- Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7,
r1.c8, r2.c1, r2.c2, r2.c3 FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 3" r2 ON
(((r1.c2 = r2.c1)) AND ((r1."C 1" > 900)))) WHERE EXISTS (SELECT NULL FROM "S
1"."T 4" r4 WHERE ((r1.c2 = r4.c1))) ORDER BY r1."C 1" ASC NULLS LAST
+ Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7,
r1.c8, r2.c1, r2.c2, r2.c3 FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 3" r2 ON
(((r1.c2 = r2.c1)) AND ((r1."C 1" > 900)))) WHERE EXISTS (SELECT NULL FROM "S
1"."T 4" r4 WHERE ((r2.c1 = r4.c1))) ORDER BY r1."C 1" ASC NULLS LAST
(4 rows)
SELECT ft2.*, ft4.* FROM ft2 INNER JOIN ft4 ON ft2.c2 = ft4.c1
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 73cc0412330..a34e078c7ca 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6124,6 +6124,29 @@ ANY <replaceable
class="parameter">num_sync</replaceable> ( <replaceable class="
</listitem>
</varlistentry>
+ <varlistentry id="guc-enable-pulled-up-semi-relocation"
xreflabel="enable_pulled_up_semi_relocation">
+ <term><varname>enable_pulled_up_semi_relocation</varname>
(<type>boolean</type>)
+ <indexterm>
+ <primary><varname>enable_pulled_up_semi_relocation</varname>
configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ When the planner pulls up an <literal>EXISTS</literal> or
+ <literal>IN</literal> sublink in <literal>WHERE</literal> into a
+ SEMI- or ANTI-join, this setting controls whether the new join is
+ spliced into the join tree adjacent to the relation it references
+ (<literal>on</literal>, the default) or wrapped around the entire
+ scope at the original placement (<literal>off</literal>). The
+ deeper placement helps when <xref linkend="guc-join-collapse-limit"/>
+ prevents the surrounding join list from being flattened, but adds a
+ small amount of planning work proportional to the join tree size.
+ Disabling it provides a runtime fallback in the unlikely event that
+ the deeper placement causes a plan regression on a particular query.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-enable-self-join-elimination"
xreflabel="enable_self_join_elimination">
<term><varname>enable_self_join_elimination</varname>
(<type>boolean</type>)
<indexterm>
@@ -6855,6 +6878,16 @@ SELECT * FROM parent WHERE key = 2400;
may trigger use of the GEQO planner, resulting in non-optimal
plans. See <xref linkend="runtime-config-query-geqo"/>.
</para>
+
+ <para>
+ This limit governs only joins that the user wrote explicitly;
+ SEMI- and ANTI-joins synthesised from pulled-up
<literal>EXISTS</literal>
+ or <literal>IN</literal> sublinks in <literal>WHERE</literal> are
+ placed at the deepest legal point in the join tree whose subtree
+ already covers every relation the new join's qualifications
+ reference. This behaviour can be disabled with
+ <xref linkend="guc-enable-pulled-up-semi-relocation"/>.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 6c35baceedb..aca2cd70699 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -298,6 +298,54 @@ search has runtime exponential in the number of FROM-items
considered.
Therefore, we don't merge FROM-lists if the result would have too many
FROM-items in one list.
+Pulling Up Sublinks
+-------------------
+
+A separate pass, pull_up_sublinks() in prepjointree.c, runs before
+pull_up_subqueries() and converts WHERE-level EXISTS / IN (ANY) sublinks
+into SEMI/ANTI JoinExprs whenever the surrounding context permits. When
+it does so, the new JoinExpr is inserted into the jointree at the deepest
+jtlink beneath the current one whose subtree already contains every
+relation the new join's outer-side qualifications reference -- both the
+testexpr / lifted EXISTS qual and any LATERAL references hidden inside
+the rarg subquery. Inserting "next to the referenced rel" lets the SEMI
+survive join_collapse_limit flattening as a tightly-coupled pair with
+the rel it references, so the planner can consider join orders like
+(T1 SEMI Tx) JOIN T2 JOIN ... for queries shaped like
+
+ SELECT * FROM T1 JOIN T2 ON ... JOIN ... JOIN Tn ON ...
+ WHERE EXISTS (SELECT 1 FROM Tx WHERE clause(Tx, T1));
+
+instead of always inserting the SEMI at the outermost legal point and
+forcing Tx to be joined last whenever the join list exceeds
+join_collapse_limit.
+
+The descent only enters non-nullable sides of enclosing outer joins
+(LEFT.larg, RIGHT.rarg, both sides of INNER, larg-only of any
+already-inserted SEMI/ANTI in the path), mirroring the available_rels
+discipline pull_up_sublinks already maintains for the testexpr. Every
+rel in the chosen subtree is therefore non-nullable in the surrounding
+qual's vantage point, so pre-filtering with the SEMI cannot drop outer
+rows that the original WHERE clause would have preserved. FULL JOINs
+and the rarg of any LEFT/RIGHT JOIN are never descended into.
+
+Cost: the descent calls get_relids_in_jointree() at each level it inspects,
+which is O(N) per call where N is the size of the visited subtree. In the
+worst case (a left-deep join tree of N rels), this sums to O(N^2) bitmapset
+operations per pulled-up sublink. For typical queries (N <= 30) the cost
+is well under a hundred microseconds and dominated by the surrounding
+planner work; for very large generated queries it can become measurable.
+The runtime kill switch enable_pulled_up_semi_relocation falls back to the
+historical "wrap at the top" placement and bypasses the descent entirely.
+
+Extensions that walk parse->jointree between pull_up_sublinks and
+deconstruct_jointree -- notably planner-shaping extensions like
+pg_hint_plan and AQO that observe or override join order from the
+parse-tree shape -- will see this deeper insertion topology rather than
+the historical "always at the top" placement. Extensions that cache
+join-order decisions across query rewrites should treat the topology as
+freshly computed per query, not stable across pull-up passes.
+
Vars and PlaceHolderVars
------------------------
diff --git a/src/backend/optimizer/prep/prepjointree.c
b/src/backend/optimizer/prep/prepjointree.c
index 4424fdbe906..a9761d7e148 100644
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -37,6 +37,7 @@
#include "optimizer/optimizer.h"
#include "optimizer/placeholder.h"
#include "optimizer/plancat.h"
+#include "optimizer/planmain.h"
#include "optimizer/prep.h"
#include "optimizer/subselect.h"
#include "optimizer/tlist.h"
@@ -105,6 +106,9 @@ typedef struct reduce_outer_joins_partial_state
Relids unreduced_side; /* relids in its still-nullable side */
} reduce_outer_joins_partial_state;
+/* GUC: relocate pulled-up SEMI/ANTI joins next to their referenced rels */
+bool enable_pulled_up_semi_relocation = true;
+
static Query *expand_virtual_generated_columns(PlannerInfo *root, Query *parse,
RangeTblEntry *rte, int rt_index,
Relation relation);
@@ -113,6 +117,10 @@ static Node *pull_up_sublinks_jointree_recurse(PlannerInfo
*root, Node *jtnode,
static Node *pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,
Node **jtlink1, Relids available_rels1,
Node **jtlink2, Relids available_rels2);
+static Relids insert_pulled_up_sublink_join(PlannerInfo *root, JoinExpr *j,
+
Node **jtlink, Relids available_rels);
+static Relids collect_sublink_rarg_lateral_rels(PlannerInfo *root, Node *rarg);
+static Node **find_sublink_jtlink(Node **slot, Relids target_rels);
static Node *pull_up_subqueries_recurse(PlannerInfo *root, Node *jtnode,
JoinExpr *lowest_outer_join,
AppendRelInfo *containing_appendrel);
@@ -835,6 +843,303 @@ pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node
*jtnode,
return jtnode;
}
+/*
+ * Insert a SEMI/ANTI JoinExpr produced by convert_*_sublink_to_join into
+ * the jointree, picking the deepest legal jtlink under *jtlink whose
+ * subtree already contains every relation the new join's qualifications
+ * reference on the outer side -- including any LATERAL references hidden
+ * inside the pulled-up rarg subquery. Falls back to inserting at *jtlink
+ * itself when no deeper jtlink is safe; that matches the behaviour
+ * predating this routine and is therefore always correct.
+ *
+ * The descent only enters non-nullable sides of enclosing outer joins,
+ * mirroring the "available_rels" discipline used elsewhere in
+ * pull_up_sublinks: every rel in the chosen subtree is non-nullable in the
+ * surrounding qual's scope, so pre-filtering with the SEMI cannot drop
+ * outer rows that the original WHERE clause would have preserved.
+ *
+ * Returns the relids of the (post-insertion) subtree now occupying j->larg.
+ * Callers that recursively process j->quals must use that set in place of
+ * the caller's wider available_rels, since after a deeper insertion j->larg
+ * may cover only a strict subset of the original scope. The four NOT-
+ * branch call sites discard this return value: they recurse only on
+ * j->rarg, never on j->larg, so a narrower scope cannot affect them.
+ */
+static Relids
+insert_pulled_up_sublink_join(PlannerInfo *root, JoinExpr *j,
+ Node **jtlink, Relids
available_rels)
+{
+ Relids quals_varnos;
+ Relids lateral_rels;
+ Relids target_rels;
+ Node **target_jtlink;
+
+ /*
+ * Operational kill switch. When disabled, fall back to the historical
+ * "splice in place" behaviour: the SEMI/ANTI is wrapped around the
entire
+ * subtree at *jtlink, which matches what pull_up_sublinks did before
this
+ * routine existed. Useful for diagnosing planning-time regressions or
+ * plan-shape changes attributable to the deeper insertion topology.
+ *
+ * Return available_rels rather than recomputing get_relids_in_jointree
on
+ * j->larg. Since j->larg is now the entire subtree the caller knew as
+ * *jtlink, its relids match available_rels by construction -- and the
+ * historical code path passed available_rels unchanged into the
recursive
+ * qual scan, so reusing it here is strictly equivalent. Skipping the
walk
+ * also avoids a redundant O(N) bitmapset allocation per pulled-up
sublink
+ * on the kill-switch path.
+ */
+ if (!enable_pulled_up_semi_relocation)
+ {
+ j->larg = *jtlink;
+ *jtlink = (Node *) j;
+ return available_rels;
+ }
+
+ /*
+ * Outer-side relids the new SEMI/ANTI references. We must consider
both
+ * j->quals (the explicit testexpr / lifted EXISTS qual) and any LATERAL
+ * references buried inside the rarg subquery; both anchor the SEMI to
+ * outer rels and the chosen insertion jtlink must cover every one of
them.
+ * Intersecting with available_rels drops the rarg's freshly-added RTE
+ * and any varnos from outer query levels.
+ *
+ * lateral_rels is NULL in the overwhelmingly common case (no LATERAL
ref
+ * in the rarg). Splitting the two cases lets us call bms_intersect
+ * directly there instead of allocating a transient bms_union result.
+ */
+ quals_varnos = pull_varnos(root, j->quals);
+ lateral_rels = collect_sublink_rarg_lateral_rels(root, j->rarg);
+ if (lateral_rels == NULL)
+ target_rels = bms_intersect(quals_varnos, available_rels);
+ else
+ {
+ target_rels = bms_union(quals_varnos, lateral_rels);
+ target_rels = bms_int_members(target_rels, available_rels);
+ }
+
+ /*
+ * After the intersection, target_rels is a subset of available_rels by
+ * construction. This invariant is load-bearing for the descent: the
+ * find_sublink_jtlink logic only descends through non-nullable
+ * sides of enclosing outer joins, but it relies on the caller having
+ * already proven that every member of target_rels is itself
+ * non-nullable in the surrounding scope -- which is exactly what
+ * available_rels guarantees.
+ */
+ Assert(bms_is_subset(target_rels, available_rels));
+
+ /*
+ * target_rels cannot be empty here. convert_ANY_sublink_to_join and
+ * convert_EXISTS_sublink_to_join both require their respective
outer-rel
+ * reference set (upper_varnos / level-1 vars of the lifted WHERE) to be
+ * non-empty and a subset of available_rels before they will pull up a
+ * sublink, so by construction quals_varnos has at least one member in
+ * common with available_rels. Assert that here -- if a future change
+ * loosens those gates, the descent below would silently degrade to the
+ * top-of-scope placement and we want to know about it.
+ */
+ Assert(!bms_is_empty(target_rels));
+
+ target_jtlink = find_sublink_jtlink(jtlink, target_rels);
+
+ j->larg = *target_jtlink;
+ *target_jtlink = (Node *) j;
+
+ /*
+ * Post-condition: the chosen subtree, now occupying j->larg, must cover
+ * every relid the SEMI/ANTI's quals reference within the surrounding
+ * scope. A descent that returned the wrong slot would silently produce
+ * a join whose qual contains free vars; catch that here under cassert.
+ */
+#ifdef USE_ASSERT_CHECKING
+ {
+ Relids larg_check = get_relids_in_jointree(j->larg,
true, false);
+
+ Assert(bms_is_subset(target_rels, larg_check));
+ bms_free(larg_check);
+ }
+#endif
+
+ bms_free(quals_varnos);
+ bms_free(lateral_rels);
+ bms_free(target_rels);
+
+ return get_relids_in_jointree(j->larg, true, false);
+}
+
+/*
+ * Find the outer-query relids that a freshly-pulled-up sublink's rarg
+ * references via LATERAL.
+ *
+ * Only ANY/IN pull-up can leave outer-query references inside the rarg:
+ * convert_ANY_sublink_to_join wraps the subselect in a subquery RTE that
+ * inherits any level-1 Vars referencing outer rels and is marked lateral.
+ * EXISTS pull-up, by contrast, hoists the subselect's whereClause into
+ * j->quals and rewrites its level-1 Vars to level-0; nothing of interest
+ * is left in the rarg, so we early-return NULL for non-RangeTblRef shapes.
+ *
+ * Returns NULL when the rarg has no relevant lateral references. The
+ * caller bms_free's the result.
+ */
+static Relids
+collect_sublink_rarg_lateral_rels(PlannerInfo *root, Node *rarg)
+{
+ RangeTblRef *rtr;
+ RangeTblEntry *rte;
+
+ if (rarg == NULL || !IsA(rarg, RangeTblRef))
+ return NULL;
+
+ rtr = (RangeTblRef *) rarg;
+ rte = rt_fetch(rtr->rtindex, root->parse->rtable);
+
+ if (rte->rtekind != RTE_SUBQUERY || !rte->lateral)
+ return NULL;
+
+ /*
+ * NULL root is intentional: pull_up_sublinks runs before
+ * pull_up_subqueries, so no PlaceHolderVars exist yet and pull_varnos
+ * machinery has nothing to consult root for.
+ */
+ return pull_varnos_of_level(NULL, (Node *) rte->subquery, 1);
+}
+
+/*
+ * Walk down from *slot looking for the deepest position whose subtree
+ * relids cover target_rels, descending only through non-nullable sides.
+ *
+ * Returns a pointer to the slot we should overwrite; this is *slot itself
+ * if no deeper slot is safe.
+ */
+static Node **
+find_sublink_jtlink(Node **slot, Relids target_rels)
+{
+ Node *n;
+
+ Assert(slot != NULL);
+ check_stack_depth();
+
+ n = *slot;
+
+ if (n == NULL)
+ return slot;
+
+ if (IsA(n, RangeTblRef))
+ return slot; /* cannot go deeper */
+
+ if (IsA(n, FromExpr))
+ {
+ FromExpr *f = (FromExpr *) n;
+ ListCell *lc;
+
+ /*
+ * If exactly one fromlist child covers target_rels, descend
into it.
+ * Otherwise target straddles siblings -- insert at this
FromExpr.
+ * Sibling subtrees in a fromlist are disjoint, so at most one
child
+ * can cover target_rels; "first match wins" is therefore
unique.
+ */
+ foreach(lc, f->fromlist)
+ {
+ Node *child = (Node *) lfirst(lc);
+ Relids child_rels =
get_relids_in_jointree(child, true, false);
+
+ if (bms_is_subset(target_rels, child_rels))
+ {
+ bms_free(child_rels);
+
+ /*
+ * Hand the recursive call the address of the
list cell's
+ * payload so it can rewrite the slot in place.
Safe because
+ * find_sublink_jtlink never mutates fromlist
below us.
+ */
+ return find_sublink_jtlink((Node **)
&lfirst(lc),
+
target_rels);
+ }
+ bms_free(child_rels);
+ }
+ return slot;
+ }
+
+ if (IsA(n, JoinExpr))
+ {
+ JoinExpr *j = (JoinExpr *) n;
+ Relids side_rels;
+
+ switch (j->jointype)
+ {
+ case JOIN_INNER:
+ /* Both sides non-nullable -- try larg, then
rarg. */
+ side_rels = get_relids_in_jointree(j->larg,
true, false);
+ if (bms_is_subset(target_rels, side_rels))
+ {
+ bms_free(side_rels);
+ return find_sublink_jtlink(&j->larg,
target_rels);
+ }
+ bms_free(side_rels);
+
+ side_rels = get_relids_in_jointree(j->rarg,
true, false);
+ if (bms_is_subset(target_rels, side_rels))
+ {
+ bms_free(side_rels);
+ return find_sublink_jtlink(&j->rarg,
target_rels);
+ }
+ bms_free(side_rels);
+ return slot; /* spans both sides */
+
+ case JOIN_LEFT:
+ /* Only larg is non-nullable in the surrounding
scope. */
+ side_rels = get_relids_in_jointree(j->larg,
true, false);
+ if (bms_is_subset(target_rels, side_rels))
+ {
+ bms_free(side_rels);
+ return find_sublink_jtlink(&j->larg,
target_rels);
+ }
+ bms_free(side_rels);
+ return slot;
+
+ case JOIN_RIGHT:
+ side_rels = get_relids_in_jointree(j->rarg,
true, false);
+ if (bms_is_subset(target_rels, side_rels))
+ {
+ bms_free(side_rels);
+ return find_sublink_jtlink(&j->rarg,
target_rels);
+ }
+ bms_free(side_rels);
+ return slot;
+
+ case JOIN_FULL:
+ /* Both sides nullable -- never descend. */
+ return slot;
+
+ case JOIN_SEMI:
+ case JOIN_ANTI:
+ /*
+ * SEMI/ANTI joins do not normally appear in
the user's
+ * jointree, but a previous iteration of this
same recursion
+ * (e.g. a sibling AND-conjunct sublink already
pulled up and
+ * inserted by insert_pulled_up_sublink_join)
may have planted
+ * one in our path. We may only descend into
the larg side;
+ * the rarg is the existence-test side and
pre-filtering it
+ * would change semantics.
+ */
+ side_rels = get_relids_in_jointree(j->larg,
true, false);
+ if (bms_is_subset(target_rels, side_rels))
+ {
+ bms_free(side_rels);
+ return find_sublink_jtlink(&j->larg,
target_rels);
+ }
+ bms_free(side_rels);
+ return slot;
+
+ default:
+ return slot;
+ }
+ }
+
+ return slot;
+}
+
/*
* Recurse through top-level qual nodes for pull_up_sublinks()
*
@@ -882,9 +1187,11 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node
*node,
if ((j = convert_ANY_sublink_to_join(root, sublink,
false,
available_rels1)) != NULL)
{
+ Relids larg_rels;
+
/* Yes; insert the new join node into the join
tree */
- j->larg = *jtlink1;
- *jtlink1 = (Node *) j;
+ larg_rels = insert_pulled_up_sublink_join(root,
j, jtlink1,
+
available_rels1);
/* Recursively process pulled-up jointree nodes
*/
j->rarg =
pull_up_sublinks_jointree_recurse(root,
j->rarg,
@@ -898,7 +1205,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node
*node,
j->quals = pull_up_sublinks_qual_recurse(root,
j->quals,
&j->larg,
-
available_rels1,
+
larg_rels,
&j->rarg,
child_rels);
/* Return NULL representing constant TRUE */
@@ -908,9 +1215,11 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node
*node,
(j = convert_ANY_sublink_to_join(root, sublink,
false,
available_rels2)) != NULL)
{
+ Relids larg_rels;
+
/* Yes; insert the new join node into the join
tree */
- j->larg = *jtlink2;
- *jtlink2 = (Node *) j;
+ larg_rels = insert_pulled_up_sublink_join(root,
j, jtlink2,
+
available_rels2);
/* Recursively process pulled-up jointree nodes
*/
j->rarg =
pull_up_sublinks_jointree_recurse(root,
j->rarg,
@@ -924,7 +1233,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node
*node,
j->quals = pull_up_sublinks_qual_recurse(root,
j->quals,
&j->larg,
-
available_rels2,
+
larg_rels,
&j->rarg,
child_rels);
/* Return NULL representing constant TRUE */
@@ -936,9 +1245,11 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node
*node,
if ((j = convert_EXISTS_sublink_to_join(root, sublink,
false,
available_rels1)) != NULL)
{
+ Relids larg_rels;
+
/* Yes; insert the new join node into the join
tree */
- j->larg = *jtlink1;
- *jtlink1 = (Node *) j;
+ larg_rels = insert_pulled_up_sublink_join(root,
j, jtlink1,
+
available_rels1);
/* Recursively process pulled-up jointree nodes
*/
j->rarg =
pull_up_sublinks_jointree_recurse(root,
j->rarg,
@@ -952,7 +1263,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node
*node,
j->quals = pull_up_sublinks_qual_recurse(root,
j->quals,
&j->larg,
-
available_rels1,
+
larg_rels,
&j->rarg,
child_rels);
/* Return NULL representing constant TRUE */
@@ -962,9 +1273,11 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node
*node,
(j = convert_EXISTS_sublink_to_join(root,
sublink, false,
available_rels2)) != NULL)
{
+ Relids larg_rels;
+
/* Yes; insert the new join node into the join
tree */
- j->larg = *jtlink2;
- *jtlink2 = (Node *) j;
+ larg_rels = insert_pulled_up_sublink_join(root,
j, jtlink2,
+
available_rels2);
/* Recursively process pulled-up jointree nodes
*/
j->rarg =
pull_up_sublinks_jointree_recurse(root,
j->rarg,
@@ -978,7 +1291,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node
*node,
j->quals = pull_up_sublinks_qual_recurse(root,
j->quals,
&j->larg,
-
available_rels2,
+
larg_rels,
&j->rarg,
child_rels);
/* Return NULL representing constant TRUE */
@@ -1003,8 +1316,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node
*node,
available_rels1)) != NULL)
{
/* Yes; insert the new join node into
the join tree */
- j->larg = *jtlink1;
- *jtlink1 = (Node *) j;
+ (void)
insert_pulled_up_sublink_join(root, j, jtlink1,
+
available_rels1);
/* Recursively process pulled-up
jointree nodes */
j->rarg =
pull_up_sublinks_jointree_recurse(root,
j->rarg,
@@ -1029,8 +1342,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node
*node,
available_rels2)) != NULL)
{
/* Yes; insert the new join node into
the join tree */
- j->larg = *jtlink2;
- *jtlink2 = (Node *) j;
+ (void)
insert_pulled_up_sublink_join(root, j, jtlink2,
+
available_rels2);
/* Recursively process pulled-up
jointree nodes */
j->rarg =
pull_up_sublinks_jointree_recurse(root,
j->rarg,
@@ -1057,8 +1370,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node
*node,
available_rels1)) != NULL)
{
/* Yes; insert the new join node into
the join tree */
- j->larg = *jtlink1;
- *jtlink1 = (Node *) j;
+ (void)
insert_pulled_up_sublink_join(root, j, jtlink1,
+
available_rels1);
/* Recursively process pulled-up
jointree nodes */
j->rarg =
pull_up_sublinks_jointree_recurse(root,
j->rarg,
@@ -1083,8 +1396,8 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node
*node,
available_rels2)) != NULL)
{
/* Yes; insert the new join node into
the join tree */
- j->larg = *jtlink2;
- *jtlink2 = (Node *) j;
+ (void)
insert_pulled_up_sublink_join(root, j, jtlink2,
+
available_rels2);
/* Recursively process pulled-up
jointree nodes */
j->rarg =
pull_up_sublinks_jointree_recurse(root,
j->rarg,
diff --git a/src/backend/utils/misc/guc_parameters.dat
b/src/backend/utils/misc/guc_parameters.dat
index afaa058b046..c72d0ac9f3f 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -1013,6 +1013,13 @@
boot_val => 'true',
},
+{ name => 'enable_pulled_up_semi_relocation', type => 'bool', context =>
'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
+ short_desc => 'Enables relocation of pulled-up SEMI/ANTI joins next to their
referenced relations.',
+ flags => 'GUC_EXPLAIN',
+ variable => 'enable_pulled_up_semi_relocation',
+ boot_val => 'true',
+},
+
{ name => 'enable_self_join_elimination', type => 'bool', context =>
'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
short_desc => 'Enables removal of unique self-joins.',
flags => 'GUC_EXPLAIN',
diff --git a/src/backend/utils/misc/postgresql.conf.sample
b/src/backend/utils/misc/postgresql.conf.sample
index ac38cddaaf9..02dcc23bfab 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -446,6 +446,7 @@
#enable_tidscan = on
#enable_group_by_reordering = on
#enable_distinct_reordering = on
+#enable_pulled_up_semi_relocation = on
#enable_self_join_elimination = on
#enable_eager_aggregate = on
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 71c043a25e8..5aa6f307012 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -21,6 +21,7 @@
#define DEFAULT_CURSOR_TUPLE_FRACTION 0.1
extern PGDLLIMPORT double cursor_tuple_fraction;
extern PGDLLIMPORT bool enable_self_join_elimination;
+extern PGDLLIMPORT bool enable_pulled_up_semi_relocation;
/* query_planner callback to compute query_pathkeys */
typedef void (*query_pathkeys_callback) (PlannerInfo *root, void *extra);
diff --git a/src/test/regress/expected/partition_join.out
b/src/test/regress/expected/partition_join.out
index 38643d41fd7..5ac4ae266d0 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -723,33 +723,33 @@ SELECT * FROM prt1 t1 JOIN prt1 t2 ON t1.a = t2.a WHERE
t1.a IN (SELECT a FROM p
QUERY PLAN
--------------------------------------------------
Append
- -> Hash Semi Join
- Hash Cond: (t1_1.a = t3_1.a)
- -> Hash Join
- Hash Cond: (t1_1.a = t2_1.a)
+ -> Hash Join
+ Hash Cond: (t1_1.a = t2_1.a)
+ -> Hash Semi Join
+ Hash Cond: (t1_1.a = t3_1.a)
-> Seq Scan on prt1_p1 t1_1
-> Hash
- -> Seq Scan on prt1_p1 t2_1
+ -> Seq Scan on prt1_p1 t3_1
-> Hash
- -> Seq Scan on prt1_p1 t3_1
- -> Hash Semi Join
- Hash Cond: (t1_2.a = t3_2.a)
- -> Hash Join
- Hash Cond: (t1_2.a = t2_2.a)
+ -> Seq Scan on prt1_p1 t2_1
+ -> Hash Join
+ Hash Cond: (t1_2.a = t2_2.a)
+ -> Hash Semi Join
+ Hash Cond: (t1_2.a = t3_2.a)
-> Seq Scan on prt1_p2 t1_2
-> Hash
- -> Seq Scan on prt1_p2 t2_2
+ -> Seq Scan on prt1_p2 t3_2
-> Hash
- -> Seq Scan on prt1_p2 t3_2
- -> Hash Semi Join
- Hash Cond: (t1_3.a = t3_3.a)
- -> Hash Join
- Hash Cond: (t1_3.a = t2_3.a)
+ -> Seq Scan on prt1_p2 t2_2
+ -> Hash Join
+ Hash Cond: (t1_3.a = t2_3.a)
+ -> Hash Semi Join
+ Hash Cond: (t1_3.a = t3_3.a)
-> Seq Scan on prt1_p3 t1_3
-> Hash
- -> Seq Scan on prt1_p3 t2_3
+ -> Seq Scan on prt1_p3 t3_3
-> Hash
- -> Seq Scan on prt1_p3 t3_3
+ -> Seq Scan on prt1_p3 t2_3
(28 rows)
--
diff --git a/src/test/regress/expected/subselect.out
b/src/test/regress/expected/subselect.out
index a3778c23c34..68e728054eb 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -3785,3 +3785,138 @@ WHERE id NOT IN (SELECT id FROM notnull_notvalid_tab);
(0 rows)
ROLLBACK;
+--
+-- Pulled-up SEMI/ANTI joins should be grafted next to the rel they
+-- reference, so they are not stranded above join_collapse_limit when
+-- many sibling joins exist.
+--
+BEGIN;
+CREATE TEMP TABLE pull_t1 (a int, b int);
+CREATE TEMP TABLE pull_t2 (a int);
+CREATE TEMP TABLE pull_t3 (a int);
+CREATE TEMP TABLE pull_t4 (a int);
+CREATE TEMP TABLE pull_t5 (a int);
+CREATE TEMP TABLE pull_x (b int);
+INSERT INTO pull_t1 SELECT g, g FROM generate_series(1, 10) g;
+INSERT INTO pull_t2 SELECT g FROM generate_series(1, 10) g;
+INSERT INTO pull_t3 SELECT g FROM generate_series(1, 10) g;
+INSERT INTO pull_t4 SELECT g FROM generate_series(1, 10) g;
+INSERT INTO pull_t5 SELECT g FROM generate_series(1, 10) g;
+INSERT INTO pull_x SELECT g FROM generate_series(1, 10) g;
+ANALYZE pull_t1, pull_t2, pull_t3, pull_t4, pull_t5, pull_x;
+SET LOCAL join_collapse_limit = 2;
+SET LOCAL max_parallel_workers_per_gather = 0;
+-- Pin a single join method so the test asserts the SEMI/ANTI graft
+-- point and not the planner's hash-vs-merge cost crossover, which has
+-- been observed to flap on 32-bit and non-default-blocksize buildfarm
+-- members. enable_memoize is pinned for the same reason: a future
+-- planner change that prefers Memoize over Materialize on the inner
+-- side of the SEMI would break the expected output.
+SET LOCAL enable_hashjoin = off;
+SET LOCAL enable_mergejoin = off;
+SET LOCAL enable_memoize = off;
+-- The EXISTS only references pull_t1, so the SEMI must end up adjacent
+-- to pull_t1 even though four other joins separate the WHERE from t1.
+-- The Nested Loop with pull_t1 directly underneath the Semi Join is
+-- the load-bearing assertion here.
+EXPLAIN (COSTS OFF)
+SELECT 1
+FROM pull_t1
+JOIN pull_t2 ON pull_t1.a = pull_t2.a
+JOIN pull_t3 ON pull_t1.a = pull_t3.a
+JOIN pull_t4 ON pull_t1.a = pull_t4.a
+JOIN pull_t5 ON pull_t1.a = pull_t5.a
+WHERE EXISTS (SELECT 1 FROM pull_x WHERE pull_x.b = pull_t1.b);
+ QUERY PLAN
+---------------------------------------------------------------
+ Nested Loop
+ Join Filter: (pull_t1.a = pull_t5.a)
+ -> Nested Loop
+ Join Filter: (pull_t1.a = pull_t4.a)
+ -> Nested Loop
+ Join Filter: (pull_t1.a = pull_t3.a)
+ -> Nested Loop
+ Join Filter: (pull_t1.a = pull_t2.a)
+ -> Nested Loop Semi Join
+ Join Filter: (pull_t1.b = pull_x.b)
+ -> Seq Scan on pull_t1
+ -> Materialize
+ -> Seq Scan on pull_x
+ -> Materialize
+ -> Seq Scan on pull_t2
+ -> Materialize
+ -> Seq Scan on pull_t3
+ -> Materialize
+ -> Seq Scan on pull_t4
+ -> Materialize
+ -> Seq Scan on pull_t5
+(21 rows)
+
+-- The IN's testexpr only references pull_t1, same expectation.
+EXPLAIN (COSTS OFF)
+SELECT 1
+FROM pull_t1
+JOIN pull_t2 ON pull_t1.a = pull_t2.a
+JOIN pull_t3 ON pull_t1.a = pull_t3.a
+JOIN pull_t4 ON pull_t1.a = pull_t4.a
+JOIN pull_t5 ON pull_t1.a = pull_t5.a
+WHERE pull_t1.b IN (SELECT b FROM pull_x);
+ QUERY PLAN
+---------------------------------------------------------------
+ Nested Loop
+ Join Filter: (pull_t1.a = pull_t5.a)
+ -> Nested Loop
+ Join Filter: (pull_t1.a = pull_t4.a)
+ -> Nested Loop
+ Join Filter: (pull_t1.a = pull_t3.a)
+ -> Nested Loop
+ Join Filter: (pull_t1.a = pull_t2.a)
+ -> Nested Loop Semi Join
+ Join Filter: (pull_t1.b = pull_x.b)
+ -> Seq Scan on pull_t1
+ -> Materialize
+ -> Seq Scan on pull_x
+ -> Materialize
+ -> Seq Scan on pull_t2
+ -> Materialize
+ -> Seq Scan on pull_t3
+ -> Materialize
+ -> Seq Scan on pull_t4
+ -> Materialize
+ -> Seq Scan on pull_t5
+(21 rows)
+
+-- An EXISTS referencing a rel on the nullable side of a LEFT JOIN
+-- must NOT be pushed past the outer join.
+EXPLAIN (COSTS OFF)
+SELECT 1
+FROM pull_t2
+LEFT JOIN pull_t1 ON pull_t1.a = pull_t2.a
+WHERE EXISTS (SELECT 1 FROM pull_x WHERE pull_x.b = pull_t1.b);
+ QUERY PLAN
+----------------------------------------------
+ Nested Loop Semi Join
+ Join Filter: (pull_t1.b = pull_x.b)
+ -> Nested Loop
+ Join Filter: (pull_t2.a = pull_t1.a)
+ -> Seq Scan on pull_t2
+ -> Materialize
+ -> Seq Scan on pull_t1
+ -> Materialize
+ -> Seq Scan on pull_x
+(9 rows)
+
+-- An IN-sublink whose pulled-up rarg carries a LATERAL reference to a
+-- second outer rel must not be grafted into a subtree that excludes
+-- that rel, even though only pull_t1 appears in the testexpr. The
+-- result set must match the unoptimised plan.
+SELECT count(*)
+FROM pull_t1, pull_t2
+WHERE pull_t1.a = pull_t2.a
+ AND pull_t1.b IN (SELECT b FROM pull_x WHERE pull_x.b = pull_t2.a);
+ count
+-------
+ 10
+(1 row)
+
+ROLLBACK;
diff --git a/src/test/regress/sql/subselect.sql
b/src/test/regress/sql/subselect.sql
index 1a02c3f86c0..09e9ed747ca 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -1647,3 +1647,79 @@ SELECT * FROM not_null_tab
WHERE id NOT IN (SELECT id FROM notnull_notvalid_tab);
ROLLBACK;
+
+--
+-- Pulled-up SEMI/ANTI joins should be grafted next to the rel they
+-- reference, so they are not stranded above join_collapse_limit when
+-- many sibling joins exist.
+--
+BEGIN;
+
+CREATE TEMP TABLE pull_t1 (a int, b int);
+CREATE TEMP TABLE pull_t2 (a int);
+CREATE TEMP TABLE pull_t3 (a int);
+CREATE TEMP TABLE pull_t4 (a int);
+CREATE TEMP TABLE pull_t5 (a int);
+CREATE TEMP TABLE pull_x (b int);
+
+INSERT INTO pull_t1 SELECT g, g FROM generate_series(1, 10) g;
+INSERT INTO pull_t2 SELECT g FROM generate_series(1, 10) g;
+INSERT INTO pull_t3 SELECT g FROM generate_series(1, 10) g;
+INSERT INTO pull_t4 SELECT g FROM generate_series(1, 10) g;
+INSERT INTO pull_t5 SELECT g FROM generate_series(1, 10) g;
+INSERT INTO pull_x SELECT g FROM generate_series(1, 10) g;
+ANALYZE pull_t1, pull_t2, pull_t3, pull_t4, pull_t5, pull_x;
+
+SET LOCAL join_collapse_limit = 2;
+SET LOCAL max_parallel_workers_per_gather = 0;
+-- Pin a single join method so the test asserts the SEMI/ANTI graft
+-- point and not the planner's hash-vs-merge cost crossover, which has
+-- been observed to flap on 32-bit and non-default-blocksize buildfarm
+-- members. enable_memoize is pinned for the same reason: a future
+-- planner change that prefers Memoize over Materialize on the inner
+-- side of the SEMI would break the expected output.
+SET LOCAL enable_hashjoin = off;
+SET LOCAL enable_mergejoin = off;
+SET LOCAL enable_memoize = off;
+
+-- The EXISTS only references pull_t1, so the SEMI must end up adjacent
+-- to pull_t1 even though four other joins separate the WHERE from t1.
+-- The Nested Loop with pull_t1 directly underneath the Semi Join is
+-- the load-bearing assertion here.
+EXPLAIN (COSTS OFF)
+SELECT 1
+FROM pull_t1
+JOIN pull_t2 ON pull_t1.a = pull_t2.a
+JOIN pull_t3 ON pull_t1.a = pull_t3.a
+JOIN pull_t4 ON pull_t1.a = pull_t4.a
+JOIN pull_t5 ON pull_t1.a = pull_t5.a
+WHERE EXISTS (SELECT 1 FROM pull_x WHERE pull_x.b = pull_t1.b);
+
+-- The IN's testexpr only references pull_t1, same expectation.
+EXPLAIN (COSTS OFF)
+SELECT 1
+FROM pull_t1
+JOIN pull_t2 ON pull_t1.a = pull_t2.a
+JOIN pull_t3 ON pull_t1.a = pull_t3.a
+JOIN pull_t4 ON pull_t1.a = pull_t4.a
+JOIN pull_t5 ON pull_t1.a = pull_t5.a
+WHERE pull_t1.b IN (SELECT b FROM pull_x);
+
+-- An EXISTS referencing a rel on the nullable side of a LEFT JOIN
+-- must NOT be pushed past the outer join.
+EXPLAIN (COSTS OFF)
+SELECT 1
+FROM pull_t2
+LEFT JOIN pull_t1 ON pull_t1.a = pull_t2.a
+WHERE EXISTS (SELECT 1 FROM pull_x WHERE pull_x.b = pull_t1.b);
+
+-- An IN-sublink whose pulled-up rarg carries a LATERAL reference to a
+-- second outer rel must not be grafted into a subtree that excludes
+-- that rel, even though only pull_t1 appears in the testexpr. The
+-- result set must match the unoptimised plan.
+SELECT count(*)
+FROM pull_t1, pull_t2
+WHERE pull_t1.a = pull_t2.a
+ AND pull_t1.b IN (SELECT b FROM pull_x WHERE pull_x.b = pull_t2.a);
+
+ROLLBACK;
--
2.54.0