On Wed, Jan 14, 2026 at 11:11 PM Robert Haas <[email protected]> wrote:
Hi Robert,
> On Wed, Jan 14, 2026 at 6:02 AM Jakub Wartak
> <[email protected]> wrote:
> > a) q4.sql (please see attached file for repro). More or less: right
> > after import I get a hard failure if the earlier recommended advice is
> > enabled (smells like a bug to me: we shouldn't get any errors even if
> > advice is bad). This can be solved by ANALYZE, but brought up back by
> > truncating pg_statistics
> > ERROR: unique semijoin found for relids (b 3) but not observed during
> > planning
> > STATEMENT: explain (timing off, costs off, settings off, memory off)
>
> Hmm, so the plan tree walker thinks that we did a semijoin between
> lineitem and orders by making lineitem unique on the join column and
> then performing a regular join. That appears to be correct. But
> pgpa_join_path_setup never created a pgpa_join_path_setup for that
> possibility, or created one that doesn't actually match up properly to
> what was found in the plan tree. Can you check whether a
> pgpa_sj_unique_rel gets created in pgpa_join_path_setup, and with what
> contents?
OK, so today, on just barebone v9 (even without any fixes from this $subthread),
I couldn't get it to reproduce right out of the box right on the fresh
cluster. It
appears to another missing piece of the puzzle was to have
max_parallel_workers_per_gather=0 (in addition to TRUNCATING pg_statistic),
because otherwise it did not want to generate advice out of the box that would
generate this specific ERROR.
Maybe I'm big rookie here (OR just dumb), but it took me some time to realize
why we emit SEMIJOIN_UNIQUE() there, clearly the plan without parallelism
has "Nested Loop", not like "Nested Loop Semi Join"
(with max_parallel_workers_per_gather = 2). Yet it emits that and somehow
later the query feature walker->query_features[PGPAQF_SEMIJOIN_UNIQUE]
also is there, so that "unique semijoin found for.." error could be thrown.
As per VERBOSE explain, one can spot this SemiJoin transformation is being
applied (Nested Loop/Inner Unique: true), however sj_unique_rtis is empty (?!):
[..]
NOTICE: jointype=outer pps_NULL?=0
NOTICE: added SEMIJOIN_UNIQUE
NOTICE: pgpa_plan_walker: walking over SEMIJOIN_UNIQUE features: 3,
sj_unique_rtis=<> sj_unique_rels=<>
ERROR: unique semijoin found for relids (b 3) but not observed during planning
Only *after* this, I've realized how hard all of that is reading comments nearby
`typedef struct pgpa_sj_unique_rel`.
Anyway it appears that pgpa_plan_walker()/pgpa_planner_walker() is not
having proper
input information to begin with about SJs? It looks there is just one
single place that
sets pps->sj_unique_rels (lappend() in pgpa_join_path_setup()), but
that's code path is
only being launched when requesting explain is asking for advice:
explain (costs off, plan_advice) SELECT
[..]
NOTICE: jointype=inner pps_NULL?=0
NOTICE: found=0
NOTICE: not a duplicate, appending "(b 3)" to pps->sj_unique_rels
NOTICE: jointype=outer pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
NOTICE: jointype=inner pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
NOTICE: jointype=outer pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
NOTICE: added SEMIJOIN_UNIQUE
WARNING: could not dump unrecognized node type: 0 // ignore?
NOTICE: pgpa_plan_walker: walking over SEMIJOIN_UNIQUE features: 3,
sj_unique_rtis=((b 3)) sj_unique_rels=({})
(+ no error!)
while with basic EXPLAIN (and advices planner/advises touching SJ
transforms), I'm getting:
dbt3=# explain (costs off) SELECT
[..]
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
NOTICE: added SEMIJOIN_UNIQUE
NOTICE: pgpa_plan_walker: walking over SEMIJOIN_UNIQUE features: 3,
sj_unique_rtis=<> sj_unique_rels=<>
ERROR: unique semijoin found for relids (b 3) but not observed during planning
So we have started v9 with:
if (pps->generate_advice_string) { -- but that's wrong due to
potential crash in -02 builds + asan complaints
we fixed that above bug with:
if (pps != NULL && pps->generate_advice_string) { -- but that's
wrong due to not initializing SJ for normal explains
so we end up doing simply this?
if (pps != NULL) {
The last one seems to pass all my tests (with already provided fixup
from yesterday), but I'm absolutely not sure if that's the proper way to
address that).
> > a) q8.sql (please see attached file for demo). It is even more
> > bizarre, happens right after import , fixed by ANALYZE, but even
> > TRUNCATING pg_statistic doesnt bring back the problem. Pinpointed that
> > additional pg_clear_relation_stats() triggers the problem back.
>
> I found this one. I now think that
> pgpa_planner_apply_join_path_advice() shouldn't added anything to
> jo_permit_indexes when a join method hint implicitly permits a join
> order. I simplified your test case to this:
>
> set pg_plan_advice.advice = 'JOIN_ORDER(n1 region customer)
> NESTED_LOOP_PLAIN(region)';
> explain (costs off, plan_advice)
> SELECT
> n1.n_name AS nation
> FROM
> customer,
> nation n1,
> region
> WHERE
> c_nationkey = n1.n_nationkey
> AND n1.n_regionkey = r_regionkey
> AND r_name = 'AMERICA';
>
> What was happening here is that when we considered a join between
> {customer, nation} and region, pgpa_planner_apply_join_path_advice()
> said, well, according to the JOIN_ORDER advice, this join order is not
> allowed, which is correct. And, according to the NESTED_LOOP_PLAIN
> advice, this join order is allowed, which is also correct, because
> NESTED_LOOP_PLAIN(region) denies join orders where region is the
> driving table, since those would make it impossible to respect the
> advice, and this join order doesn't do that. Then, it concludes that
> because one piece of advice says the join order is OK and the other
> says it isn't, the advice conflicts. This is where I think it's going
> off the rails: the NESTED_LOOP_PLAIN() advice should only be allowed
> to act as a negative constraint, not a positive one. So what I did is:
Yes! 3 three lines patches seems to help (and causes no other problems to
best of my knowledge). The simplified test case results seem to be only
changing like below, but it really fixes the Q8 NL->HJ.
Supplied Plan Advice:
- JOIN_ORDER(n1 region customer) /* matched, conflicting */
- NESTED_LOOP_PLAIN(region) /* matched, conflicting */
+ JOIN_ORDER(n1 region customer) /* matched */
+ NESTED_LOOP_PLAIN(region) /* matched */
BTW: I have found also that it fixes another (not yet here disclosed bug,
because I've found it just today :): when running without stats, and
with enable_nestloop=OFF (globally) Q5 was failing too due some
sequence of HJ/Parallel HJ
and slightly different Hash Cond) - nvm, wIth this patch it does NOT misbehave.
Maybe it would be good to include that into tests inside 0005, for that
small tiny query above?
> > 3b) XXX - marker:I was looking for a solution and apparently cfbot
> > farm has those options, so they should be testing it anyway. And this
> > brings me to a fact, that it maybe could be detected by cfbot, however
> > the $thread is not registered so cfbot had no chance to see what's
> > more there? (I'm mainly thinking about any cross-platform issues, if
> > any).
>
> I mean, there is https://commitfest.postgresql.org/patch/6184/
Whoops, mea culpa, I was looking for PG-4 commitfest for some reason
(so I should
be looking on https://cfbot.cputube.org/next.html not just under "/"
[main] one).
-J.
--
-- step by step, demo that it doesnt work:
--
dbt3=# set pg_plan_advice.advice = 'JOIN_ORDER(lineitem orders)
NESTED_LOOP_PLAIN(orders) SEQ_SCAN(lineitem) INDEX_SCAN(orders
public.pk_orders) GATHER(lineitem) SEMIJOIN_UNIQUE(lineitem) NO_GATHER(orders)';
SET
dbt3=# explain (costs off) SELECT
o_orderpriority,
count(*) AS order_count
FROM
orders
WHERE
o_orderdate >= DATE '1993-07-01'
AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
AND EXISTS (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority;
ERROR: unique semijoin found for relids (b 3) but not observed during planning
dbt3=#
-- so with max_parallel_workers_per_gather=2 works (-> NO ERROR)
set pg_plan_advice.advice = 'JOIN_ORDER(orders lineitem)
NESTED_LOOP_PLAIN(lineitem)
SEQ_SCAN(orders)
INDEX_ONLY_SCAN(lineitem public.lineitem_l_orderkey_l_suppkey_idx)
GATHER_MERGE((orders lineitem))
SEMIJOIN_NON_UNIQUE(lineitem)';
-- when set to 0, this fails:
set pg_plan_advice.advice = 'JOIN_ORDER(lineitem orders)
NESTED_LOOP_PLAIN(orders)
SEQ_SCAN(lineitem)
INDEX_SCAN(orders public.pk_orders)
SEMIJOIN_UNIQUE(lineitem)
NO_GATHER(orders lineitem)';
-- note the SEMIJOIN_UNIQUE(lineitem) vs SEMIJOIN_NON_UNIQUE(lineitem), as it
is going to be important later
---
--- FRESH
---
dbt3=# explain (costs off,plan_advice) SELECT
o_orderpriority,
count(*) AS order_count
FROM
orders
WHERE
o_orderdate >= DATE '1993-07-01'
AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
AND EXISTS (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority;
NOTICE: jointype=inner pps_NULL?=0
NOTICE: found=0
NOTICE: not a dupllicate, appending "(b 3)" to pps->sj_unique_rels
NOTICE: jointype=outer pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
NOTICE: jointype=inner pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
NOTICE: jointype=outer pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate
Group Key: orders.o_orderpriority
-> Sort
Sort Key: orders.o_orderpriority
-> Nested Loop
-> HashAggregate
Group Key: lineitem.l_orderkey
-> Seq Scan on lineitem
Filter: (l_commitdate < l_receiptdate)
-> Index Scan using pk_orders on orders
Index Cond: (o_orderkey = lineitem.l_orderkey)
Filter: ((o_orderdate >= '1993-07-01'::date) AND
(o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone))
Generated Plan Advice:
JOIN_ORDER(lineitem orders)
NESTED_LOOP_PLAIN(orders)
SEQ_SCAN(lineitem)
INDEX_SCAN(orders public.pk_orders)
SEMIJOIN_UNIQUE(lineitem)
NO_GATHER(orders lineitem)
(19 rows)
--
-- quick sanity check if we really see the error (copy/paste from above):
--
dbt3=# set pg_plan_advice.advice = 'JOIN_ORDER(lineitem orders)
NESTED_LOOP_PLAIN(orders)
SEQ_SCAN(lineitem)
INDEX_SCAN(orders public.pk_orders)
SEMIJOIN_UNIQUE(lineitem)
NO_GATHER(orders lineitem)';
SET
dbt3=# explain (costs off) SELECT
o_orderpriority,
count(*) AS order_count
FROM
orders
WHERE
o_orderdate >= DATE '1993-07-01'
AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
AND EXISTS (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority;
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
ERROR: unique semijoin found for relids (b 3) but not observed during planning
dbt3=#
-- yes , we do so "b 3" seems to be HashAggregate/Group Key: lineitem.l_orderkey
-- so how's that NOT observed during planning if it is there?
dbt3=# LOAD 'pg_overexplain';
LOAD
dbt3=# explain (costs off, DEBUG) SELECT
o_orderpriority,
count(*) AS order_count
FROM
orders
WHERE
o_orderdate >= DATE '1993-07-01'
AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
AND EXISTS (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority;
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate
Group Key: orders.o_orderpriority
Disabled Nodes: 0
Parallel Safe: false
Plan Node ID: 0
-> Sort
Sort Key: orders.o_orderpriority
Disabled Nodes: 0
Parallel Safe: false
Plan Node ID: 1
-> Nested Loop
Disabled Nodes: 0
Parallel Safe: false
Plan Node ID: 2
-> HashAggregate
Group Key: lineitem.l_orderkey
Disabled Nodes: 0
Parallel Safe: false
Plan Node ID: 3
-> Seq Scan on lineitem
Filter: (l_commitdate < l_receiptdate)
Disabled Nodes: 0
Parallel Safe: false
Plan Node ID: 4
-> Index Scan using pk_orders on orders
Index Cond: (o_orderkey = lineitem.l_orderkey)
Filter: ((o_orderdate >= '1993-07-01'::date) AND
(o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone))
Disabled Nodes: 0
Parallel Safe: false
Plan Node ID: 5
extParam: 0
allParam: 0
PlannedStmt:
Command Type: select
Flags: canSetTag
Subplans Needing Rewind: none
Relation OIDs: 17089 17061
Executor Parameter Types: 20
Parse Location: 0 to end
(39 rows)
dbt3=# explain (costs off, RANGE_TABLE) SELECT
o_orderpriority,
count(*) AS order_count
FROM
orders
WHERE
o_orderdate >= DATE '1993-07-01'
AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
AND EXISTS (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority;
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate
Group Key: orders.o_orderpriority
-> Sort
Sort Key: orders.o_orderpriority
-> Nested Loop
-> HashAggregate
Group Key: lineitem.l_orderkey
-> Seq Scan on lineitem
Filter: (l_commitdate < l_receiptdate)
Scan RTI: 3
-> Index Scan using pk_orders on orders
Index Cond: (o_orderkey = lineitem.l_orderkey)
Filter: ((o_orderdate >= '1993-07-01'::date) AND
(o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone))
Scan RTI: 1
RTI 1 (relation, in-from-clause):
Eref: orders (o_orderkey, o_custkey, o_orderstatus, o_totalprice,
o_orderdate, o_orderpriority, o_clerk, o_shippriority, o_comment)
Relation: orders
Relation Kind: relation
Relation Lock Mode: AccessShareLock
Permission Info Index: 1
RTI 2 (group):
Eref: "*GROUP*" (o_orderpriority)
RTI 3 (relation, in-from-clause):
Eref: lineitem (l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity,
l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate,
l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment)
Relation: lineitem
Relation Kind: relation
Relation Lock Mode: AccessShareLock
Permission Info Index: 2
Unprunable RTIs: 1 3
(29 rows)
dbt3=#
--
-- 3x conflicts (still with using SEMIJOIN_UNIQUE):
--
dbt3=# explain (costs off,plan_advice) SELECT
o_orderpriority,
count(*) AS order_count
FROM
orders
WHERE
o_orderdate >= DATE '1993-07-01'
AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
AND EXISTS (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority;
NOTICE: jointype=inner pps_NULL?=0
NOTICE: found=0
NOTICE: not a dupllicate, appending "(b 3)" to pps->sj_unique_rels
NOTICE: jointype=outer pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
NOTICE: jointype=inner pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
NOTICE: jointype=outer pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate
Group Key: orders.o_orderpriority
-> Sort
Sort Key: orders.o_orderpriority
-> Nested Loop
-> HashAggregate
Group Key: lineitem.l_orderkey
-> Seq Scan on lineitem
Filter: (l_commitdate < l_receiptdate)
-> Index Scan using pk_orders on orders
Index Cond: (o_orderkey = lineitem.l_orderkey)
Filter: ((o_orderdate >= '1993-07-01'::date) AND
(o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone))
Supplied Plan Advice:
SEQ_SCAN(lineitem) /* matched */
INDEX_SCAN(orders public.pk_orders) /* matched */
JOIN_ORDER(lineitem orders) /* matched, conflicting */
NESTED_LOOP_PLAIN(orders) /* matched, conflicting */
SEMIJOIN_UNIQUE(lineitem) /* matched, conflicting */
NO_GATHER(orders) /* matched */
NO_GATHER(lineitem) /* matched */
Generated Plan Advice:
JOIN_ORDER(lineitem orders)
NESTED_LOOP_PLAIN(orders)
SEQ_SCAN(lineitem)
INDEX_SCAN(orders public.pk_orders)
SEMIJOIN_UNIQUE(lineitem)
NO_GATHER(orders lineitem)
(27 rows)
dbt3=# explain (costs off) SELECT
o_orderpriority,
count(*) AS order_count
FROM
orders
WHERE
o_orderdate >= DATE '1993-07-01'
AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
AND EXISTS (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority;
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
ERROR: unique semijoin found for relids (b 3) but not observed during planning
dbt3=#
-- still if I remove SEMIJOIN_UNIQUE(liteitem) it complains:
dbt3=# set pg_plan_advice.advice = 'JOIN_ORDER(lineitem orders)
NESTED_LOOP_PLAIN(orders)
SEQ_SCAN(lineitem)
INDEX_SCAN(orders public.pk_orders)
NO_GATHER(orders lineitem)';
SET
dbt3=# explain (costs off) SELECT
o_orderpriority,
count(*) AS order_count
FROM
orders
WHERE
o_orderdate >= DATE '1993-07-01'
AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
AND EXISTS (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority;
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
ERROR: unique semijoin found for relids (b 3) but not observed during planning
dbt3=#
-- not shown here but also using SEMIJOIN_NON_UNIQUE was failing , which made
me struggle
-- so back to parallel plan:
dbt3=# set pg_plan_advice.advice = '';
SET
dbt3=# set max_parallel_workers_per_gather to 2;
SET
dbt3=# explain (costs off) SELECT
o_orderpriority,
count(*) AS order_count
FROM
orders
WHERE
o_orderdate >= DATE '1993-07-01'
AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
AND EXISTS (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority;
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize GroupAggregate
Group Key: orders.o_orderpriority
-> Gather Merge
Workers Planned: 2
-> Partial GroupAggregate
Group Key: orders.o_orderpriority
-> Sort
Sort Key: orders.o_orderpriority
-> Nested Loop Semi Join
-> Parallel Seq Scan on orders
Filter: ((o_orderdate >= '1993-07-01'::date)
AND (o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone))
-> Index Only Scan using
lineitem_l_orderkey_l_suppkey_idx on lineitem
Index Cond: (l_orderkey = orders.o_orderkey)
(13 rows)
dbt3=# explain (costs off, plan_advice) SELECT
o_orderpriority,
count(*) AS order_count
FROM
orders
WHERE
o_orderdate >= DATE '1993-07-01'
AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
AND EXISTS (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority;
NOTICE: jointype=inner pps_NULL?=0
NOTICE: found=0
NOTICE: not a dupllicate, appending "(b 3)" to pps->sj_unique_rels
NOTICE: jointype=outer pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
NOTICE: jointype=inner pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
NOTICE: jointype=outer pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize GroupAggregate
Group Key: orders.o_orderpriority
-> Gather Merge
Workers Planned: 2
-> Partial GroupAggregate
Group Key: orders.o_orderpriority
-> Sort
Sort Key: orders.o_orderpriority
-> Nested Loop Semi Join
-> Parallel Seq Scan on orders
Filter: ((o_orderdate >= '1993-07-01'::date)
AND (o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone))
-> Index Only Scan using
lineitem_l_orderkey_l_suppkey_idx on lineitem
Index Cond: (l_orderkey = orders.o_orderkey)
Generated Plan Advice:
JOIN_ORDER(orders lineitem)
NESTED_LOOP_PLAIN(lineitem)
SEQ_SCAN(orders)
INDEX_ONLY_SCAN(lineitem public.lineitem_l_orderkey_l_suppkey_idx)
GATHER_MERGE((orders lineitem))
SEMIJOIN_NON_UNIQUE(lineitem)
-- back to max_parallel_workers_per_gather=0 and explain VERBOSE
-- this shows that indeed we have Nested Loop // Inner Unique: true
dbt3=# explain (costs off, VERBOSE, plan_advice) SELECT
o_orderpriority,
count(*) AS order_count
FROM
orders
WHERE
o_orderdate >= DATE '1993-07-01'
AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
AND EXISTS (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority;
NOTICE: jointype=inner pps_NULL?=0
NOTICE: found=0
NOTICE: not a dupllicate, appending "(b 3)" to pps->sj_unique_rels
NOTICE: jointype=outer pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
NOTICE: jointype=inner pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
NOTICE: jointype=outer pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate
Output: orders.o_orderpriority, count(*)
Group Key: orders.o_orderpriority
-> Sort
Output: orders.o_orderpriority
Sort Key: orders.o_orderpriority
-> Nested Loop
Output: orders.o_orderpriority
Inner Unique: true
-> HashAggregate
Output: lineitem.l_orderkey
Group Key: lineitem.l_orderkey
-> Seq Scan on public.lineitem
Output: lineitem.l_orderkey, lineitem.l_partkey,
lineitem.l_suppkey, lineitem.l_linenumber, lineitem.l_quantity,
lineitem.l_extendedprice, lineitem.l_discount, lineitem.l_tax,
lineitem.l_returnflag, lineitem.l_linestatus, lineitem.l_shipdate,
lineitem.l_commitdate, lineitem.l_receiptdate, lineitem.l_shipinstruct,
lineitem.l_shipmode, lineitem.l_comment
Filter: (lineitem.l_commitdate <
lineitem.l_receiptdate)
-> Index Scan using pk_orders on public.orders
Output: orders.o_orderkey, orders.o_custkey,
orders.o_orderstatus, orders.o_totalprice, orders.o_orderdate,
orders.o_orderpriority, orders.o_clerk, orders.o_shippriority, orders.o_comment
Index Cond: (orders.o_orderkey = lineitem.l_orderkey)
Filter: ((orders.o_orderdate >= '1993-07-01'::date) AND
(orders.o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone))
Generated Plan Advice:
JOIN_ORDER(lineitem orders)
NESTED_LOOP_PLAIN(orders)
SEQ_SCAN(lineitem)
INDEX_SCAN(orders public.pk_orders)
SEMIJOIN_UNIQUE(lineitem)
NO_GATHER(orders lineitem)
-----
----- finally to realize that sj_unique_rtis is NULL there?
-----
dbt3=# set pg_plan_advice.advice = 'JOIN_ORDER(lineitem orders)
NESTED_LOOP_PLAIN(orders)
SEQ_SCAN(lineitem)
INDEX_SCAN(orders public.pk_orders)
SEMIJOIN_UNIQUE(lineitem)
NO_GATHER(orders lineitem)';
SET
dbt3=# explain (costs off, verbose) SELECT
o_orderpriority,
count(*) AS order_count
FROM
orders
WHERE
o_orderdate >= DATE '1993-07-01'
AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
AND EXISTS (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority;
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
NOTICE: added SEMIJOIN_UNIQUE
NOTICE: pgpa_plan_walker: walking over SEMIJOIN_UNIQUE features: 3,
sj_unique_rtis=<>
ERROR: unique semijoin found for relids (b 3) but not observed during planning
----------
----------
----------
dbt3=# set pg_plan_advice.advice = 'JOIN_ORDER(lineitem orders)
NESTED_LOOP_PLAIN(orders)
SEQ_SCAN(lineitem)
INDEX_SCAN(orders public.pk_orders)
SEMIJOIN_UNIQUE(lineitem)
NO_GATHER(orders lineitem)';
SET
dbt3=#
dbt3=#
dbt3=# explain (costs off) SELECT
o_orderpriority,
count(*) AS order_count
FROM
orders
WHERE
o_orderdate >= DATE '1993-07-01'
AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
AND EXISTS (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority;
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
NOTICE: jointype=inner pps_NULL?=0
NOTICE: jointype=outer pps_NULL?=0
NOTICE: added SEMIJOIN_UNIQUE
NOTICE: pgpa_plan_walker: walking over SEMIJOIN_UNIQUE features: 3,
sj_unique_rtis=<> sj_unique_rels=<>
ERROR: unique semijoin found for relids (b 3) but not observed during planning
dbt3=#
dbt3=#
dbt3=# explain (costs off, plan_advice) SELECT
o_orderpriority,
count(*) AS order_count
FROM
orders
WHERE
o_orderdate >= DATE '1993-07-01'
AND o_orderdate < DATE '1993-07-01' + INTERVAL '3' MONTH
AND EXISTS (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
AND l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority;
NOTICE: jointype=inner pps_NULL?=0
NOTICE: found=0
NOTICE: not a dupllicate, appending "(b 3)" to pps->sj_unique_rels
NOTICE: jointype=outer pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
NOTICE: jointype=inner pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
NOTICE: jointype=outer pps_NULL?=0
NOTICE: found=true! (ur->plan_name=(null) bms_ur_relids=3)
NOTICE: found=1
NOTICE: added SEMIJOIN_UNIQUE
WARNING: could not dump unrecognized node type: 0
NOTICE: pgpa_plan_walker: walking over SEMIJOIN_UNIQUE features: 3,
sj_unique_rtis=((b 3)) sj_unique_rels=({})
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate
Group Key: orders.o_orderpriority
-> Sort
Sort Key: orders.o_orderpriority
-> Nested Loop
-> HashAggregate
Group Key: lineitem.l_orderkey
-> Seq Scan on lineitem
Filter: (l_commitdate < l_receiptdate)
-> Index Scan using pk_orders on orders
Index Cond: (o_orderkey = lineitem.l_orderkey)
Filter: ((o_orderdate >= '1993-07-01'::date) AND
(o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone))
Supplied Plan Advice:
SEQ_SCAN(lineitem) /* matched */
INDEX_SCAN(orders public.pk_orders) /* matched */
JOIN_ORDER(lineitem orders) /* matched, conflicting */
NESTED_LOOP_PLAIN(orders) /* matched, conflicting */
SEMIJOIN_UNIQUE(lineitem) /* matched, conflicting */
NO_GATHER(orders) /* matched */
NO_GATHER(lineitem) /* matched */
Generated Plan Advice:
JOIN_ORDER(lineitem orders)
NESTED_LOOP_PLAIN(orders)
SEQ_SCAN(lineitem)
INDEX_SCAN(orders public.pk_orders)
SEMIJOIN_UNIQUE(lineitem)
NO_GATHER(orders lineitem)
(27 rows)
diff --git a/contrib/pg_plan_advice/pgpa_planner.c
b/contrib/pg_plan_advice/pgpa_planner.c
index fdc53da9742..a5b784fd151 100644
--- a/contrib/pg_plan_advice/pgpa_planner.c
+++ b/contrib/pg_plan_advice/pgpa_planner.c
@@ -457,7 +457,11 @@ pgpa_join_path_setup(PlannerInfo *root, RelOptInfo
*joinrel,
uniquerel = jointype == JOIN_UNIQUE_OUTER ? outerrel : innerrel;
pps = GetPlannerGlobalExtensionState(root->glob,
planner_extension_id);
- if (pps->generate_advice_string)
+ elog(NOTICE, "jointype=%s pps_NULL?=%d", jointype ==
JOIN_UNIQUE_OUTER ? "outer" : "inner", pps == NULL);
+ // JW: pps NULL fix
+ //if (pps != NULL && pps->generate_advice_string)
+ // JW: pps NULL fix with generating SJ also for normal EXPLAIN
+ if (pps != NULL)
{
bool found = false;
@@ -471,19 +475,27 @@ pgpa_join_path_setup(PlannerInfo *root, RelOptInfo
*joinrel,
if (root->plan_name == ur->plan_name &&
bms_equal(uniquerel->relids,
ur->relids))
{
+ elog(NOTICE, "found=true!
(ur->plan_name=%s bms_ur_relids=%s)",
+ ur->plan_name,
pgpa_bms_to_cstring(ur->relids));
found = true;
break;
}
}
+ elog(NOTICE, "found=%d", found);
/* If not a duplicate, append to the list. */
if (!found)
{
+ StringInfoData buf;
+
pgpa_sj_unique_rel *ur =
palloc_object(pgpa_sj_unique_rel);
ur->plan_name = root->plan_name;
ur->relids = uniquerel->relids;
pps->sj_unique_rels =
lappend(pps->sj_unique_rels, ur);
+ initStringInfo(&buf);
+ outBitmapset(&buf, uniquerel->relids);
+ elog(NOTICE, "not a dupllicate, appending
\"%s\" to pps->sj_unique_rels", buf.data);
}
}
}
@@ -982,7 +994,8 @@ pgpa_planner_apply_join_path_advice(JoinType jointype,
uint64 *pgs_mask_p,
jo_deny_indexes =
bms_add_member(jo_deny_indexes, i);
else if (restrict_method)
{
- jo_permit_indexes =
bms_add_member(jo_permit_indexes, i);
+ //JW
+ //jo_permit_indexes =
bms_add_member(jo_permit_indexes, i);
jm_indexes = bms_add_member(jo_permit_indexes,
i);
if (join_mask != 0 && join_mask != my_join_mask)
jm_conflict = true;
@@ -1038,8 +1051,9 @@ pgpa_planner_apply_join_path_advice(JoinType jointype,
uint64 *pgs_mask_p,
}
else if (advice_unique != jt_unique)
jo_deny_indexes =
bms_add_member(jo_deny_indexes, i);
- else
- jo_permit_indexes =
bms_add_member(jo_permit_indexes, i);
+ //JW
+ //else
+ // jo_permit_indexes =
bms_add_member(jo_permit_indexes, i);
}
continue;
}
diff --git a/contrib/pg_plan_advice/pgpa_walker.c
b/contrib/pg_plan_advice/pgpa_walker.c
index 29973c93b0b..07db28630a8 100644
--- a/contrib/pg_plan_advice/pgpa_walker.c
+++ b/contrib/pg_plan_advice/pgpa_walker.c
@@ -59,6 +59,31 @@ static Index pgpa_walker_get_rti(Index rtable_length,
pgpa_identifier *rt_identifiers,
pgpa_identifier *rid);
+
+/*
+ * Convert a bitmapset to a C string of comma-separated integers.
+ */
+static char *
+pgpa_bms_to_cstring(Bitmapset *bms)
+{
+ StringInfoData buf;
+ int x = -1;
+
+ if (bms_is_empty(bms))
+ return "none";
+
+ initStringInfo(&buf);
+ while ((x = bms_next_member(bms, x)) >= 0)
+ {
+ if (buf.len > 0)
+ appendStringInfo(&buf, ", %d", x);
+ else
+ appendStringInfo(&buf, "%d", x);
+ }
+
+ return buf.data;
+}
+
/*
* Top-level entrypoint for the plan tree walk.
*
@@ -135,6 +160,9 @@ pgpa_plan_walker(pgpa_plan_walker_context *walker,
PlannedStmt *pstmt,
foreach_ptr(pgpa_query_feature, qf,
walker->query_features[PGPAQF_SEMIJOIN_NON_UNIQUE])
{
+ elog(NOTICE, "pgpa_plan_walker: walking over
SEMIJOIN_NON_UNIQUE features: %s",
+ pgpa_bms_to_cstring(qf->relids));
+
if (list_member(sj_unique_rtis, qf->relids))
sj_nonunique_qfs = lappend(sj_nonunique_qfs, qf);
}
@@ -148,6 +176,8 @@ pgpa_plan_walker(pgpa_plan_walker_context *walker,
PlannedStmt *pstmt,
foreach_ptr(pgpa_query_feature, qf,
walker->query_features[PGPAQF_SEMIJOIN_UNIQUE])
{
+ elog(NOTICE, "pgpa_plan_walker: walking over SEMIJOIN_UNIQUE
features: %s, sj_unique_rtis=%s sj_unique_rels=%s",
+ pgpa_bms_to_cstring(qf->relids),
nodeToString(sj_unique_rtis), nodeToString(sj_unique_rels));
if (!list_member(sj_unique_rtis, qf->relids))
{
StringInfoData buf;
@@ -479,6 +509,11 @@ pgpa_add_future_feature(pgpa_plan_walker_context *walker,
walker->future_query_features =
lappend(walker->future_query_features, qf);
+
+ if(type == PGPAQF_SEMIJOIN_NON_UNIQUE)
+ elog(NOTICE, "added SEMIJOIN_NON_UNIQUE");
+ else if(type == PGPAQF_SEMIJOIN_UNIQUE)
+ elog(NOTICE, "added SEMIJOIN_UNIQUE");
}
/*