Andrei, Tom, thank you for the explanation. I had already tried using extended statistics but without success; I probably did something wrong.
After creating statistics on all the tables, I'm now getting optimal timings in production. This is not the ideal solution, because the issue will delay our PostgreSQL version upgrade: the application will also need to be tested with respect to response times, and we will need to create dedicated statistics for the specific cases. In addition, some cases may slip through testing and only surface in production due to a different data distribution.

Based on your explanation, my take is that the planner is not able to produce adequate estimates when there are filters on many columns; with millions of rows involved this becomes noticeable, and extended statistics may be needed. One or more commits in PG16 changed the context slightly and caused the unfortunate path to be chosen.

In any case, I don't think there are other workable solutions without changing the code. I get good results using CTEs or LATERAL with OFFSET 0, but the point was to find a solution without having to change the code.

Thanks again, regards.

Mauro Gatti

On Thu, Mar 5, 2026 at 10:44 PM Tom Lane <[email protected]> wrote:
> Andrei Lepikhov <[email protected]> writes:
> > On 5/3/26 17:25, Mauro Gatti wrote:
> >> ## Questions for the community
>
> > Thanks for stable reproduction!
>
> Yes, we appreciate that much effort being put into trouble reports.
> Makes it a lot easier to see what's going wrong.
>
> > Your case is typical for 'never executed' nodes. As you can see, the
> > costs of your query plans are very close, and the estimation error is
> > large due to multiple clauses in your filter. As I see, for the planner,
> > there is no difference in which version of the plan to choose - it is
> > just a game of chance.
>
> Right.  Given the very-far-off rowcount estimates for some of the
> index scans, it'd be surprising if the planner arrived at a good
> join order.  It's a "garbage in, garbage out" situation.
> As Andrei suggested, you can often improve bad rowcount estimates
> by creating custom statistics.
>
> I found it was sufficient to do
>
> =# create statistics on
> brand_id,line_code,model_year,model_code,version_code,pricelist_id from
> pricelist_options;
> CREATE STATISTICS
> =# analyze pricelist_options;
> ANALYZE
>
> That doesn't result in fully accurate estimates:
>
> =# explain analyze select * from pricelist_options pl where ((pl.brand_id
> = 10) AND ((pl.line_code)::text = 'ABC'::text) AND ((pl.model_year)::text =
> 'YR'::text) AND ((pl.model_code)::text = 'ABC'::text) AND
> ((pl.version_code)::text = 'VER-001'::text) AND (pl.pricelist_id = 100));
>
>                                                         QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------
>  Index Scan using ix_pricelist_options_1 on pricelist_options pl
> (cost=0.43..117.12 rows=28 width=106) (actual time=0.015..0.037 rows=165.00
> loops=1)
>    Index Cond: ((brand_id = 10) AND ((line_code)::text = 'ABC'::text) AND
> ((model_year)::text = 'YR'::text) AND ((model_code)::text = 'ABC'::text)
> AND ((version_code)::text = 'VER-001'::text) AND (pricelist_id = 100))
>
> but "28 rows" is a lot closer to 165 than "1 row", and it's enough
> to push the planner to choose the plan you want.
>
> I do concur with Andrei's recommendation to create stats matching
> your other multicolumn indexes, though.
>
>                         regards, tom lane
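[For the archives, a variant of Tom's statement with the statistics object named and the statistics kinds spelled out explicitly. This is only a sketch: the object name st_pricelist_options_cols is invented, and the column list is copied from Tom's example.]

```sql
-- Sketch: named extended statistics covering the same columns as the
-- example above, requesting all three multivariate statistics kinds.
-- The name st_pricelist_options_cols is made up for illustration.
CREATE STATISTICS st_pricelist_options_cols (ndistinct, dependencies, mcv)
    ON brand_id, line_code, model_year, model_code, version_code, pricelist_id
    FROM pricelist_options;

-- Extended statistics are only populated when the table is analyzed:
ANALYZE pricelist_options;
```

Naming the object makes it easier to inspect later via pg_statistic_ext and to drop or recreate it when the matching index changes.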

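[For completeness, the CTE / LATERAL-with-OFFSET-0 workarounds mentioned above look roughly like this. This is a sketch against an invented join: some_other_table and the filter values are hypothetical, not from the real schema.]

```sql
-- A MATERIALIZED CTE is an explicit optimization barrier (PostgreSQL 12+);
-- the planner computes the CTE once and cannot flatten it into the outer query:
WITH opts AS MATERIALIZED (
    SELECT *
    FROM pricelist_options
    WHERE brand_id = 10
      AND pricelist_id = 100
)
SELECT *
FROM opts
JOIN some_other_table t USING (pricelist_id);

-- OFFSET 0 in a (LATERAL) subquery likewise prevents the planner from
-- pulling the subquery up and re-planning it as part of the outer join:
SELECT *
FROM some_other_table t
CROSS JOIN LATERAL (
    SELECT *
    FROM pricelist_options po
    WHERE po.pricelist_id = t.pricelist_id
    OFFSET 0
) o;
```

Both forms trade planner flexibility for predictability, which is why they help when the estimates feeding the join order are badly off.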