On Mon, 2026-02-23 at 16:10 +0100, Attila Soki wrote:
> > On 23 Feb 2026, at 10:41, Laurenz Albe <[email protected]> wrote:
> >
> > On Mon, 2026-02-23 at 10:37 +0100, Attila Soki wrote:
> > > When upgrading from PostgreSQL 14.4, I noticed that one of my somewhat
> > > complex
> > > analytical queries sometimes gets an inefficient plan under PostgreSQL
> > > 16, 17, and 18.
> > > Under 14.4, the query runs with a stable plan and completes in 19 to 22
> > > seconds.
> > > In newer versions, the plan seems to be unstable, sometimes the query
> > > completes
> > > in 17 to 20 seconds, sometimes it runs for 5 to 18 minutes with the
> > > inefficient plan.
> > > This also happens even if the data is not significantly changed.
> >
> > This is very likely owing to a bad estimate.
> >
> > Could you turn on "track_io_timing" and send us the EXPLAIN (ANALYZE,
> > BUFFERS) output
> > for both the good and the bad plan?
>
> Thank you for your reply. Here are the two explains.
> In order to be able to publish the plans here, I have obfuscated the table
> and field names, but this is reversible, so I can provide more info if needed.
>
> plan-ok:
> https://explain.depesz.com/s/hQvM
>
> plan-wrong:
> https://explain.depesz.com/s/uLvl
Thanks.
The difference in the plans is under the "Subquery Scan on odg", starting with
plan node 50 (everything under the "Sort"). I suspect that the mis-estimate
that is at the root of the problem is here:
-> Index Scan using table_k_late_spec_dp_end_dat_key on schema1.table_k kal
(... rows=196053 ...) (... rows=471.00 ...)
Index Cond: (kal.dp_end_dat < ('now'::cstring)::date)
Index Searches: 1
Buffers: shared hit=230 read=49
I/O Timings: shared read=0.142
PostgreSQL overestimates the row count by a factor of over 400.
Try to fix that estimate and see if that gets PostgreSQL to do the right thing.
Perhaps a simple ANALYZE on the table can do the trick.
The right side of the comparison looks awkward, as if you wrote
'now'::text::date
My experiments show that PostgreSQL v18 estimates well even with such a weird
condition, but perhaps if you write "current_date" instead, you'd get better
results.
I'd play just with a query like
EXPLAIN (ANALYZE)
SELECT * FROM schema1.table_k AS kal
WHERE dp_end_dat < current_date;
until I get a good estimate.
Yours,
Laurenz Albe