On Mon, 2026-02-23 at 16:10 +0100, Attila Soki wrote:
> > On 23 Feb 2026, at 10:41, Laurenz Albe <[email protected]> wrote:
> > 
> > On Mon, 2026-02-23 at 10:37 +0100, Attila Soki wrote:
> > > When upgrading from PostgreSQL 14.4, I noticed that one of my somewhat 
> > > complex
> > > analytical queries sometimes gets an inefficient plan under PostgreSQL 
> > > 16, 17, and 18.
> > > Under 14.4, the query runs with a stable plan and completes in 19 to 22 
> > > seconds.
> > > In newer versions, the plan seems to be unstable, sometimes the query 
> > > completes
> > > in 17 to 20 seconds, sometimes it runs for 5 to 18 minutes with the 
> > > inefficient plan.
> > > This also happens even if the data is not significantly changed.
> > 
> > This is very likely owing to a bad estimate.
> > 
> > Could you turn on "track_io_timing" and send us the EXPLAIN (ANALYZE, 
> > BUFFERS) output
> > for both the good and the bad plan?
> 
> Thank you for your reply. Here are the two explains.
> In order to be able to publish the plans here, I have obfuscated the table 
> and field names, but this is reversible, so I can provide more info if needed.
> 
> plan-ok:
> https://explain.depesz.com/s/hQvM
> 
> plan-wrong:
> https://explain.depesz.com/s/uLvl

Thanks.

The difference in the plans is under the "Subquery Scan on odg", starting with
plan node 50 (everything under the "Sort").  I suspect that the mis-estimate
that is at the root of the problem is here:

->  Index Scan using table_k_late_spec_dp_end_dat_key on schema1.table_k kal  
(... rows=196053 ...) (... rows=471.00 ...)
      Index Cond: (kal.dp_end_dat < ('now'::cstring)::date)
      Index Searches: 1
      Buffers: shared hit=230 read=49
      I/O Timings: shared read=0.142

PostgreSQL overestimates the row count by a factor of over 400.
Try to fix that estimate and see if that gets PostgreSQL to do the right thing.

Perhaps a simple ANALYZE on the table can do the trick.

The right side of the comparison looks awkward, as if you wrote 
'now'::text::date
My experiments show that PostgreSQL v18 estimates well even with such a weird
condition, but perhaps if you write "current_date" instead, you'd get better 
results.

I'd play just with a query like

  EXPLAIN (ANALYZE)
  SELECT * FROM schema1.table_k AS kal
  WHERE dp_end_dat < current_date;

until I get a good estimate.

Yours,
Laurenz Albe


Reply via email to