pretty slow merge-join due rescan?

Pavel Stehule Mon, 23 Aug 2021 09:45:23 -0700

Hi

The customer reports a very slow query. I have a reproducer script. The
dataset is not too big


postgres=# \dt+
                             List of relations
┌────────┬───────┬───────┬───────┬─────────────┬────────────┬─────────────┐
│ Schema │ Name  │ Type  │ Owner │ Persistence │    Size    │ Description │
╞════════╪═══════╪═══════╪═══════╪═════════════╪════════════╪═════════════╡
│ public │ f_dep │ table │ pavel │ permanent   │ 8192 bytes │             │
│ public │ f_emp │ table │ pavel │ permanent   │ 1001 MB    │             │
│ public │ f_fin │ table │ pavel │ permanent   │ 432 kB     │             │
│ public │ qt    │ table │ pavel │ permanent   │ 1976 kB    │             │
│ public │ qtd   │ table │ pavel │ permanent   │ 87 MB      │             │
└────────┴───────┴───────┴───────┴─────────────┴────────────┴─────────────┘
(5 rows)

and the query is not too complex

SELECT
     sub.a_121,
     count(*)
FROM (
        SELECT
              f_fin.dt_business_day_id AS a_1056,
              f_dep.description_id AS a_121,
              f_emp.employee_id_id AS a_1327
        FROM  f_emp
        INNER JOIN f_dep ON
                ( f_emp.department_id_id = f_dep.id )
        INNER JOIN f_fin ON
                ( f_emp.business_day_date_id = f_fin.id )
        GROUP BY 1, 2, 3
      ) AS sub
INNER JOIN qt ON
        ( sub.a_1056 = qt.tt_1056_1056_b )
LEFT OUTER JOIN qtd AS qt_2 ON
        ( ( qt.tt_1056_1056_b = qt_2.a_1056 )
        AND ( sub.a_121 = qt_2.a_121 )
        AND ( sub.a_1327 = qt_2.a_1327 ) )
LEFT OUTER JOIN qtd AS qt_3 ON
        ( ( qt.tt_1056_1056_a = qt_3.a_1056 )
        AND ( sub.a_121 = qt_3.a_121 )
        AND ( sub.a_1327 = qt_3.a_1327 ) )
GROUP BY 1;

By default I get a good plan, and the performance is ok
https://explain.depesz.com/s/Mr2H (about 16 sec). Unfortunately, when I
increase work_mem, I get good plan with good performance
https://explain.depesz.com/s/u4Ff

But this depends on index only scan. In the production environment, the
index only scan is not always available, and I see another plan (I can get
this plan with disabled index only scan).

Although the cost is almost the same, the query is about 15x slower
https://explain.depesz.com/s/L6zP

│ HashAggregate  (cost=1556129.74..1556131.74 rows=200 width=12) (actual
time=269948.878..269948.897 rows=64 loops=1)
                │
│   Group Key: f_dep.description_id

            │
│   Batches: 1  Memory Usage: 40kB

             │
│   Buffers: shared hit=5612 read=145051

             │
│   ->  Merge Left Join  (cost=1267976.96..1534602.18 rows=4305512 width=4)
(actual time=13699.847..268785.500 rows=4291151 loops=1)
             │
│         Merge Cond: ((f_emp.employee_id_id = qt_3.a_1327) AND
(f_dep.description_id = qt_3.a_121))
                         │
│         Join Filter: (qt.tt_1056_1056_a = qt_3.a_1056)

             │
│         Rows Removed by Join Filter: 1203659495

            │
│         Buffers: shared hit=5612 read=145051

             │

            .....

                      │
│         ->  Sort  (cost=209977.63..214349.77 rows=1748859 width=12) (actual
time=979.522..81842.913 rows=1205261892 loops=1)
           │
│               Sort Key: qt_3.a_1327, qt_3.a_121

            │
│               Sort Method: quicksort  Memory: 144793kB

             │
│               Buffers: shared hit=2432 read=8718

             │
│               ->  Seq Scan on qtd qt_3  (cost=0.00..28638.59 rows=1748859
width=12) (actual time=0.031..284.437 rows=1748859 loops=1)
│                     Buffers: shared hit=2432 read=8718

The sort of qtd table is very fast

postgres=# explain analyze select * from qtd order by a_1327, a_121;
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                      QUERY PLAN
                                           │
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Sort  (cost=209977.63..214349.77 rows=1748859 width=27) (actual
time=863.923..1111.213 rows=1748859 loops=1)         │
│   Sort Key: a_1327, a_121
                                           │
│   Sort Method: quicksort  Memory: 199444kB
                                            │
│   ->  Seq Scan on qtd  (cost=0.00..28638.59 rows=1748859 width=27)
(actual time=0.035..169.385 rows=1748859 loops=1) │
│ Planning Time: 0.473 ms
                                           │
│ Execution Time: 1226.305 ms
                                           │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(6 rows)

but here it returns 700x lines more and it is 70 x slower. Probably it is
because something does rescan. But why? With index only scan, I don't see
any indices of rescan.

Is it an executor or optimizer bug? Or is it a bug? I tested this behaviour
on Postgres 13 and on the fresh master branch.

Regards

Pavel

pretty slow merge-join due rescan?

Reply via email to