Re: Extremely slow HashAggregate in simple UNION query

Jeff Janes Wed, 21 Aug 2019 18:12:25 -0700

On Tue, Aug 20, 2019 at 11:12 AM Felix Geisendörfer <[email protected]>
wrote:
 ...



> [1] My actual query had bad estimates for other reasons (GIN Index), but
> that's another story. The query above was of course deliberately designed
> to have bad estimates.
>

As noted elsewhere, v12 thwarts your attempts to deliberately design the
bad estimates.  You can still get them, you just have to work a bit harder
at it:

CREATE FUNCTION j (bigint, bigint) returns setof bigint as $$ select
generate_series($1,$2) $$ rows 1000 language sql;

EXPLAIN ANALYZE
SELECT * FROM j(1, 1) a, j(1, 1) b
UNION
SELECT * FROM j(1, 1) a, j(1, 1) b;
                                                        QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=80021.00..100021.00 rows=2000000 width=16) (actual
time=11.332..13.241 rows=1 loops=1)
   Group Key: a.a, b.b
   ->  Append  (cost=0.50..70021.00 rows=2000000 width=16) (actual
time=0.118..0.163 rows=2 loops=1)
         ->  Nested Loop  (cost=0.50..20010.50 rows=1000000 width=16)
(actual time=0.117..0.118 rows=1 loops=1)
               ->  Function Scan on j a  (cost=0.25..10.25 rows=1000
width=8) (actual time=0.087..0.088 rows=1 loops=1)
               ->  Function Scan on j b  (cost=0.25..10.25 rows=1000
width=8) (actual time=0.027..0.027 rows=1 loops=1)
         ->  Nested Loop  (cost=0.50..20010.50 rows=1000000 width=16)
(actual time=0.044..0.044 rows=1 loops=1)
               ->  Function Scan on j a_1  (cost=0.25..10.25 rows=1000
width=8) (actual time=0.022..0.022 rows=1 loops=1)
               ->  Function Scan on j b_1  (cost=0.25..10.25 rows=1000
width=8) (actual time=0.020..0.021 rows=1 loops=1)
 Planning Time: 0.085 ms
 Execution Time: 69.277 ms
(11 rows)

But the same advance in v12 which makes it harder to fool with your test
case also opens the possibility of fixing your real case.

I've made an extension which has a function which always returns true, but
lies about how often it is expected to return true. See the attachment.
With that, you can fine-tune the planner.

CREATE EXTENSION pg_selectivities ;

EXPLAIN ANALYZE
SELECT * FROM j(1, 1) a, j(1, 1) b where pg_always(0.00001)
UNION
SELECT * FROM j(1, 1) a, j(1, 1) b where pg_always(0.00001);
                                                        QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=45021.40..45021.60 rows=20 width=16) (actual
time=0.226..0.227 rows=1 loops=1)
   Group Key: a.a, b.b
   ->  Append  (cost=0.50..45021.30 rows=20 width=16) (actual
time=0.105..0.220 rows=2 loops=1)
         ->  Nested Loop  (cost=0.50..22510.50 rows=10 width=16) (actual
time=0.104..0.105 rows=1 loops=1)
               Join Filter: pg_always('1e-05'::double precision)
               ->  Function Scan on j a  (cost=0.25..10.25 rows=1000
width=8) (actual time=0.066..0.066 rows=1 loops=1)
               ->  Function Scan on j b  (cost=0.25..10.25 rows=1000
width=8) (actual time=0.035..0.035 rows=1 loops=1)
         ->  Nested Loop  (cost=0.50..22510.50 rows=10 width=16) (actual
time=0.112..0.113 rows=1 loops=1)
               Join Filter: pg_always('1e-05'::double precision)
               ->  Function Scan on j a_1  (cost=0.25..10.25 rows=1000
width=8) (actual time=0.077..0.077 rows=1 loops=1)
               ->  Function Scan on j b_1  (cost=0.25..10.25 rows=1000
width=8) (actual time=0.034..0.034 rows=1 loops=1)
 Planning Time: 0.139 ms
 Execution Time: 0.281 ms

Cheers,

Jeff

pg_selectivities.patch
Description: Binary data

Re: Extremely slow HashAggregate in simple UNION query

Reply via email to