On Wed, Mar 22, 2017 at 1:09 PM, Rushabh Lathia
<rushabh.lat...@gmail.com> wrote:
> In the code, gather merge considers the limit for the sort in
> create_ordered_paths(), which is wrong. In create_ordered_paths(), GM
> should not consider the limit while doing the costing for the sort node.
>
> The attached patch fixes the bug.
Thanks, Rushabh. Now I see the non-parallel plan being picked by default:

create table test as (select id, (random()*10000)::int as v1, random() as
v2 from generate_series(1,10000000) id);

postgres=# explain analyze select * from test order by v1, v2 limit 10;
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=370146.98..370147.00 rows=10 width=16) (actual time=1169.685..1169.686 rows=10 loops=1)
   ->  Sort  (cost=370146.98..395146.63 rows=9999860 width=16) (actual time=1169.683..1169.684 rows=10 loops=1)
         Sort Key: v1, v2
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Seq Scan on test  (cost=0.00..154053.60 rows=9999860 width=16) (actual time=0.016..590.176 rows=10000000 loops=1)
 Planning time: 0.070 ms
 Execution time: 1169.706 ms
(7 rows)


Another accidental find: setting max_parallel_workers_per_gather to a
value higher than the default results in more parallel workers, but the
query runs slower than the plan with fewer workers.

postgres=# set max_parallel_workers_per_gather = default;
SET
postgres=# explain analyze select * from test order by v1, v2 limit 10000000;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=697263.86..1669540.28 rows=8333216 width=16) (actual time=2212.437..8207.161 rows=10000000 loops=1)
   ->  Gather Merge  (cost=697263.86..1669540.28 rows=8333216 width=16) (actual time=2212.436..7600.478 rows=10000000 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=696263.84..706680.36 rows=4166608 width=16) (actual time=2173.756..3105.512 rows=3333333 loops=3)
               Sort Key: v1, v2
               Sort Method: external merge  Disk: 86648kB
               ->  Parallel Seq Scan on test  (cost=0.00..95721.08 rows=4166608 width=16) (actual time=0.030..240.486 rows=3333333 loops=3)
 Planning time: 0.096 ms
 Execution time: 8537.214 ms
(10 rows)

postgres=# set max_parallel_workers_per_gather = 10;
SET
postgres=# explain analyze select * from test order by v1, v2 limit 10000000;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=431168.44..1628498.09 rows=9999860 width=16) (actual time=1525.120..13273.805 rows=10000000 loops=1)
   ->  Gather Merge  (cost=431168.44..1628498.09 rows=9999860 width=16) (actual time=1525.119..12650.621 rows=10000000 loops=1)
         Workers Planned: 4
         Workers Launched: 4
         ->  Sort  (cost=430168.39..436418.30 rows=2499965 width=16) (actual time=1472.799..2133.571 rows=2000000 loops=5)
               Sort Key: v1, v2
               Sort Method: external merge  Disk: 50336kB
               ->  Parallel Seq Scan on test  (cost=0.00..79054.65 rows=2499965 width=16) (actual time=0.047..201.405 rows=2000000 loops=5)
 Planning time: 0.077 ms
 Execution time: 13622.319 ms
(10 rows)
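As an aside, when comparing worker counts it can help to pin the planner's choice per table rather than raising the GUC globally; the parallel_workers reloption does that (this is just a testing convenience I use, not part of the patch under discussion):

```sql
-- Pin the planner to 2 workers for this table, rerun the query, then restore.
ALTER TABLE test SET (parallel_workers = 2);
-- explain analyze select * from test order by v1, v2 limit 10000000;
ALTER TABLE test RESET (parallel_workers);
```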



-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

