Re: [HACKERS] Parallel Aggregate

James Sewell Sun, 13 Mar 2016 18:53:21 -0700

Hi,

Happy to test, really looking forward to seeing this stuff in core.


The explain analyze is below:

Finalize HashAggregate  (cost=810142.42..810882.62 rows=59216 width=16)
(actual time=2282.092..2282.202 rows=15 loops=1)
   Group Key: (date_trunc('DAY'::text, pageview_start_tstamp))
   ->  Gather  (cost=765878.46..808069.86 rows=414512 width=16) (actual
time=2281.749..2282.060 rows=105 loops=1)
         Number of Workers: 6
         ->  Partial HashAggregate  (cost=764878.46..765618.66 rows=59216
width=16) (actual time=2276.879..2277.030 rows=15 loops=7)
               Group Key: date_trunc('DAY'::text, pageview_start_tstamp)
               ->  Parallel Seq Scan on celebrus_fact_agg_1_p2015_12
 (cost=0.00..743769.76 rows=4221741 width=12) (actual time=0.066..1631
.650 rows=3618887 loops=7)

One question - how is the upper limit of workers chosen?


James Sewell,
Solutions Architect
______________________________________


Level 2, 50 Queen St, Melbourne VIC 3000

*P *(+61) 3 8370 8000  *W* www.lisasoft.com  *F *(+61) 3 8370 8099


On Mon, Mar 14, 2016 at 12:30 PM, David Rowley <david.row...@2ndquadrant.com
> wrote:

> On 14 March 2016 at 14:16, James Sewell <james.sew...@lisasoft.com> wrote:
>
>> I've done some testing with one of my data sets in an 8VPU virtual
>> environment and this is looking really, really good.
>>
>> My test query is:
>>
>> SELECT pageview, sum(pageview_count)
>> FROM fact_agg_2015_12
>> GROUP BY date_trunc('DAY'::text, pageview);
>>
>> The query returns 15 rows. The fact_agg table is 5398MB and holds around
>> 25 million records.
>>
>> Explain with a max_parallel_degree of 8 tells me that the query will
>> only use 6 background workers. I have no indexes on the table currently.
>>
>> Finalize HashAggregate  (cost=810142.42..810882.62 rows=59216 width=16)
>>    Group Key: (date_trunc('DAY'::text, pageview))
>>    ->  Gather  (cost=765878.46..808069.86 rows=414512 width=16)
>>          Number of Workers: 6
>>          ->  Partial HashAggregate  (cost=764878.46..765618.66 rows=59216
>> width=16)
>>                Group Key: date_trunc('DAY'::text, pageview)
>>                ->  Parallel Seq Scan on fact_agg_2015_12
>>  (cost=0.00..743769.76 rows=4221741 width=12)
>>
>
> Great! Thanks for testing this.
>
> If you run EXPLAIN ANALYZE on this with the 6 workers, does the actual
> number of Gather rows come out at 105? I'd just like to get an idea of my
> cost estimate for the Gather are going to be accurate for real world data
> sets.
>
>
> --
>  David Rowley                   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>

-- 


------------------------------
The contents of this email are confidential and may be subject to legal or 
professional privilege and copyright. No representation is made that this 
email is free of viruses or other defects. If you have received this 
communication in error, you may not copy or distribute any part of it or 
otherwise disclose its contents to anyone. Please advise the sender of your 
incorrect receipt of this correspondence.

Re: [HACKERS] Parallel Aggregate

Reply via email to