Re: [HACKERS] Re: Abbreviated keys for Datum tuplesort

Tomas Vondra Fri, 20 Feb 2015 12:58:17 -0800

On 25.1.2015 12:15, Andrew Gierth wrote:
>
> So given some suitable test data, such as
> 
> create table stuff as select random()::text as randtext
>   from generate_series(1,1000000);  -- or however many rows
> 
> you can do
> 
> select percentile_disc(0) within group (order by randtext) from stuff;
> 
> or
> 
> select count(distinct randtext) from stuff;
> 
> The performance improvements I saw were pretty much exactly as
> expected from the improvement in the ORDER BY and CREATE INDEX cases.


I've spent a fair amount of time testing this today, and when using the
simple percentile_disc example mentioned above, I see this pattern:

                                 master   patched   speedup
   ---------------------------------------------------------
    generate_series(1,1000000)      4.2       0.7      6
    generate_series(1,2000000)      9.2       9.8      0.93
    generate_series(1,3000000)     14.5      15.3      0.95


so for a small dataset the speedup is very nice, but for larger sets
there's a ~5-7% slowdown. Is this expected?
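
For anyone wanting to recompute the last column, here is a minimal sketch.
It assumes the speedup is simply master divided by patched (which matches
the first row, 4.2 / 0.7 = 6); the function names are hypothetical and the
timings are copied from the table as reported.

```python
# Timings copied from the table above: rows -> (master, patched).
# Units are whatever the original measurements used.
timings = {
    1_000_000: (4.2, 0.7),
    2_000_000: (9.2, 9.8),
    3_000_000: (14.5, 15.3),
}


def speedup(master, patched):
    """Assumed definition: ratio > 1 means the patched build is faster."""
    return master / patched


def slowdown_pct(master, patched):
    """Extra time taken by patched relative to master, in percent.

    Negative values mean the patched build is faster.
    """
    return (patched - master) / master * 100.0


for rows, (master, patched) in timings.items():
    print(f"{rows:>9} rows: speedup {speedup(master, patched):.2f}, "
          f"time change {slowdown_pct(master, patched):+.1f}%")
```

Expressing the change as a percentage of the master time makes it easier to
compare the two larger cases directly.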


-- 
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


