>>>>> "Mark" == Mark Dilger <hornschnor...@gmail.com> writes:
Mark> Is there a performance test case where this patch should shine Mark> brightest? I'd like to load a schema with lots of data, and run Mark> a grouping sets query, both before and after applying the patch, Mark> to see what the performance advantage is. The area which I think is most important for performance is the handling of small cubes; without this patch, a 2d cube needs 2 full sorts, a 3d one needs 3, and a 4d one needs 6. In many real-world data sets these would all hash entirely in memory. So here's a very simple example (deliberately using integers for grouping to minimize the advantage; grouping by text columns in a non-C locale would show a much greater speedup for the patch): create table sales ( id serial, product_id integer, store_id integer, customer_id integer, qty integer); -- random integer function create function d(integer) returns integer language sql as $f$ select floor(random()*$1)::integer + 1; $f$; -- 10 million random rows insert into sales (product_id,store_id,customer_id,qty) select d(20), d(6), d(10), d(100) from generate_series(1,10000000); -- example 2d cube: select product_id, store_id, count(*), sum(qty) from sales group by cube(product_id, store_id); -- example 3d cube: select product_id, store_id, customer_id, count(*), sum(qty) from sales group by cube(product_id, store_id, customer_id); -- example 4d cube with a computed column: select product_id, store_id, customer_id, (qty / 10), count(*), sum(qty) from sales group by cube(product_id, store_id, customer_id, (qty / 10)); On my machine, the 2d cube is about 3.6 seconds with the patch, and about 8 seconds without it; the 4d is about 18 seconds with the patch and about 32 seconds without it (all with work_mem=1GB, compiled with -O2 and assertions off). -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers