I am trying to optimize a query that does a lot of aggregation. I have a large number of columns that are part of the result, and most are aggregates. They are acting on two temporary tables, the largest of which should have at most 1 million tuples, and the smaller around 5000; the the smaller table matches the number of rows expecting in the result. I've played around with some indexes on the temp tables, and analyzing them; even using a vacuum analyze, and the worst part is always a groupAggregate.
This query can be optimized at the expense of other operations; it will be run during low usage hours. I have tried to bump up sort_mem to get the query optimizer to cosider a HashAggregate instread of a groupAggregate; setting it as high as 2 gigs still had the query optimizer using GroupAggregate. The troublesome query is: select tempItems.category_id, date('2003-11-22'), sum(a) as a, count(*) as b, sum(case when type = 1 then 0 else someNumber end) as successful, sum(c) as c, ........ ........ tempAggregates.mode as mode -variations of the above repeated around 30 times, with a few other aggregates like min and max making an appearance, and some array stuff from tempItems join tempAggregates using (category_id) group by tempItems.category_id, mode I've tried just grouping by category_id, and doing a max(mode), but that doesn't have much of an effect on performance; although row estimation for the group aggregate was better. A lot is being done, so maybe I can't get it to be much more efficient... Here's the output of an explain analyze: QUERY PLAN ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- --- GroupAggregate (cost=0.00..338300.34 rows=884 width=345) (actual time=86943.272..382718.104 rows=3117 loops=1) -> Merge Join (cost=0.00..93642.52 rows=1135610 width=345) (actual time=0.148..24006.748 rows=1120974 loops=1) Merge Cond: ("outer".category_id = "inner".category_id) -> Index Scan using tempaggregatesindex on tempaggregates (cost=0.00..91.31 rows=3119 width=115) (actual time=0.055..6.573 rows=3117 loops=1) -> Index Scan using tempitemsindex on tempitems (cost=0.00..79348.45 rows=1135610 width=241) (actual time=0.064..7511.980 rows=1121164 loops=1) Total runtime: 382725.502 ms (6 rows) Any thoughts or suggestions would be appreciated. -Adam Palmblad ---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings