2014-08-27 21:41 GMT+02:00 Merlin Moncure <mmonc...@gmail.com>: > On Wed, Aug 27, 2014 at 2:07 PM, Pavel Stehule <pavel.steh...@gmail.com> > wrote: > > Hi > > > > one user asked about using a partitioning for faster aggregates queries. > > > > I found so there is not any optimization. > > > > create table x1(a int, d date); > > create table x_1 ( check(d = '2014-01-01'::date)) inherits(x1); > > create table x_2 ( check(d = '2014-01-02'::date)) inherits(x1); > > create table x_3 ( check(d = '2014-01-03'::date)) inherits(x1); > > > > When I have this schema, then optimizer try to do > > > > postgres=# explain verbose select max(a) from x1 group by d order by d; > > QUERY PLAN > > > -------------------------------------------------------------------------------- > > GroupAggregate (cost=684.79..750.99 rows=200 width=8) > > Output: max(x1.a), x1.d > > Group Key: x1.d > > -> Sort (cost=684.79..706.19 rows=8561 width=8) > > Output: x1.d, x1.a > > Sort Key: x1.d > > -> Append (cost=0.00..125.60 rows=8561 width=8) > > -> Seq Scan on public.x1 (cost=0.00..0.00 rows=1 > width=8) > > Output: x1.d, x1.a > > -> Seq Scan on public.x_1 (cost=0.00..31.40 rows=2140 > > width=8) > > Output: x_1.d, x_1.a > > -> Seq Scan on public.x_2 (cost=0.00..31.40 rows=2140 > > width=8) > > Output: x_2.d, x_2.a > > -> Seq Scan on public.x_3 (cost=0.00..31.40 rows=2140 > > width=8) > > Output: x_3.d, x_3.a > > -> Seq Scan on public.x_4 (cost=0.00..31.40 rows=2140 > > width=8) > > Output: x_4.d, x_4.a > > Planning time: 0.333 ms > > > > It can be reduced to: > > > > sort by d > > Append > > Aggegate (a), d > > seq scan from x_1 > > Aggregate (a), d > > seq scan from x_2 > > > > Are there some plans to use partitioning for aggregation? > > Besides min/max, what other aggregates (mean/stddev come to mind) > would you optimize and how would you determine which ones could be? > Where is that decision made? >
I am thinking so all aggregates are possible when you have a partitions by column X -- then you have a natural sets by X, so you can directly calculate any aggregates on any column when GROUP BY clause is a "GROUP BY X" isn't it? probably some similar optimizations are possible when you have "GROUP BY X,Y" -- minimally you have more sets, and you can do aggregations on smaller sets. Pavel > > For example, could user defined aggregates be pushed down if you had a > reaggregation routine broken out from the main one? > > merlin >