2014-08-27 21:41 GMT+02:00 Merlin Moncure <mmonc...@gmail.com>:

> On Wed, Aug 27, 2014 at 2:07 PM, Pavel Stehule <pavel.steh...@gmail.com>
> wrote:
> > Hi
> >
> > one user asked about using a partitioning for faster aggregates queries.
> >
> > I found so there is not any optimization.
> >
> > create table x1(a int, d date);
> > create table x_1 ( check(d = '2014-01-01'::date)) inherits(x1);
> > create table x_2 ( check(d = '2014-01-02'::date)) inherits(x1);
> > create table x_3 ( check(d = '2014-01-03'::date)) inherits(x1);
> >
> > When I have this schema, then optimizer try to do
> >
> > postgres=# explain verbose select max(a) from x1 group by d order by d;
> >                                    QUERY PLAN
> >
> --------------------------------------------------------------------------------
> >  GroupAggregate  (cost=684.79..750.99 rows=200 width=8)
> >    Output: max(x1.a), x1.d
> >    Group Key: x1.d
> >    ->  Sort  (cost=684.79..706.19 rows=8561 width=8)
> >          Output: x1.d, x1.a
> >          Sort Key: x1.d
> >          ->  Append  (cost=0.00..125.60 rows=8561 width=8)
> >                ->  Seq Scan on public.x1  (cost=0.00..0.00 rows=1
> width=8)
> >                      Output: x1.d, x1.a
> >                ->  Seq Scan on public.x_1  (cost=0.00..31.40 rows=2140
> > width=8)
> >                      Output: x_1.d, x_1.a
> >                ->  Seq Scan on public.x_2  (cost=0.00..31.40 rows=2140
> > width=8)
> >                      Output: x_2.d, x_2.a
> >                ->  Seq Scan on public.x_3  (cost=0.00..31.40 rows=2140
> > width=8)
> >                      Output: x_3.d, x_3.a
> >                ->  Seq Scan on public.x_4  (cost=0.00..31.40 rows=2140
> > width=8)
> >                      Output: x_4.d, x_4.a
> >  Planning time: 0.333 ms
> >
> > It can be reduced to:
> >
> >   sort by d
> >   Append
> >        Aggegate (a), d
> >          seq scan from x_1
> >        Aggregate (a), d
> >          seq scan from x_2
> >
> > Are there some plans to use partitioning for aggregation?
>
> Besides min/max, what other aggregates (mean/stddev come to mind)
> would you optimize and how would you determine which ones could be?
> Where is that decision made?
>

I am thinking so all aggregates are possible

when you have a partitions by column X -- then you have a natural sets by X,

so you can directly calculate any aggregates on any column when GROUP BY
clause is a "GROUP BY X"

isn't it?

probably some similar optimizations are possible when you have "GROUP BY
X,Y" -- minimally you have more sets, and you can do aggregations on
smaller sets.

Pavel


>
> For example, could user defined aggregates be pushed down if you had a
> reaggregation routine broken out from the main one?
>
> merlin
>

Reply via email to