On 12 October 2015 at 15:07, Haribabu Kommi <kommi.harib...@gmail.com> wrote:
> Parallel aggregate is a feature that does the aggregation job in
> parallel with the help of Gather and partial seq scan nodes. The
> following is a basic overview of the parallel aggregate changes.
> Decision phase:
> Based on the following conditions, the parallel aggregate plan is
> chosen:
> - Check whether the plan node below is Gather + partial seq scan only.
> This is to check whether the plan nodes that are present are aware of
> parallelism.
> - Check whether any projection or qual condition is present in the
> Gather node.
> If there are any quals or projection info that must be performed in
> the Gather node because of a function that can only be executed in
> the master backend, the parallel aggregate plan is not chosen.
> - Check whether the aggregate supports parallelism.
> For the first patch, I thought of supporting only some aggregates for
> parallel aggregate.
> The supported aggregates are mainly the aggregate functions that have
> variable length data types as final and transition types. This is to
> avoid changing the target list return types. Because of variable
> lengths, even the transition type can be returned to the backend
> without applying the final function in the aggregate. To identify the
> supported aggregates for parallelism, a new member is added to the
> pg_aggregate system catalog table.
> - Currently only Group and plain aggregates are supported, for
> simplicity.
> This patch doesn't change anything in the aggregate plan decision. If
> the planner decides on a group or plain aggregate as the best plan,
> then we check whether it can be converted into a parallel aggregate.
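To check that I'm reading the decision phase correctly, here is a rough sketch of those checks in Python. It is only an illustration of the conditions listed above, not code from the patch: PlanNode, Aggregate, the parallel_ok flag (standing in for the proposed new pg_aggregate member) and can_use_parallel_aggregate are all names I've made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlanNode:
    """Hypothetical stand-in for a plan node; not a PostgreSQL structure."""
    kind: str                          # e.g. "Gather", "PartialSeqScan"
    child: Optional["PlanNode"] = None
    has_quals: bool = False
    has_projection: bool = False


@dataclass
class Aggregate:
    """Hypothetical aggregate descriptor."""
    name: str
    parallel_ok: bool                  # assumed flag for the new pg_aggregate member


def can_use_parallel_aggregate(strategy: str, below: PlanNode,
                               aggs: List[Aggregate]) -> bool:
    # Only Group and plain aggregates are considered, for simplicity.
    if strategy not in ("plain", "group"):
        return False
    # The plan below must be exactly Gather + partial seq scan.
    scan = below.child
    if below.kind != "Gather" or scan is None:
        return False
    if scan.kind != "PartialSeqScan" or scan.child is not None:
        return False
    # Quals or projection that must run in the Gather node rule the plan out.
    if below.has_quals or below.has_projection:
        return False
    # Every aggregate in the query must be marked as supporting parallelism.
    return all(agg.parallel_ok for agg in aggs)
```

Note that the checks only ever veto a plan the planner has already picked, which matches the statement that the aggregate plan decision itself is left unchanged.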
I've never previously proposed any implementation for parallel aggregation,
but I have previously proposed infrastructure to allow aggregation to
happen in multiple steps. Your plan sounds very different from what I've
proposed.
I attempted to convey my idea on this to the community here
for which Simon and I proposed an actual proof of concept patch here
I've since expanded on that work in the form of a WIP patch which
implements GROUP BY before JOIN here
It's pretty evident that we both need to align the way we plan to handle
this multiple step aggregation. There's no sense at all in having 2
different ways of doing this. Perhaps you could look over my patch and let
me know the parts which you disagree with, then we can resolve these
together and come up with the best solution for each of us.
It may also be useful for you to glance at how Postgres-XL handles this
partial aggregation problem. Where possible, it partially aggregates the
results on each node and passes the partially aggregated states to the
master node, which then performs the final aggregate stage over the
individual aggregate states from each node. Note that this requires giving
the aggregates with internal aggregate states an SQL level type, and it also
means implementing an input and output function for these types. I've
noticed that XL mostly handles this by making the output function build a
string something along the lines of <count>:<sum> for aggregates such as
AVG(). I believe you'll need something very similar to this to pass the
partial states between the worker and master processes.
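To make the <count>:<sum> idea concrete, here is a minimal sketch in Python of a multiple step AVG(). It is not XL's code or anything from either patch. The function names (partial_avg, state_out, state_in, combine, final_avg) are made up for illustration, and I've assumed integer inputs so the text form round-trips exactly.

```python
from typing import Iterable, Optional, Tuple

State = Tuple[int, int]  # (count, sum); integer inputs assumed for simplicity


def partial_avg(values: Iterable[int]) -> State:
    """Worker side: accumulate a partial state without finalizing it."""
    count, total = 0, 0
    for v in values:
        count += 1
        total += v
    return (count, total)


def state_out(state: State) -> str:
    """Output function: render the state as '<count>:<sum>'."""
    return f"{state[0]}:{state[1]}"


def state_in(text: str) -> State:
    """Input function: parse '<count>:<sum>' back into a state."""
    count_text, sum_text = text.split(":", 1)
    return (int(count_text), int(sum_text))


def combine(states: Iterable[State]) -> State:
    """Master side: merge the partial states from all workers."""
    count, total = 0, 0
    for c, s in states:
        count += c
        total += s
    return (count, total)


def final_avg(state: State) -> Optional[float]:
    """Final function: AVG over no rows is NULL, modelled here as None."""
    count, total = state
    return total / count if count else None


# Three workers each aggregate their own chunk and ship a text state.
shipped = [state_out(partial_avg(chunk)) for chunk in ([1, 2, 3], [4, 5], [6])]
print(shipped)                                        # ['3:6', '2:9', '1:6']
print(final_avg(combine(map(state_in, shipped))))     # 3.5
```

The point is that the workers never apply the final function. Only the master does, after combining, which is why the state needs a type that can be passed between processes.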
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services