> On 6/6/17 13:52, Regina Obe wrote:
>> It seems CREATE AGGREGATE was expanded in 9.6 to support
>> parallelization of aggregate functions using transitions, with the
>> addition of serialfunc and deserialfunc to the aggregate definitions.
>> I was looking at the PostgreSQL 10 source code for some example usages
>> of this and was hoping that array_agg and string_agg would support the
> I'm not sure how you would parallelize these, since in most uses you want to
> have a deterministic output order.
> Peter Eisentraut http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Good point. If that's the reason it wasn't done, that's good just wasn't sure.
But if you didn't have an ORDER BY in your aggregate usage, and you did have
those transition functions, it shouldn't be any different from any other use
I imagine you are right that most folks who use array_agg and string_agg
usually combine it with array_agg(... ORDER BY ..)
My main reason for asking is that most of the PostGIS geometry and raster
aggregate functions use transitions and were patterned after array agg.
In the case of PostGIS the sorting is done internally and really only to
expedite take advantage of things like cascaded union algorithms. That is
always done though (so even if each worker does it on just it's batch that's
still better than having only one worker).
So I think it's still very beneficial to break into separate jobs since in the
end the gather, will have say 2 biggish geometries or 2 biggish rasters to
union if you have 2 workers which is still better than having a million
smallish geometries/rasters to union
Parallel agg (internal sort geoms by box) - Union
Parallel Agg (internal sort geoms ) - Union
Gather Union(union, union) internal sort.
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription: