On Tue, Mar 27, 2018 at 12:28 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> David Rowley <david.row...@2ndquadrant.com> writes:
> > On 27 March 2018 at 09:27, Tom Lane <t...@sss.pgh.pa.us> wrote:
> >> I do not think it is accidental that these aggregates are exactly the
> >> ones that do not have parallelism support today.  Rather, that's because you
> >> just about always have an interest in the order in which the inputs get
> >> aggregated, which is something that parallel aggregation cannot support.
>
> > This very much reminds me of something that exists in the 8.4 release notes:
> >> SELECT DISTINCT and UNION/INTERSECT/EXCEPT no longer always produce sorted output (Tom)
>
> That's a completely false analogy: we got a significant performance
> benefit for a significant fraction of users by supporting hashed
> aggregation.  My argument here is that only a very tiny fraction of
> string_agg/array_agg users will not care about aggregation order, and thus
> I don't believe that this patch can help very many people.  Against that,
> it's likely to hurt other people, by breaking their queries and forcing
> them to insert expensive explicit sorts to fix it.  Even discounting the
> backwards-compatibility question, we don't normally adopt performance
> features for which it's unclear that the net gain over all users is
> positive.
>

I think you are quite wrong in claiming that only a tiny fraction of
users will not care about the aggregation order.

This may, and quite probably does, hold true for string_agg(), but not for
array_agg(). I see a lot of cases where people use it to load the result into
an unordered array/hashmap/set/whatever on the client side, which loses
ordering *anyway*, and they would definitely benefit from this patch.
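
To make that concrete, here is a minimal sketch in Python of the client-side
pattern I mean. The query, table, column names and values are all made up for
illustration, and the two dictionaries simply stand in for what a driver might
return from a serial and a parallel plan.

```python
# Hypothetical rows as a client might receive them from something like
#   SELECT dept, array_agg(emp_id) FROM staff GROUP BY dept;
# The table, columns and values are invented for this sketch.
serial_result = {"sales": [3, 1, 2], "ops": [7, 5]}
parallel_result = {"sales": [2, 3, 1], "ops": [5, 7]}  # same elements, different order


def to_sets(rows):
    """Load each aggregated array into an unordered set, discarding order."""
    return {key: set(values) for key, values in rows.items()}


# Once the arrays are loaded into sets, the two results are indistinguishable.
same_after_load = to_sets(serial_result) == to_sets(parallel_result)
print(same_after_load)
```

A client written this way never observes the order in which array_agg()
collected its inputs, so it has nothing to lose if that order changes.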


-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/
