On Tue, 31 Mar 2026 at 22:29, Scott Carey <[email protected]> wrote:
> A simple aggregate using array_agg gets significantly faster the
> smaller (work_mem * hash_mem_multiplier) is, with the same simple query
> plan: HashAggregate over a sequential scan. Changing to a simple
> aggregate such as max() does not show this behavior and is always fast.
> Switching to another aggregate that grows in size for each element, such as
> json_agg or string_agg, also does not show this behavior. If I add an order
> by clause inside array_agg, performance improves significantly as the plan
> changes from a HashAggregate over a sequential scan to a GroupAggregate over
> a sort over a sequential scan. Something seems specifically broken with
> array_agg + HashAggregate.
>
> These queries are anywhere from 10x to 1000x slower on Postgres 17.9 than
> they were on Postgres 12.19 on production data. Some of our OLTP queries
> have gone from taking minutes to taking 6 hours to complete. I do not know
> if this happens on Postgres 18, but I can confirm it also happens on
> Postgres 16.8. I do not know about 13 through 15.

I tried and failed to recreate this locally on 17.9. For me, the
json_agg query is slower than the array_agg() one. I tried making the
table 10x bigger and still don't see the same issue. The run with more
work_mem and fewer batches is always faster for me.
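
For reference, here's roughly the setup I tested with. The schema and
data distribution are my guesses, reconstructed from your queries and
the row counts in your EXPLAIN output (200000 rows, ~50000 distinct
product_id values):

-- assumed schema; width=16 in your plans suggests two 8-byte columns
create table array_agg_test (product_id bigint, region_id bigint);
insert into array_agg_test
select i % 50000, i % 10 from generate_series(1, 200000) i;
analyze array_agg_test;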

Is the machine under a lot of memory pressure and swapping pages to
disk? If so, you may need to consider running with a lower work_mem
setting. How much RAM is installed in this machine?

> set hash_mem_multiplier = 2;
> set work_mem = '100MB';
>
> explain (analyze, buffers) select product_id, array_agg(region_id) from 
> array_agg_test group by product_id;
>                                                         QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------
> HashAggregate  (cost=4274.00..4771.49 rows=49749 width=40) (actual 
> time=4628.278..4643.765 rows=50000 loops=1)
>   Group Key: product_id
>   Batches: 1  Memory Usage: 55649kB
>   Buffers: shared hit=1274
>   ->  Seq Scan on array_agg_test  (cost=0.00..3274.00 rows=200000 width=16) 
> (actual time=0.030..16.694 rows=200000 loops=1)
>         Buffers: shared hit=1274
> Planning Time: 0.067 ms
> Execution Time: 4648.698 ms

> Below, note json_agg does not have this problem:
>
> set hash_mem_multiplier = 2;
> set work_mem = '500MB';
>
> explain (analyze, buffers) select product_id, json_agg(region_id) from 
> array_agg_test group by product_id;
>                                                         QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------
> HashAggregate  (cost=4274.00..4895.86 rows=49749 width=40) (actual 
> time=110.975..122.781 rows=50000 loops=1)
>   Group Key: product_id
>   Batches: 1  Memory Usage: 67089kB
>   Buffers: shared read=1274
>   ->  Seq Scan on array_agg_test  (cost=0.00..3274.00 rows=200000 width=16) 
> (actual time=0.034..17.659 rows=200000 loops=1)
>         Buffers: shared read=1274
> Planning:
>   Buffers: shared hit=10
> Planning Time: 0.054 ms
> Execution Time: 124.929 ms

What changed here apart from the aggregate function? Why are the
buffers being read from disk on this run when they were hit in shared
buffers on the previous one? Same machine? Was there a restart in
between?
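
To rule out caching effects, it would be worth putting both aggregates
on equal footing: prewarm the table, then run both queries under a
single set of GUC values (pg_prewarm ships in contrib; the rest just
repeats your queries with one work_mem setting):

create extension if not exists pg_prewarm;
select pg_prewarm('array_agg_test');

set hash_mem_multiplier = 2;
set work_mem = '100MB';
explain (analyze, buffers)
select product_id, array_agg(region_id) from array_agg_test group by product_id;
explain (analyze, buffers)
select product_id, json_agg(region_id) from array_agg_test group by product_id;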

json_agg allocates slightly more memory per aggregate state than
array_agg; you can see that in the reported HashAggregate memory usage.
I'd also expect the per-row transition function calls of array_agg()
and json_agg() not to differ very much in cost, so it very much feels
like something else is going on here.
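
If you want to compare the two aggregates with everything else held
constant, a rough psql harness is below. This is just a sketch: the
OFFSET is larger than the number of groups, so the result set is
discarded and client transfer time stays out of the numbers while the
aggregates still run.

\timing on
select product_id, array_agg(region_id)
from array_agg_test group by product_id offset 1000000;
select product_id, json_agg(region_id)
from array_agg_test group by product_id offset 1000000;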

David

