Hi, Wondering if anyone could suggest how we could improve the performance of this type of query? The intensive part is the summing of integer arrays as far as I can see. We're thinking there's not much we can do to improve performance apart from throw more CPU at it... would love to be proven wrong though!
*Query:* explain (analyse,buffers) select sum(s2.array_a),sum(s2.array_b) from mytable s1 left join mytable s2 on s1.code=s2.code and s1.buyer=s2.seller and s2.seller='XX' where s1.buyer='XX' group by s1.buyer,s1.code ; *Depesz Explain Link:* http://explain.depesz.com/s/m3XP QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ GroupAggregate (cost=275573.49..336223.36 rows=2547 width=524) (actual time=1059.340..22946.772 rows=22730 loops=1) Buffers: shared hit=113596 read=1020 dirtied=15 -> Merge Left Join (cost=275573.49..278850.09 rows=113560 width=524) (actual time=1058.773..1728.186 rows=240979 loops=1) Merge Cond: ((s1.code)::text = (s2.code)::text) Join Filter: (s1.buyer = (s2.seller)::bpchar) Buffers: shared hit=113596 read=1020 dirtied=15 -> Index Only Scan using mytable_buyer_idx on mytable s1 (cost=0.42..1226.06 rows=25465 width=12) (actual time=0.015..35.790 rows=22730 loops=1) Index Cond: (buyer = 'XX'::bpchar) Heap Fetches: 3739 Buffers: shared hit=16805 dirtied=1 -> Sort (cost=275573.07..275818.33 rows=98106 width=525) (actual time=1058.736..1141.560 rows=231662 loops=1) Sort Key: s2.code Sort Method: quicksort Memory: 241426kB Buffers: shared hit=96791 read=1020 dirtied=14 -> Bitmap Heap Scan on mytable s2 (cost=12256.28..267439.07 rows=98106 width=525) (actual time=60.330..325.730 rows=231662 loops=1) Recheck Cond: ((seller)::text = 'XX'::text) Filter: ((seller)::bpchar = 'XX'::bpchar) Buffers: shared hit=96791 read=1020 dirtied=14 -> Bitmap Index Scan on mytable_seller_idx (cost=0.00..12231.75 rows=254844 width=0) (actual time=40.474..40.474 rows=233244 loops=1) Index Cond: ((seller)::text = 'XX'::text) Buffers: shared hit=30 read=1020 Total runtime: 22968.292 ms (22 rows) *Table size:* => select count(*) from mytable; count -------- 602669 (1 row) *Array types:* # select array_a,array_b from mytable limit 1; array_a | array_b ---------------------------+--------------------------- {0,0,0,0,0,0,0,0,0,0,0,0} | {0,0,0,0,0,0,0,0,0,0,0,0} *Example schema:* # \d mytable Table "public.mytable" Column | Type | Modifiers -------------------+-----------------------+------------------------ buyer | character(2) | not null code | character varying(20) | not null seller | character varying(50) | array_a | integer[] | array_b | integer[] | Indexes: "mytable_buyer_code_idx" UNIQUE, btree (buyer, code) CLUSTER "mytable_buyer_idx" btree (buyer) "mytable_code_idx" btree (code) "mytable_seller_idx" btree (seller) *Version:* > SELECT version() ; version -------------------------------------------------------------------------------------------------------------- PostgreSQL 9.3.6 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.6.3 20120306 (Red Hat 4.6.3-2), 64-bit (1 row) This is running on an AWS RDS instance. Thanks for any pointers -- David