On Friday, October 23, 2015, David Osborne <da...@qcode.co.uk> wrote:
> Hi, > > Wondering if anyone could suggest how we could improve the performance of > this type of query? > The intensive part is the summing of integer arrays as far as I can see. > We're thinking there's not much we can do to improve performance apart > from throw more CPU at it... would love to be proven wrong though! > > > *Query:* > > explain (analyse,buffers) > select > sum(s2.array_a),sum(s2.array_b) > from mytable s1 left join mytable s2 > on s1.code=s2.code and s1.buyer=s2.seller and s2.seller='XX' > where s1.buyer='XX' > group by s1.buyer,s1.code > ; > > > *Depesz Explain Link:* > > http://explain.depesz.com/s/m3XP > > > QUERY PLAN > > > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ > GroupAggregate (cost=275573.49..336223.36 rows=2547 width=524) (actual > time=1059.340..22946.772 rows=22730 loops=1) > Buffers: shared hit=113596 read=1020 dirtied=15 > -> Merge Left Join (cost=275573.49..278850.09 rows=113560 width=524) > (actual time=1058.773..1728.186 rows=240979 loops=1) > Merge Cond: ((s1.code)::text = (s2.code)::text) > Join Filter: (s1.buyer = (s2.seller)::bpchar) > Buffers: shared hit=113596 read=1020 dirtied=15 > -> Index Only Scan using mytable_buyer_idx on mytable s1 > (cost=0.42..1226.06 rows=25465 width=12) (actual time=0.015..35.790 > rows=22730 loops=1) > Index Cond: (buyer = 'XX'::bpchar) > Heap Fetches: 3739 > Buffers: shared hit=16805 dirtied=1 > -> Sort (cost=275573.07..275818.33 rows=98106 width=525) > (actual time=1058.736..1141.560 rows=231662 loops=1) > Sort Key: s2.code > Sort Method: quicksort Memory: 241426kB > Buffers: shared hit=96791 read=1020 dirtied=14 > -> Bitmap Heap Scan on mytable s2 > (cost=12256.28..267439.07 rows=98106 width=525) (actual > time=60.330..325.730 rows=231662 loops=1) > Recheck Cond: ((seller)::text = 'XX'::text) > Filter: ((seller)::bpchar = 'XX'::bpchar) > Buffers: shared hit=96791 read=1020 dirtied=14 > -> Bitmap Index Scan on mytable_seller_idx > (cost=0.00..12231.75 rows=254844 width=0) (actual time=40.474..40.474 > rows=233244 loops=1) > Index Cond: ((seller)::text = 'XX'::text) > Buffers: shared hit=30 read=1020 > Total runtime: 22968.292 ms > (22 rows) > > > > *Table size:* > > => select count(*) from mytable; > count > -------- > 602669 > (1 row) > > > *Array types:* > > # select array_a,array_b from mytable limit 1; > array_a | array_b > ---------------------------+--------------------------- > {0,0,0,0,0,0,0,0,0,0,0,0} | {0,0,0,0,0,0,0,0,0,0,0,0} > > > *Example schema:* > > # \d mytable > Table "public.mytable" > Column | Type | Modifiers > -------------------+-----------------------+------------------------ > buyer | character(2) | not null > code | character varying(20) | not null > seller | character varying(50) | > array_a | integer[] | > array_b | integer[] | > Indexes: > "mytable_buyer_code_idx" UNIQUE, btree (buyer, code) CLUSTER > "mytable_buyer_idx" btree (buyer) > "mytable_code_idx" btree (code) > "mytable_seller_idx" btree (seller) > > > *Version:* > > > SELECT version() ; > version > > > -------------------------------------------------------------------------------------------------------------- > PostgreSQL 9.3.6 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.6.3 > 20120306 (Red Hat 4.6.3-2), 64-bit > (1 row) > > This is running on an AWS RDS instance. > > Thanks for any pointers > -- > David > What's physical memory and setting of work_mem? merlin