Hi Arnt & all
I digged into this a little further. Here are my oberservations:
The query in question contains a 2.8M =ANY(ARRAY[....]) clause, it takes
here about 10m to complete. Result Set is about 360k rows.
We are still on Postgresql 9.1. (Debian Wheezy), may be that is the
reason I reckoned. But after setting some postgresql.conf settings
(work_mem & friends, thx Axel) I still got bad query times.
Query looks like this:
"select mm.uid, mm.modseq, mm.message from mailbox_messages mm where
mm.mailbox=441 and not mm.deleted and mm.uid=any('{...}')".
I tried optimizing this by adding an index:
"martin_mailbox_deleted" btree (mailbox, uid) WHERE NOT deleted
not much improvement.
EXPLAIN ANALYZE still looked bad:
---
Index Scan using martin_mailbox on mailbox_messages mm
(cost=0.00..243326004.21 rows=386398 width=16) (actual
time=2.161..1050004.868 rows=397336 loops=1)
Index Cond: (mailbox = 441)
Filter: ((NOT deleted) AND (uid = ANY ('{16480,16481,... add your
2.8MB here}'))
Total runtime: 1050230.858 ms <- SEE THIS?
---
After an optimization I found mentioned here:
https://www.datadoghq.com/2013/08/100x-faster-postgres-performance-by-changing-1-line/,
I changed the query to look like this:
"select mm.uid, mm.modseq, mm.message from mailbox_messages mm where
mm.mailbox=441 and not mm.deleted and mm.uid=any(VALUES(16480),(16481)..)"
Apparently the VALUES trick builds a temp table which make the query a
lot easier to digest for Postgres.
I got much better times (236x faster!):
---
Nested Loop (cost=6621.38..6771.33 rows=194822 width=16) (actual
time=616.886..4312.135 rows=397336 loops=1)
-> HashAggregate (cost=6621.38..6623.38 rows=200 width=4) (actual
time=616.693..1010.668 rows=441425 loops=1)
-> Values Scan on "*VALUES*" (cost=0.00..5517.81 rows=441425
width=4) (actual time=0.006..214.002 rows=441425 loops=1)
-> Index Scan using martin_mailbox_deleted on mailbox_messages mm
(cost=0.00..0.69 rows=4 width=16) (actual time=0.006..0.006 rows=1
loops=441425)
Index Cond: ((mailbox = 441) AND (uid = "*VALUES*".column1))
Total runtime: 4446.153 ms
---
I think this easy optimization will do AOX overall good, I see similar
queries all the time at our server, averaging at about 2000ms.
A patch would be very much welcome :-)
--
Freundliche Grüße / Best Regards
Martin Rode
P.S.: And yes Arnt, Apple Mail on my iPad started that query ;-)