Hi Arnt & all

I digged into this a little further. Here are my oberservations:

The query in question contains a 2.8M =ANY(ARRAY[....]) clause, it takes here about 10m to complete. Result Set is about 360k rows.

We are still on Postgresql 9.1. (Debian Wheezy), may be that is the reason I reckoned. But after setting some postgresql.conf settings (work_mem & friends, thx Axel) I still got bad query times.

Query looks like this:

"select mm.uid, mm.modseq, mm.message from mailbox_messages mm where mm.mailbox=441 and not mm.deleted and mm.uid=any('{...}')".

I tried optimizing this by adding an index:
    "martin_mailbox_deleted" btree (mailbox, uid) WHERE NOT deleted
not much improvement.

EXPLAIN ANALYZE still looked bad:

---
Index Scan using martin_mailbox on mailbox_messages mm (cost=0.00..243326004.21 rows=386398 width=16) (actual time=2.161..1050004.868 rows=397336 loops=1)
   Index Cond: (mailbox = 441)
Filter: ((NOT deleted) AND (uid = ANY ('{16480,16481,... add your 2.8MB here}'))
 Total runtime: 1050230.858 ms  <- SEE THIS?
---

After an optimization I found mentioned here: https://www.datadoghq.com/2013/08/100x-faster-postgres-performance-by-changing-1-line/, I changed the query to look like this:

"select mm.uid, mm.modseq, mm.message from mailbox_messages mm where mm.mailbox=441 and not mm.deleted and mm.uid=any(VALUES(16480),(16481)..)"

Apparently the VALUES trick builds a temp table which make the query a lot easier to digest for Postgres.

I got much better times (236x faster!):

---
Nested Loop (cost=6621.38..6771.33 rows=194822 width=16) (actual time=616.886..4312.135 rows=397336 loops=1) -> HashAggregate (cost=6621.38..6623.38 rows=200 width=4) (actual time=616.693..1010.668 rows=441425 loops=1) -> Values Scan on "*VALUES*" (cost=0.00..5517.81 rows=441425 width=4) (actual time=0.006..214.002 rows=441425 loops=1) -> Index Scan using martin_mailbox_deleted on mailbox_messages mm (cost=0.00..0.69 rows=4 width=16) (actual time=0.006..0.006 rows=1 loops=441425)
         Index Cond: ((mailbox = 441) AND (uid = "*VALUES*".column1))
 Total runtime: 4446.153 ms
---

I think this easy optimization will do AOX overall good, I see similar queries all the time at our server, averaging at about 2000ms.

A patch would be very much welcome :-)


--
Freundliche Grüße / Best Regards

Martin Rode


P.S.: And yes Arnt, Apple Mail on my iPad started that query ;-)

Reply via email to