Hello all,
I've been running DBMail for years now and things have gone great. Recently, however, my server was downgraded due to hardware failure and has just never been the same (went from 750GB RAID5 to 250GB RAID1, 4GB RAM to 1GB RAM, two HT processors to one Core2Duo). Obviously that nailed performance so I've been struggling ever since.

A few days ago I got a new server that's a little more buff and performance was significantly better. I moved from some dev subversion snapshot of 2.2 to the latest stable snapshot (2.2.6-rc1) and decided to stop following the subversion releases for a while.

  And now, the meat of the post...

Performance, while better, is still nowhere near as good as I remember it being. I've only got two users on this DBMail installation so I can't understand why things are so slow. I noticed that the following query was running pretty much all the time (with different values in the SQL, of course):

SELECT message_idnr FROM dbmail_messages m JOIN dbmail_physmessage p ON m.physmessage_id=p.id JOIN dbmail_headervalue v ON v.physmessage_id=p.id JOIN dbmail_headername n ON v.headername_id=n.id WHERE mailbox_idnr = 994 AND status IN (0,1) AND headername ILIKE 'MESSAGE-ID' AND headervalue ILIKE '% <[EMAIL PROTECTED]>%' ORDER BY message_idnr;

  This gives me the execution plan of:

Sort (cost=45712.28..45712.29 rows=1 width=8) (actual time=2398.829..2398.829 rows=0 loops=1)
   Sort Key: m.message_idnr
-> Nested Loop (cost=0.00..45712.27 rows=1 width=8) (actual time=2398.773..2398.773 rows=0 loops=1) -> Nested Loop (cost=0.00..45709.24 rows=1 width=16) (actual time=2398.770..2398.770 rows=0 loops=1) -> Nested Loop (cost=0.00..45704.19 rows=1 width=24) (actual time=586.803..2398.698 rows=1 loops=1) -> Seq Scan on dbmail_headervalue v (cost=0.00..45698.28 rows=1 width=16) (actual time=586.729..2398.620 rows=1 loops=1) Filter: (headervalue ~~* '% <[EMAIL PROTECTED]>%'::text) -> Index Scan using dbmail_physmessage_pkey on dbmail_physmessage p (cost=0.00..5.91 rows=1 width=8) (actual time=0.053..0.055 rows=1 loops=1)
                           Index Cond: ("outer".physmessage_id = p.id)
-> Index Scan using dbmail_messages_2 on dbmail_messages m (cost=0.00..5.04 rows=1 width=16) (actual time=0.068..0.068 rows=0 loops=1) Index Cond: ("outer".physmessage_id = m.physmessage_id) Filter: ((mailbox_idnr = 994) AND ((status = 0) OR (status = 1))) -> Index Scan using dbmail_headername_pkey on dbmail_headername n (cost=0.00..3.02 rows=1 width=8) (never executed)
               Index Cond: ("outer".headername_id = n.id)
               Filter: ((headername)::text ~~* 'MESSAGE-ID'::text)
Total runtime: 2398.974 ms

That's quite a long runtime so I started looking into the database and noticed the following things:

1) There is no index on dbmail_headervalue(headervalue)
2) dbmail_headervalue(headervalue) is, as far as I can tell, exactly equal to the search string 3) dbmail_headername(headername) is, as far as I can tell, exactly equal to the search string

  I added functional indexes for headervalue and headername like this:

create index dbmail_headervalue_3 on dbmail_headervalue(lower (headervalue)); create index dbmail_headername_2 on dbmail_headername(lower (headername));

  And rewrote the query like this:

SELECT message_idnr FROM dbmail_messages m JOIN dbmail_physmessage p ON m.physmessage_id=p.id JOIN dbmail_headervalue v ON v.physmessage_id=p.id JOIN dbmail_headername n ON v.headername_id=n.id WHERE mailbox_idnr = 994 AND status IN (0,1) AND lower(headername) = lower('MESSAGE-ID') AND lower(headervalue) = lower ('<[EMAIL PROTECTED]>') ORDER BY message_idnr;

  Which gives me an execution plan like this:

Sort (cost=25607.35..25607.35 rows=1 width=8) (actual time=0.406..0.406 rows=0 loops=1)
   Sort Key: m.message_idnr
-> Nested Loop (cost=22.86..25607.34 rows=1 width=8) (actual time=0.400..0.400 rows=0 loops=1) -> Nested Loop (cost=22.86..25601.42 rows=1 width=24) (actual time=0.398..0.398 rows=0 loops=1) -> Hash Join (cost=22.86..25414.56 rows=37 width=8) (actual time=0.331..0.334 rows=1 loops=1)
                     Hash Cond: ("outer".headername_id = "inner".id)
-> Index Scan using dbmail_headervalue_3 on dbmail_headervalue v (cost=0.00..25358.81 rows=6506 width=16) (actual time=0.051..0.053 rows=1 loops=1) Index Cond: (lower(headervalue) = '<[EMAIL PROTECTED]>'::text) -> Hash (cost=22.84..22.84 rows=7 width=8) (actual time=0.058..0.058 rows=0 loops=1) -> Index Scan using dbmail_headername_2 on dbmail_headername n (cost=0.00..22.84 rows=7 width=8) (actual time=0.042..0.053 rows=4 loops=1) Index Cond: (lower ((headername)::text) = 'message-id'::text) -> Index Scan using dbmail_messages_2 on dbmail_messages m (cost=0.00..5.04 rows=1 width=16) (actual time=0.060..0.060 rows=0 loops=1) Index Cond: ("outer".physmessage_id = m.physmessage_id) Filter: ((mailbox_idnr = 994) AND ((status = 0) OR (status = 1))) -> Index Scan using dbmail_physmessage_pkey on dbmail_physmessage p (cost=0.00..5.91 rows=1 width=8) (never executed)
               Index Cond: ("outer".physmessage_id = p.id)
Total runtime: 0.511 ms

The performance increase is incredible. It cuts the computed cost approximately in half but cuts the actual execution time by several orders of magnitude due to:

1) Exact comparisons always using the index (ILIKE and strings starting with % are known to not use indexes for several reasons)
2) Exact comparisons don't require a "Filter" step

Now, the question is... is this safe to do? The system appears to be much more responsive after this but I'm afraid that the query I changed (in dbmail_mailbox.c lines 1250 to 1263) is used somewhere else in a different way that can cause weird side effects. Any ideas? I am using the system over IMAP if that matters.

Also, when trying to delete a large number of messages (~1K - ~50K) the system just refuses to delete them. They keep coming back over and over again. I'm using the Mac OSX mail client and moving messages works great but deletions just haunt me. I don't really delete mail, I just move the spam to the spam folder and save everything so I haven't had time to check to see whether single deletions work. Is anyone else having the same issue? If it's already in the bug tracker I apologize, it's been a long time since I went there and honestly the performance issues come first.

Thanks for reading my epic novel. Hopefully someone can let me know if I'm on the right track or just a loon.

Thanks,
Tim Mattison
[EMAIL PROTECTED]
_______________________________________________
Dbmail-dev mailing list
[email protected]
http://twister.fastxs.net/mailman/listinfo/dbmail-dev

Reply via email to