Hi Tim,

Tim Mattison wrote:
Hello all,
I've been running DBMail for years now and things have gone great. Recently, however, my server was downgraded due to hardware failure and has just never been the same (went from 750GB RAID5 to 250GB RAID1, 4GB RAM to 1GB RAM, two HT processors to one Core2Duo). Obviously that nailed performance so I've been struggling ever since.

A few days ago I got a new server that's a little more buff and performance was significantly better. I moved from some dev subversion snapshot of 2.2 to the latest stable snapshot (2.2.6-rc1) and decided to stop following the subversion releases for a while.

Fyi, some minor issues and one less than minor issue (bug #624) were fixed since 2.2.6-rc1.


  And now, the meat of the post...

Performance, while better, is still nowhere near as good as I remember it being. I've only got two users on this DBMail installation so I can't understand why things are so slow. I noticed that the following query was running pretty much all the time (with different values in the SQL, of course):

<snip>

  I added functional indexes for headervalue and headername like this:

create index dbmail_headervalue_3 on dbmail_headervalue(lower(headervalue));

Heh? I see that one is missing from create_tables.pgsql !!! And that's not the only one...

the whole list of obviously missing indexes for the postgres tables:

CREATE INDEX dbmail_headervalue_3 ON dbmail_headervalue(headervalue);
CREATE INDEX dbmail_subjectfield_2 ON dbmail_subjectfield(subjectfield);
CREATE INDEX dbmail_datefield_2 ON dbmail_datefield(datefield);
CREATE INDEX dbmail_referencesfield_2 ON 
dbmail_referencesfield(referencesfield);
CREATE INDEX dbmail_fromfield_2 ON dbmail_fromfield(fromaddr);
CREATE INDEX dbmail_fromfield_3 ON dbmail_fromfield(fromname);
CREATE INDEX dbmail_tofield_2 ON dbmail_tofield(toname);
CREATE INDEX dbmail_tofield_3 ON dbmail_tofield(toaddr);
CREATE INDEX dbmail_replytofield_2 ON dbmail_replytofield(replytoname);
CREATE INDEX dbmail_replytofield_3 ON dbmail_replytofield(replytoaddr);
CREATE INDEX dbmail_ccfield_2 ON dbmail_ccfield(ccname);
CREATE INDEX dbmail_ccfield_3 ON dbmail_ccfield(ccaddr);

for completion's sake also some missing mysql indexes:

CREATE INDEX dbmail_fromfield_1 ON dbmail_fromfield(fromname);
CREATE INDEX dbmail_fromfield_2 ON dbmail_fromfield(fromaddr);
CREATE INDEX dbmail_fromfield_1 ON dbmail_tofield(toname);
CREATE INDEX dbmail_fromfield_2 ON dbmail_tofield(toaddr);
CREATE INDEX dbmail_replytofield_1 ON dbmail_replytofield(replytoname);
CREATE INDEX dbmail_replytofield_2 ON dbmail_replytofield(replytoaddr);
CREATE INDEX dbmail_ccfield_1 ON dbmail_ccfield(ccname);
CREATE INDEX dbmail_ccfield_2 ON dbmail_ccfield(ccaddr);


create index dbmail_headername_2 on dbmail_headername(lower(headername));

That one was in the original create_tables.pgsql


  And rewrote the query like this:

SELECT message_idnr FROM dbmail_messages m JOIN dbmail_physmessage p ON m.physmessage_id=p.id JOIN dbmail_headervalue v ON v.physmessage_id=p.id JOIN dbmail_headername n ON v.headername_id=n.id WHERE mailbox_idnr = 994 AND status IN (0,1) AND lower(headername) = lower('MESSAGE-ID') AND lower(headervalue) = lower('<[EMAIL PROTECTED]>') ORDER BY message_idnr;


Which is not valid I'm afraid.

The performance increase is incredible. It cuts the computed cost approximately in half but cuts the actual execution time by several orders of magnitude due to:

1) Exact comparisons always using the index (ILIKE and strings starting with % are known to not use indexes for several reasons)
2) Exact comparisons don't require a "Filter" step

The problem is IMAP-SEARCH is not about exact comparisons, but about substring searches. In your case the search command was something like:

x UID SEARCH 1:* (HEADER MESSAGE-ID "<[EMAIL PROTECTED]>")

in which case we might deduce from the combination of the header-name (message-id) and the format of the header-value <...>, that the search-string implies an exact comparison, but most of the time things are not that obvious:

x UID SEARCH 1:* (HEADER FROM "peter")

So we might introduce exact matching under some circumstances, but we'll have to be very careful to weigh the performance benefits and the added complexity and maintenance cost.

Now, the question is... is this safe to do? The system appears to be much more responsive after this but I'm afraid that the query I changed (in dbmail_mailbox.c lines 1250 to 1263) is used somewhere else in a different way that can cause weird side effects. Any ideas? I am using the system over IMAP if that matters.

With the change you did, imap-search on substrings will be broken.


Also, when trying to delete a large number of messages (~1K - ~50K) the system just refuses to delete them. They keep coming back over and over again. I'm using the Mac OSX mail client and moving messages works great but deletions just haunt me. I don't really delete mail, I just move the spam to the spam folder and save everything so I haven't had time to check to see whether single deletions work. Is anyone else having the same issue? If it's already in the bug tracker I apologize, it's been a long time since I went there and honestly the performance issues come first.


Please keep running those analyzers to find performance bogs in the SQL code used. Aaron and I can only do so much. Help is much appreciated.


--
  ________________________________________________________________
  Paul Stevens                                      paul at nfg.nl
  NET FACILITIES GROUP                     GPG/PGP: 1024D/11F8CD31
  The Netherlands________________________________http://www.nfg.nl
_______________________________________________
Dbmail-dev mailing list
[email protected]
http://twister.fastxs.net/mailman/listinfo/dbmail-dev

Reply via email to