Hi Tim,
Tim Mattison wrote:
Hello all,
I've been running DBMail for years now and things have gone great.
Recently, however, my server was downgraded due to hardware failure and
has just never been the same (went from 750GB RAID5 to 250GB RAID1, 4GB
RAM to 1GB RAM, two HT processors to one Core2Duo). Obviously that
nailed performance so I've been struggling ever since.
A few days ago I got a new server that's a little more buff and
performance was significantly better. I moved from some dev subversion
snapshot of 2.2 to the latest stable snapshot (2.2.6-rc1) and decided to
stop following the subversion releases for a while.
Fyi, some minor issues and one less than minor issue (bug #624) were fixed since
2.2.6-rc1.
And now, the meat of the post...
Performance, while better, is still nowhere near as good as I remember
it being. I've only got two users on this DBMail installation so I
can't understand why things are so slow. I noticed that the following
query was running pretty much all the time (with different values in the
SQL, of course):
<snip>
I added functional indexes for headervalue and headername like this:
create index dbmail_headervalue_3 on
dbmail_headervalue(lower(headervalue));
Heh? I see that one is missing from create_tables.pgsql !!! And that's not the
only one...
the whole list of obviously missing indexes for the postgres tables:
CREATE INDEX dbmail_headervalue_3 ON dbmail_headervalue(headervalue);
CREATE INDEX dbmail_subjectfield_2 ON dbmail_subjectfield(subjectfield);
CREATE INDEX dbmail_datefield_2 ON dbmail_datefield(datefield);
CREATE INDEX dbmail_referencesfield_2 ON
dbmail_referencesfield(referencesfield);
CREATE INDEX dbmail_fromfield_2 ON dbmail_fromfield(fromaddr);
CREATE INDEX dbmail_fromfield_3 ON dbmail_fromfield(fromname);
CREATE INDEX dbmail_tofield_2 ON dbmail_tofield(toname);
CREATE INDEX dbmail_tofield_3 ON dbmail_tofield(toaddr);
CREATE INDEX dbmail_replytofield_2 ON dbmail_replytofield(replytoname);
CREATE INDEX dbmail_replytofield_3 ON dbmail_replytofield(replytoaddr);
CREATE INDEX dbmail_ccfield_2 ON dbmail_ccfield(ccname);
CREATE INDEX dbmail_ccfield_3 ON dbmail_ccfield(ccaddr);
for completion's sake also some missing mysql indexes:
CREATE INDEX dbmail_fromfield_1 ON dbmail_fromfield(fromname);
CREATE INDEX dbmail_fromfield_2 ON dbmail_fromfield(fromaddr);
CREATE INDEX dbmail_fromfield_1 ON dbmail_tofield(toname);
CREATE INDEX dbmail_fromfield_2 ON dbmail_tofield(toaddr);
CREATE INDEX dbmail_replytofield_1 ON dbmail_replytofield(replytoname);
CREATE INDEX dbmail_replytofield_2 ON dbmail_replytofield(replytoaddr);
CREATE INDEX dbmail_ccfield_1 ON dbmail_ccfield(ccname);
CREATE INDEX dbmail_ccfield_2 ON dbmail_ccfield(ccaddr);
create index dbmail_headername_2 on dbmail_headername(lower(headername));
That one was in the original create_tables.pgsql
And rewrote the query like this:
SELECT message_idnr FROM dbmail_messages m JOIN dbmail_physmessage p ON
m.physmessage_id=p.id JOIN dbmail_headervalue v ON v.physmessage_id=p.id
JOIN dbmail_headername n ON v.headername_id=n.id WHERE mailbox_idnr =
994 AND status IN (0,1) AND lower(headername) = lower('MESSAGE-ID') AND
lower(headervalue) = lower('<[EMAIL PROTECTED]>')
ORDER BY message_idnr;
Which is not valid I'm afraid.
The performance increase is incredible. It cuts the computed cost
approximately in half but cuts the actual execution time by several
orders of magnitude due to:
1) Exact comparisons always using the index (ILIKE and strings starting
with % are known to not use indexes for several reasons)
2) Exact comparisons don't require a "Filter" step
The problem is IMAP-SEARCH is not about exact comparisons, but about substring
searches. In your case the search command was something like:
x UID SEARCH 1:* (HEADER MESSAGE-ID "<[EMAIL PROTECTED]>")
in which case we might deduce from the combination of the header-name
(message-id) and the format of the header-value <...>, that the search-string
implies an exact comparison, but most of the time things are not that obvious:
x UID SEARCH 1:* (HEADER FROM "peter")
So we might introduce exact matching under some circumstances, but we'll have to
be very careful to weigh the performance benefits and the added complexity and
maintenance cost.
Now, the question is... is this safe to do? The system appears to be
much more responsive after this but I'm afraid that the query I changed
(in dbmail_mailbox.c lines 1250 to 1263) is used somewhere else in a
different way that can cause weird side effects. Any ideas? I am using
the system over IMAP if that matters.
With the change you did, imap-search on substrings will be broken.
Also, when trying to delete a large number of messages (~1K - ~50K)
the system just refuses to delete them. They keep coming back over and
over again. I'm using the Mac OSX mail client and moving messages works
great but deletions just haunt me. I don't really delete mail, I just
move the spam to the spam folder and save everything so I haven't had
time to check to see whether single deletions work. Is anyone else
having the same issue? If it's already in the bug tracker I apologize,
it's been a long time since I went there and honestly the performance
issues come first.
Please keep running those analyzers to find performance bogs in the SQL code
used. Aaron and I can only do so much. Help is much appreciated.
--
________________________________________________________________
Paul Stevens paul at nfg.nl
NET FACILITIES GROUP GPG/PGP: 1024D/11F8CD31
The Netherlands________________________________http://www.nfg.nl
_______________________________________________
Dbmail-dev mailing list
[email protected]
http://twister.fastxs.net/mailman/listinfo/dbmail-dev