Hello,

I have imported a large (about 50,000 messages) folder into dbmail - and
got really SLOW performance. Opening the folder with a IMAP client
(Evolution, on the same local machine) takes ages! The promised 250
messages/second are just not there.

I have analyzed the logs and looked at the code. And I can clearly see
where the bottlenecks are. (This is why I am writing to the developers
list.)

However, I'm not an expert in database programming. So while finding the
problems was easy, I'd prefer it if someone else could fix them :) I
have some ideas how to do this; but this would be a "last resort".

So, when the folder gets opened, first the system spends some minutes
with dbmail-imapd hogging the CPU (a Celeron 2400). And here are the log
entriesbetween which this happens:

Oct 23 11:42:28 localhost dbmail/imap4d[2762]: dbmysql.c,db_query:
executing query [SELECT message_idnr, seen_flag, recent_flag FROM
dbmail_messages WHERE mailbox_idnr = '9' AND status < '2' AND unique_id
!= '' ORDER BY message_idnr ASC]
Oct 23 11:44:19 localhost dbmail/imap4d[2762]: dbmysql.c,db_query:
executing query [SELECT MAX(message_idnr) FROM dbmail_
messages WHERE unique_id != '']

I have found the corresponding place in the code and can see the CPU hog
there:

(db.c line 2534)

        /* alloc mem */
        mb->seq_list = (u64_t *) my_malloc(sizeof(u64_t) * mb->exists);
        if (!mb->seq_list) {
                /* out of mem */
                db_free_result();
                return -1;
        }

        for (i = 0; i < db_num_rows(); i++) {
                if (db_get_result(i, 1)[0] == '0')
                        mb->unseen++;
                if (db_get_result(i, 2)[0] == '1')
                        mb->recent++;

                mb->seq_list[i] = db_get_result_u64(i, 0);
        }

        db_free_result();

Well, with db_num_rows() in the tens of thousands one would expect a CPU
hog here! Three calls to db_get_result for every message - and
db_get_result performs seeking every single time, too. 

Now, after that got completed, the CPU was free - but the folder was not
opened yet, and the disk i/o was at the maximum. And here's what the log
tells us:

Oct 23 11:48:05 localhost dbmail/imap4d[2762]: IMAPClientHandler(): line
read for PID 2762
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: COMMAND: [A00021 UID
FETCH 1:* (FLAGS RFC822.SIZE INTERNALDATE BODY.PEEK[HEADER.FIELDS.NOT
(RECEIVED)])]
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: arg[0]: 'FETCH'
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: arg[1]: '1:*'
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: arg[2]: '('
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: arg[3]: 'FLAGS'
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: arg[4]: 'RFC822.SIZE'
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: arg[5]: 'INTERNALDATE'
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: arg[6]: 'BODY.PEEK'
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: arg[7]: '['
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: arg[8]:
'HEADER.FIELDS.NOT'
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: arg[9]: '('
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: arg[10]: 'RECEIVED'
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: arg[11]: ')'
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: arg[12]: ']'
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: arg[13]: ')'
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: IMAPClientHandler():
Executing command uid...
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: db.c,db_acl_has_right:
checking ACL for user [3] on mailbox [9]
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: dbmysql.c,db_query:
executing query [SELECT mailbox_idnr FROM dbmail_mailboxes WHERE
mailbox_idnr = '9' AND owner_idnr = '3']
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: next_fetch_item():
args[idx = 2] = FLAGS (returning 3)
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: next_fetch_item():
args[idx = 3] = RFC822.SIZE (returning 4)
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: next_fetch_item():
args[idx = 4] = INTERNALDATE (returning 5)
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: next_fetch_item():
args[idx = 11] = ] (returning 12)
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: Fetching msgID 1088
(fetch num 1089)
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: dbmysql.c,db_query:
executing query [SELECT seen_flag, answered_flag, deleted_flag,
flagged_flag, draft_flag, recent_flag FROM dbmail_messages WHERE
message_idnr = '1088' AND status < '2' AND unique_id != '' AND
mailbox_idnr = '9']
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: dbmysql.c,db_query:
executing query [SELECT pm.rfcsize FROM dbmail_physmessage pm,
dbmail_messages msg WHERE pm.id = msg.physmessage_id AND
msg.message_idnr = '1088' AND msg.status< '2' AND msg.unique_id != ''
AND msg.mailbox_idnr = '9']
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: dbmysql.c,db_query:
executing query [SELECT DATE_FORMAT(pm.internal_date,'%Y-%m-%d %T') FROM
dbmail_physmessage pm, dbmail_messages msg WHERE msg.mailbox_idnr = '9'
AND msg.message_idnr = '1088' AND msg.unique_id!='' AND pm.id =
msg.physmessage_id]
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: dbmysql.c,db_query:
executing query [SELECT messageblk FROM dbmail_message
blks blk, dbmail_messages msg WHERE blk.physmessage_id =
msg.physmessage_id AND msg.message_idnr = '1088' ORDER BY
blk.messageblk_idnr ASC]
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: mime_readheader():
entering mime loop
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: mime_readheader(): found
double newline; header size: 61 lines
Oct 23 11:48:05 localhost dbmail/imap4d[2762]: Fetching msgID 1090
(fetch num 1091)

And so it goes on, doing four queries per message! No wonder it can only
process about 20 messages per second, instead of 250 as promised in the
README.

This actually makes dbmail unusable for me as a local storage - and eats
away performance in other cases too, though it may not be that
noticeable.

Yours, Mikhail Ramendik


Reply via email to