Do you have any way of narrowing this down to specific queries that are taking the longest and/or are being executed the most? That would identify which low-level database functions are being called, then we can just trace our way up the call chain to see who's misbehaving or acting on a flawed design. Also, if you could run similar tests against the latest 1.2, it would help to give a frame of reference, particularly for my delivery chain design.
Aaron Thomas Mueller <[EMAIL PROTECTED]> said: > Hi, > > I did some tests with cvs head from today and found out that it is quite > slow when messages are inserted. I have a really aged machine for tests, > but even for that machine it is too slow: > > Pentium 166, 128 MB RAM > Debian Sarge > Vanilla kernel 2.6.1 > all services stopped that aren't required > IMAP protected using stunnel > Postgres 7.3.4 on a fast 7200 rpm 80 GB HD > > Copying about 3000 mails using IMAP took more than an hour (!) in my > LAN. > > I've enabled postgres stats collector and did some tests: I have set up > a new database with one user, and two mailboxes, INBOX and INBOX/Test > and renamed dbmail-smtp, so my MTA couldn't insert any messages. > > I used the following SQL queries to get the stats: > > select xact_commit from pg_stat_database where datname='dbmail' > To get the overall number of transactions. > > select * from pg_stat_user_tables > To get detailed seq scans/INSERT/UPDATE/DELETE counts for every row. > > select pg_stat_reset() > To reset all stats to zero. > > I've reset the stats, started Evolution 1.4.5 and copied 2872 small > mails (sometimes spam can be useful) to my dbmail test machine. > Unfortunately Evolution read about 2% of the mails after I've copied > them, that's why some seq_scan numbers might be somewhat to high. > I've attached the table because of the line breaks too. > > -------------------------------------------------- > xact_commit > ------------- > 66906 > (1 row) > > relid | schemaname | relname | seq_scan | seq_tup_read | > idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del > -------+------------+----------------------+----------+--------------+----------+---------------+-----------+-----------+----------- > 83072 | public | aliases | 0 | 0 | > 0 | 0 | 0 | 0 | 0 > 83082 | public | users | 0 | 0 | > 11490 | 11490 | 0 | 5745 | 0 > 83095 | public | mailboxes | 20295 | 98603 | > 0 | 0 | 0 | 0 | 0 > 83116 | public | subscription | 0 | 0 | > 0 | 0 | 0 | 0 | 0 > 83128 | public | acl | 0 | 0 | > 0 | 0 | 0 | 0 | 0 > 83151 | public | physmessage | 17490 | 25494744 | > 0 | 0 | 2872 | 2958 | 0 > 83160 | public | messages | 23436 | 34300164 | > 0 | 0 | 2872 | 2872 | 0 > 83189 | public | messageblks | 86 | 493984 | > 0 | 0 | 5744 | 0 | 0 > 83206 | public | auto_notifications | 0 | 0 | > 0 | 0 | 0 | 0 | 0 > 83219 | public | auto_replies | 0 | 0 | > 0 | 0 | 0 | 0 | 0 > [..] > (15 rows) > > pg_stat_reset > --------------- > t > (1 row) > -------------------------------------------------- > > There are several problems: > - 23 transactions for every message copied (!) > - the users table is updated twice for every message > - lot of sequential scans > > We had that discussion some months ago yet - we really, really need > transactions. I didn't check the code, but is it possible to encapsulate > the database access for one mail into one transaction without major > rewrites? That would speed up a lot. > > > Then I checked the performance of dbmail-maintenance (cfpd). > > There is one seqScan on mailboxes for every message even if there is > nothing to do - several thousand mailboxes and some million mails and > that will take a really long time. > > I've marked all messages as deleted and ran maintenance the first time > (set delete status for deleted messages) - 2872 updates on messages but > only 31 transactions, that looks very good. > The second run of maintenance (delete messages with delete status) > looked worse: 2872 delete on physmessage and messages, 5744 delete on > messageblks but 14389 transactions. > > > It should be easy to add an index for mailboxes to remove the sequential > scans, but I don't know what query forces postgres to do the seqScan. > > I checked the code and found this query for example: > snprintf(query, DEF_QUERYSIZE, > "SELECT mailbox_idnr FROM mailboxes " > "WHERE LOWER(name) = LOWER('%s') " > "AND owner_idnr='%llu'", name, owner_idnr); > We have this index: > CREATE INDEX mailboxes_name_idx ON mailboxes(name); > But we would need: > CREATE INDEX mailboxes_lower_name_idx ON mailboxes(lower(name)); > > Is IMAP case sensitive? Are folders like INBOX/test and INBOX/Test > possible? If it's insensitive we could lower() all queries on names and > create an index on lower(name). > > > -- > MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B) > > --