Re: [Dbmail] OT: DBMail Administrator (DBMA) Performance Fix

Leif Jackson Fri, 03 Apr 2009 18:39:31 -0700

All,
 Been a long time since I posted anything but I wanted to give some number
for the group as well.


 Just an FYI it also maybe hardware and version dependent, that is to say
that the performance measurements are not cpu bound therefor it may be IO
that is saturated which will happen on either mysql or postgresql but as
show below on MySQL 5.0.67-community on my customer's production dbmail
cluster sql server

Dell 6850 quad 3.2Ghz cpus
2.6.18-92.1.18.el5.centos.plus #1 SMP Wed Nov 26 07:28:20 EST 2008 x86_64
x86_64 x86_64 GNU/Linux
CentOS 5.2
16G ram
Dell MD3000 sas raid5 array 200G storage over 13 drives at 15k rpm using
LVM2 and ext3
InnoDB, binary logging turned on

mysql> SELECT COUNT(1) FROM dbmail_messageblks;
+----------+
| COUNT(1) |
+----------+
|   997728 |
+----------+
1 row in set (0.58 sec)

mysql> SELECT COUNT(1) FROM dbmail_messageblks;
+----------+
| COUNT(1) |
+----------+
|   997728 |
+----------+
1 row in set (0.00 sec)

mysql> SELECT COUNT(*) FROM dbmail_messageblks;
+----------+
| COUNT(*) |
+----------+
|   997728 |
+----------+
1 row in set (0.57 sec)

mysql> SELECT COUNT(*) FROM dbmail_messageblks;
+----------+
| COUNT(*) |
+----------+
|   997728 |
+----------+
1 row in set (0.00 sec)

However it should be said that this box is very large and the customer never
fully utilized it so it is unloaded being Friday night at 9pm and he has
business customers but still it shows the time to load a similar dataset's
indexs into memory where mysql innodb has 8G of ram dedicated to it.

I would say that untill dbmail sql processes are CPU bound these kinds of
discussions on performance are not the most critical.

The slow query log on this box shows stuff like (taken with the fact that
this schema has been thru upgrades and maybe missing something, but I am
fairly certian it is up to date with indexs):

# Time: 090402  4:02:33
# u...@host: mail[mail] @ dbmail00 [10.1.1.204]
# Query_time: 3  Lock_time: 0  Rows_sent: 0  Rows_examined: 493594
SELECT p.id FROM dbmail_physmessage p LEFT JOIN dbmail_headervalue h
ON p.id= h.physmessage_id WHERE h.physmessage_id IS NULL;

# Time: 090403  4:02:24
# u...@host: mail[mail] @ dbmail00 [10.1.1.204]
# Query_time: 4  Lock_time: 0  Rows_sent: 0  Rows_examined: 994156
SELECT MIN(messageblk_idnr),MAX(is_header) FROM dbmail_messageblks GROUP BY
physmessage_id HAVING MAX(is_header)=0;

# Time: 090403  4:02:33
# u...@host: mail[mail] @ dbmail00 [10.1.1.204]
# Query_time: 5  Lock_time: 0  Rows_sent: 1  Rows_examined: 1022067
SELECT usr.user_idnr, sum(pm.messagesize), usr.curmail_size FROM
dbmail_users usr LEFT JOIN dbmail_mailboxes mbx ON mbx.owner_idnr =
usr.user_idnr LEFT JOIN dbmail_messages msg ON msg.mailbox_idnr =
mbx.mailbox_idnr LEFT JOIN dbmail_physmessage pm ON pm.id =
msg.physmessage_id AND msg.status < 2 GROUP BY usr.user_idnr,
usr.curmail_size HAVING ((SUM(pm.messagesize) <> usr.curmail_size) OR (NOT
(SUM(pm.messagesize) IS NOT NULL) AND usr.curmail_size <> 0));

# Time: 090403  4:02:37
# u...@host: mail[mail] @ dbmail00 [10.1.1.204]
# Query_time: 4  Lock_time: 0  Rows_sent: 0  Rows_examined: 1679722
SET timestamp=1238745757;
SELECT message_idnr FROM dbmail_messages m JOIN dbmail_physmessage p ON
m.physmessage_id=p.id JOIN dbmail_headervalue v ON
v.physmessage_id=p.idJOIN dbmail_headername n ON v.headername_id=
n.id WHERE m.mailbox_idnr=1816 AND n.headername IN
('resent-message-id','message-id') AND
v.headervalue='<20090403080233.c12443b...@dbmail00.xxx>' AND p.internal_date
> NOW() - INTERVAL 3 DAY;

the last one seems to be the common but these three are consistant this may
be a better place to look in opimizations, I was under the impression that
any time you use  a HAVING clause it will cause a table scan?

Also for interest of comparison to the explain details below:

mysql> EXPLAIN EXTENDED SELECT COUNT(1) FROM dbmail_messageblks\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: dbmail_messageblks
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 8
          ref: NULL
         rows: 5428069
        Extra: Using index
1 row in set, 1 warning (0.00 sec)

mysql> EXPLAIN EXTENDED SELECT COUNT(*) FROM dbmail_messageblks\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: dbmail_messageblks
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 8
          ref: NULL
         rows: 5428069
        Extra: Using index
1 row in set, 1 warning (0.00 sec)


On Fri, Apr 3, 2009 at 2:04 PM, Wallace Tan <wa...@pacific.net.sg> wrote:

> Michael Monnerie wrote:
> > On Freitag 03 April 2009 Wallace Tan wrote:
> >> select count(1) from t1;
> >
> > That would have been my next question. I've spoken once to Paul, because
> > dbmail uses lots of count(*), but PostgreSQL optimizes this out. Now it
> > seems MySQL would have a performance boost using count(1).
> >
> > Could you please try:
> > 1) first, SELECT COUNT(1) FROM dbmail_messageblks;
> > and afterwards
> > 2) SELECT COUNT(*) FROM dbmail_messageblks;
> >
> > The order is important: After the first select(), the table will be
> > cached, so the 2nd query will be faster. That, BTW, is part of the
> > explanation why your 2nd query was much faster than the 1st.
> > Still, count(1) should be faster than count(*) I would expect from the
> > thread you posted. I do not have a MySQL db with enough data to test
> > around. We're using PostgreSQL because things like that happen to exist
> > in MySQL since years, and I don't need a DBMS where I have to think for
> > it. I wonder why the devs don't manage to work around those problems.
> > But no flames please, everybody should use what they prefer.
> >
> > mfg zmi
>
> In MySQL (using InnoDB engine) there is no difference for between COUNT(*)
> or COUNT(1)
> because it is 'optimized' to use the PRIMARY index.
>
> The InnoDB PRIMARY key is a clustered index. See previous post.
>
> If I understand this correctly, the PRIMARY key (clustered index) is THE
> problem.
> So the only viable solution is to force the query to use a non-clustered
> index.
>
>
>  > SELECT COUNT(1) FROM dbmail_messageblks;
> +----------+
> | COUNT(1) |
> +----------+
> |   263339 |
> +----------+
> 1 row in set (2 min 30.44 sec)
>
>  > SELECT COUNT(*) FROM dbmail_messageblks;
> +----------+
> | COUNT(*) |
> +----------+
> |   263357 |
> +----------+
> 1 row in set (2 min 25.91 sec)
>
>  > EXPLAIN EXTENDED SELECT COUNT(1) FROM dbmail_messageblks\G
> *************************** 1. row ***************************
>            id: 1
>   select_type: SIMPLE
>         table: dbmail_messageblks
>          type: index
> possible_keys: NULL
>           key: PRIMARY
>       key_len: 8
>           ref: NULL
>          rows: 6574840
>         Extra: Using index
>
>  > EXPLAIN EXTENDED SELECT COUNT(*) FROM dbmail_messageblks\G
> *************************** 1. row ***************************
>            id: 1
>   select_type: SIMPLE
>         table: dbmail_messageblks
>          type: index
> possible_keys: NULL
>           key: PRIMARY
>       key_len: 8
>           ref: NULL
>          rows: 6574840
>         Extra: Using index
>
>
>
>
> --
> Regards,
> Wallace
> M:94500905
> _______________________________________________
> DBmail mailing list
> DBmail@dbmail.org
> http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail
>

_______________________________________________
DBmail mailing list
DBmail@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail

Re: [Dbmail] OT: DBMail Administrator (DBMA) Performance Fix

Reply via email to