[Dbmail] OT: DBMail Administrator (DBMA) Performance Fix
I have been using DBMA to administer DBMail v2.2.10 On the home page of DBMA web interface, it displays only the top part of the page. I discovered the issue was due to a slow query in MySQL, after checking the slow-query-log in MySQL. The slow query is: SELECT COUNT(*) FROM dbmail_messageblks; Running this slow query took 138 seconds (2 min 18.09 sec) SELECT COUNT(*) FROM dbmail_messageblks; +--+ | COUNT(*) | +--+ | 262788 | +--+ 1 row in set (2 min 18.09 sec) After optimizing the SQL, it took 0.27 seconds. SELECT COUNT(*) FROM dbmail_messageblks use index(physmessage_id_index); +--+ | COUNT(*) | +--+ | 262796 | +--+ 1 row in set (0.27 sec) The diff below fix this performance issue for DBMA.cgi 6109c6109 $dbh-prepare(SELECT COUNT(*) FROM $dbmail_messageblks_table use index(physmessage_id_index)); --- $dbh-prepare(SELECT COUNT(*) FROM $dbmail_messageblks_table); HTH other users of DBMA. BTW, I posted this here because I can't find the forum/mailing list for DBMA. And what's the best admin interface for DBMail? -- Regards, Wallace ___ DBmail mailing list DBmail@dbmail.org http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail
Re: [Dbmail] OT: DBMail Administrator (DBMA) Performance Fix
On Freitag 03 April 2009 Wallace Tan wrote: SELECT COUNT(*) FROM dbmail_messageblks; 1 row in set (2 min 18.09 sec) After optimizing the SQL, it took 0.27 seconds. SELECT COUNT(*) FROM dbmail_messageblks use index(physmessage_id_index); The question is: Why is MySQL so stupid not to use the index? That should be done automatically by the DBMS, that's its job. I would oppose against changing the query just because MySQL has a bug. Maybe you use a version that's known to be instable? mfg zmi -- // Michael Monnerie, Ing.BSc- http://it-management.at // Tel: 0660 / 415 65 31 .network.your.ideas. // PGP Key: curl -s http://zmi.at/zmi.asc | gpg --import // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4 ___ DBmail mailing list DBmail@dbmail.org http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail
Re: [Dbmail] OT: DBMail Administrator (DBMA) Performance Fix
Michael Monnerie wrote: The question is: Why is MySQL so stupid not to use the index? That should be done automatically by the DBMS, that's its job. I would oppose against changing the query just because MySQL has a bug. Maybe you use a version that's known to be instable? I am using MySQL v5.0.77 MySQL IS using PRIMARY index for the slow query. However, after reading comment 19 at: http://www.mysqlperformanceblog.com/2006/12/01/count-for-innodb-tables/ I got the idea to use another index for the COUNT. Any MySQL experts can explain this performance gap? SHOW CREATE TABLE dbmail_messageblks\G *** 1. row *** Table: dbmail_messageblks Create Table: CREATE TABLE `dbmail_messageblks` ( `messageblk_idnr` bigint(21) NOT NULL auto_increment, `physmessage_id` bigint(21) NOT NULL default '0', `messageblk` longblob NOT NULL, `blocksize` bigint(21) NOT NULL default '0', `is_header` tinyint(1) NOT NULL default '0', PRIMARY KEY (`messageblk_idnr`), KEY `physmessage_id_index` (`physmessage_id`), KEY `physmessage_id_is_header_index` (`physmessage_id`,`is_header`), CONSTRAINT `dbmail_messageblks_ibfk_1` FOREIGN KEY (`physmessage_id`) REFERENCES `dbmail_physmessage` (`id`) ON DELETE CASCADE ON UPDATE CASCADE ) ENGINE=InnoDB AUTO_INCREMENT=602519 DEFAULT CHARSET=utf8 1 row in set (0.00 sec) EXPLAIN SELECT COUNT(*) FROM dbmail_messageblks\G *** 1. row *** id: 1 select_type: SIMPLE table: dbmail_messageblks type: index possible_keys: NULL key: PRIMARY key_len: 8 ref: NULL rows: 1930308 Extra: Using index 1 row in set (0.00 sec) EXPLAIN SELECT COUNT(*) FROM dbmail_messageblks use index(physmessage_id_index)\G *** 1. row *** id: 1 select_type: SIMPLE table: dbmail_messageblks type: index possible_keys: NULL key: physmessage_id_index key_len: 8 ref: NULL rows: 1930310 Extra: Using index 1 row in set (0.00 sec) -- Regards, Wallace ___ DBmail mailing list DBmail@dbmail.org http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail
Re: [Dbmail] OT: DBMail Administrator (DBMA) Performance Fix
Wallace Tan wrote: I am using MySQL v5.0.77 MySQL IS using PRIMARY index for the slow query. However, after reading comment 19 at: http://www.mysqlperformanceblog.com/2006/12/01/count-for-innodb-tables/ I got the idea to use another index for the COUNT. Any MySQL experts can explain this performance gap? From http://capttofu.livejournal.com/12570.html InnoDB stores data in primary key order. If you don't specify a primary key, innodb creates one internally. InnoDB uses a clustered index, which means every index is stored with the primary key -- so be careful when making primary keys on InnoDB tables that are long. Clustered indexes give good performance for writes as well as selecting data by index. They are _slow_ with count(*) because: * InnoDB doesn't maintain # rows in the storage engine * Clustered indexes are slow when you perform count(*) because it is a count across the primary key, that operation has to traverse each index and data node. The way to get around this is to use select count(1) from t1; Or select count(some other indexed column) from t1; -- Regards, Wallace M:94500905 ___ DBmail mailing list DBmail@dbmail.org http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail
Re: [Dbmail] OT: DBMail Administrator (DBMA) Performance Fix
On Freitag 03 April 2009 Wallace Tan wrote: select count(1) from t1; That would have been my next question. I've spoken once to Paul, because dbmail uses lots of count(*), but PostgreSQL optimizes this out. Now it seems MySQL would have a performance boost using count(1). Could you please try: 1) first, SELECT COUNT(1) FROM dbmail_messageblks; and afterwards 2) SELECT COUNT(*) FROM dbmail_messageblks; The order is important: After the first select(), the table will be cached, so the 2nd query will be faster. That, BTW, is part of the explanation why your 2nd query was much faster than the 1st. Still, count(1) should be faster than count(*) I would expect from the thread you posted. I do not have a MySQL db with enough data to test around. We're using PostgreSQL because things like that happen to exist in MySQL since years, and I don't need a DBMS where I have to think for it. I wonder why the devs don't manage to work around those problems. But no flames please, everybody should use what they prefer. mfg zmi -- // Michael Monnerie, Ing.BSc- http://it-management.at // Tel: 0660 / 415 65 31 .network.your.ideas. // PGP Key: curl -s http://zmi.at/zmi.asc | gpg --import // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4 ___ DBmail mailing list DBmail@dbmail.org http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail
Re: [Dbmail] OT: DBMail Administrator (DBMA) Performance Fix
Michael Monnerie wrote: On Freitag 03 April 2009 Wallace Tan wrote: select count(1) from t1; That would have been my next question. I've spoken once to Paul, because dbmail uses lots of count(*), but PostgreSQL optimizes this out. Now it seems MySQL would have a performance boost using count(1). Could you please try: 1) first, SELECT COUNT(1) FROM dbmail_messageblks; and afterwards 2) SELECT COUNT(*) FROM dbmail_messageblks; The order is important: After the first select(), the table will be cached, so the 2nd query will be faster. That, BTW, is part of the explanation why your 2nd query was much faster than the 1st. Still, count(1) should be faster than count(*) I would expect from the thread you posted. I do not have a MySQL db with enough data to test around. We're using PostgreSQL because things like that happen to exist in MySQL since years, and I don't need a DBMS where I have to think for it. I wonder why the devs don't manage to work around those problems. But no flames please, everybody should use what they prefer. mfg zmi In MySQL (using InnoDB engine) there is no difference for between COUNT(*) or COUNT(1) because it is 'optimized' to use the PRIMARY index. The InnoDB PRIMARY key is a clustered index. See previous post. If I understand this correctly, the PRIMARY key (clustered index) is THE problem. So the only viable solution is to force the query to use a non-clustered index. SELECT COUNT(1) FROM dbmail_messageblks; +--+ | COUNT(1) | +--+ | 263339 | +--+ 1 row in set (2 min 30.44 sec) SELECT COUNT(*) FROM dbmail_messageblks; +--+ | COUNT(*) | +--+ | 263357 | +--+ 1 row in set (2 min 25.91 sec) EXPLAIN EXTENDED SELECT COUNT(1) FROM dbmail_messageblks\G *** 1. row *** id: 1 select_type: SIMPLE table: dbmail_messageblks type: index possible_keys: NULL key: PRIMARY key_len: 8 ref: NULL rows: 6574840 Extra: Using index EXPLAIN EXTENDED SELECT COUNT(*) FROM dbmail_messageblks\G *** 1. row *** id: 1 select_type: SIMPLE table: dbmail_messageblks type: index possible_keys: NULL key: PRIMARY key_len: 8 ref: NULL rows: 6574840 Extra: Using index -- Regards, Wallace M:94500905 ___ DBmail mailing list DBmail@dbmail.org http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail
Re: [Dbmail] OT: DBMail Administrator (DBMA) Performance Fix
Michael Monnerie wrote: On Freitag 03 April 2009 Wallace Tan wrote: select count(1) from t1; That would have been my next question. I've spoken once to Paul, because dbmail uses lots of count(*), but PostgreSQL optimizes this out. Now it seems MySQL would have a performance boost using count(1). Could you please try: 1) first, SELECT COUNT(1) FROM dbmail_messageblks; and afterwards 2) SELECT COUNT(*) FROM dbmail_messageblks; The order is important: After the first select(), the table will be cached, so the 2nd query will be faster. That, BTW, is part of the explanation why your 2nd query was much faster than the 1st. Still, count(1) should be faster than count(*) I would expect from the thread you posted. I do not have a MySQL db with enough data to test around. We're using PostgreSQL because things like that happen to exist in MySQL since years, and I don't need a DBMS where I have to think for it. I wonder why the devs don't manage to work around those problems. But no flames please, everybody should use what they prefer. mfg zmi What the performance like for the same query using PostgreSQL? I would consider PostgreSQL for my DBMail store. Thanks! -- Regards, Wallace ___ DBmail mailing list DBmail@dbmail.org http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail
Re: [Dbmail] OT: DBMail Administrator (DBMA) Performance Fix
In MySQL (using InnoDB engine) there is no difference for between COUNT(*) or COUNT(1) because it is 'optimized' to use the PRIMARY index. FWIW, there's no difference in PostgreSQL either. --Josh ___ DBmail mailing list DBmail@dbmail.org http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail
Re: [Dbmail] OT: DBMail Administrator (DBMA) Performance Fix
On Freitag 03 April 2009 Josh Berkus wrote: FWIW, there's no difference in PostgreSQL either. Yes, but the other way round: In PostgreSQL it's always *FAST*. So, a little bit of difference ;-) EXPLAIN ANALYZE SELECT COUNT(*) FROM dbmail_messageblks; QUERY PLAN --- Aggregate (cost=58336.22..58336.23 rows=1 width=0) (actual time=1574.620..1574.621 rows=1 loops=1) - Seq Scan on dbmail_messageblks (cost=0.00..57162.37 rows=469537 width=0) (actual time=0.056..1019.980 rows=469683 loops=1) Total runtime: 1574.676 ms And that's just my home server, a XEN vm within a normal PC. mfg zmi -- // Michael Monnerie, Ing.BSc- http://it-management.at // Tel: 0660 / 415 65 31 .network.your.ideas. // PGP Key: curl -s http://zmi.at/zmi.asc | gpg --import // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4 ___ DBmail mailing list DBmail@dbmail.org http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail
Re: [Dbmail] OT: DBMail Administrator (DBMA) Performance Fix
All, Been a long time since I posted anything but I wanted to give some number for the group as well. Just an FYI it also maybe hardware and version dependent, that is to say that the performance measurements are not cpu bound therefor it may be IO that is saturated which will happen on either mysql or postgresql but as show below on MySQL 5.0.67-community on my customer's production dbmail cluster sql server Dell 6850 quad 3.2Ghz cpus 2.6.18-92.1.18.el5.centos.plus #1 SMP Wed Nov 26 07:28:20 EST 2008 x86_64 x86_64 x86_64 GNU/Linux CentOS 5.2 16G ram Dell MD3000 sas raid5 array 200G storage over 13 drives at 15k rpm using LVM2 and ext3 InnoDB, binary logging turned on mysql SELECT COUNT(1) FROM dbmail_messageblks; +--+ | COUNT(1) | +--+ | 997728 | +--+ 1 row in set (0.58 sec) mysql SELECT COUNT(1) FROM dbmail_messageblks; +--+ | COUNT(1) | +--+ | 997728 | +--+ 1 row in set (0.00 sec) mysql SELECT COUNT(*) FROM dbmail_messageblks; +--+ | COUNT(*) | +--+ | 997728 | +--+ 1 row in set (0.57 sec) mysql SELECT COUNT(*) FROM dbmail_messageblks; +--+ | COUNT(*) | +--+ | 997728 | +--+ 1 row in set (0.00 sec) However it should be said that this box is very large and the customer never fully utilized it so it is unloaded being Friday night at 9pm and he has business customers but still it shows the time to load a similar dataset's indexs into memory where mysql innodb has 8G of ram dedicated to it. I would say that untill dbmail sql processes are CPU bound these kinds of discussions on performance are not the most critical. The slow query log on this box shows stuff like (taken with the fact that this schema has been thru upgrades and maybe missing something, but I am fairly certian it is up to date with indexs): # Time: 090402 4:02:33 # u...@host: mail[mail] @ dbmail00 [10.1.1.204] # Query_time: 3 Lock_time: 0 Rows_sent: 0 Rows_examined: 493594 SELECT p.id FROM dbmail_physmessage p LEFT JOIN dbmail_headervalue h ON p.id= h.physmessage_id WHERE h.physmessage_id IS NULL; # Time: 090403 4:02:24 # u...@host: mail[mail] @ dbmail00 [10.1.1.204] # Query_time: 4 Lock_time: 0 Rows_sent: 0 Rows_examined: 994156 SELECT MIN(messageblk_idnr),MAX(is_header) FROM dbmail_messageblks GROUP BY physmessage_id HAVING MAX(is_header)=0; # Time: 090403 4:02:33 # u...@host: mail[mail] @ dbmail00 [10.1.1.204] # Query_time: 5 Lock_time: 0 Rows_sent: 1 Rows_examined: 1022067 SELECT usr.user_idnr, sum(pm.messagesize), usr.curmail_size FROM dbmail_users usr LEFT JOIN dbmail_mailboxes mbx ON mbx.owner_idnr = usr.user_idnr LEFT JOIN dbmail_messages msg ON msg.mailbox_idnr = mbx.mailbox_idnr LEFT JOIN dbmail_physmessage pm ON pm.id = msg.physmessage_id AND msg.status 2 GROUP BY usr.user_idnr, usr.curmail_size HAVING ((SUM(pm.messagesize) usr.curmail_size) OR (NOT (SUM(pm.messagesize) IS NOT NULL) AND usr.curmail_size 0)); # Time: 090403 4:02:37 # u...@host: mail[mail] @ dbmail00 [10.1.1.204] # Query_time: 4 Lock_time: 0 Rows_sent: 0 Rows_examined: 1679722 SET timestamp=1238745757; SELECT message_idnr FROM dbmail_messages m JOIN dbmail_physmessage p ON m.physmessage_id=p.id JOIN dbmail_headervalue v ON v.physmessage_id=p.idJOIN dbmail_headername n ON v.headername_id= n.id WHERE m.mailbox_idnr=1816 AND n.headername IN ('resent-message-id','message-id') AND v.headervalue='20090403080233.c12443b...@dbmail00.xxx' AND p.internal_date NOW() - INTERVAL 3 DAY; the last one seems to be the common but these three are consistant this may be a better place to look in opimizations, I was under the impression that any time you use a HAVING clause it will cause a table scan? Also for interest of comparison to the explain details below: mysql EXPLAIN EXTENDED SELECT COUNT(1) FROM dbmail_messageblks\G *** 1. row *** id: 1 select_type: SIMPLE table: dbmail_messageblks type: index possible_keys: NULL key: PRIMARY key_len: 8 ref: NULL rows: 5428069 Extra: Using index 1 row in set, 1 warning (0.00 sec) mysql EXPLAIN EXTENDED SELECT COUNT(*) FROM dbmail_messageblks\G *** 1. row *** id: 1 select_type: SIMPLE table: dbmail_messageblks type: index possible_keys: NULL key: PRIMARY key_len: 8 ref: NULL rows: 5428069 Extra: Using index 1 row in set, 1 warning (0.00 sec) On Fri, Apr 3, 2009 at 2:04 PM, Wallace Tan wa...@pacific.net.sg wrote: Michael Monnerie wrote: On Freitag 03 April 2009 Wallace Tan wrote: select count(1) from t1; That would have been my next question. I've spoken once to Paul, because dbmail uses lots of count(*), but PostgreSQL optimizes this out. Now it seems MySQL would have a performance boost using count(1). Could you please try: 1) first, SELECT
Re: [Dbmail] OT: DBMail Administrator (DBMA) Performance Fix
Wallace Tan wrote: And what's the best admin interface for DBMail? Not to toot my own horn, but I did up a php based administration tool for managing your mailboxes and aliases along with added functionality to partition your users into manageable groups. See http://www.netvulture.com/nvcapps.php to download. Extract to your htdocs and follow INSTALL to get it up and running. -Jon -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. ___ DBmail mailing list DBmail@dbmail.org http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail