Re: How to speed up query of indexed column with 5M rows?

Rolando Edwards Mon, 09 Oct 2006 07:33:55 -0700

I have an interesting suggestion with regard to SELECT COUNT(*)

In the book Oracle8i Certified Professional DBA Certification Exam Guide (ISBN 
0072130601),
page 78 talks about what Oracle does to evaluate SELECT COUNT(*). The asterisk 
inside the
parentheses informs Oracle to look for any non-null column in a row to 
determine if the row
is to be counted. Hence, a row with all NULL columns would not be counted. The 
Oracle book
I just quoted recommends using SELECT COUNT(1) instead of SELECT COUNT(*) 
because the step
of looking for non-null columns is bypassed using SELECT COUNT(1).


I believe this would apply to MySQL as well. SELECT COUNT(*) FROM table is just 
as fast as
SELECT COUNT(1) FROM table on MyISAM tables. However, when it comes to using a 
WHERE clause,
COUNT(1) HAS TO EXECUTE FASTER THAN COUNT(*) regardless what Table Engine Type 
is.
Try this theory out and see.

By the way, you can drop the UNIQUE KEY 'id' in the 'flow' table.
The primary key is already a UNIQUE key. Why index 'id' twice?

For any SELECT query against 'flow', the index 'id' would never be used.
Just do EXPLAIN SELECT and see that a Primary Key should always get first
preference when hints are not suggested.

Please let me know if these sugguetions help.

----- Original Message -----
From: Jacques Marneweck <[EMAIL PROTECTED]>
To: Dan Buettner <[EMAIL PROTECTED]>
Cc: bowen <[EMAIL PROTECTED]>, [email protected]
Sent: Monday, October 9, 2006 9:18:01 AM GMT-0500 US/Eastern
Subject: Re: How to speed up query of indexed column with 5M rows?

Dan Buettner wrote:
> bowen -
>
> Right now, it appears your performance hinges on I/O to the disk drive.
>
> The reason you are seeing fast performance when querying against the
> primary key(SELECT COUNT(*)) is it is only reading from the index,
> which is probably all in memory. When you do a SELECT * even when
> against an indexed field, it has to access the table data, not just
> the index, so it hits the disk drive.
MyISAM tables always have that row count available hence the reason why
SELECT COUNT(*) FROM table is always fast.

Regards
--jm
>
> There are a couple of things you can do to improve speed, probably a
> fair amount in this case:
> 1 - increase amount of RAM in machine. This will allow more data to
> be cached in memory, for faster access. When MySQL first starts up,
> performance may be slow, as it reads from disk to fill the caches, but
> then will be fast going forward.
> 2 - install some sort of striped disk storage system to allow faster
> access to data on disk. Not as fast as RAM but still a big boost, and
> performance will be more consistent.
>
> I'd recommend doing both if possible.
>
> Dan
>
> On 10/9/06, bowen <[EMAIL PROTECTED]> wrote:
>> How to speed up query of indexed column with 5M rows?
>>
>> I have a table with more than 5M rows. (400M .MYD 430M .MYI).
>>
>> It took 27 seconds to do a common select...where... in the index column.
>>
>> I can not bear the long run.
>>
>> Vmstat show that system was bounded by IO busy.(Always more than 13000
>> bi/s, blocks input / second .)
>>
>> I increase some options to speed up query.
>> I try "SET GLOBAL key_buffer_size=256*1024*1024", the query still need
>> 14 seconds.
>> try "read_buffer_size = 2M" again, down to 9s.
>>
>>
>> It is a great improvement, but the result still can not be acceptable,
>> no matter to multiple query.
>>
>>
>> Is there any method to speed up the query. I found the select by
>> primary key of the 5M Row is very quickly(0.05s).
>>
>> I found if it is IO busy, process will hang much time for loading. If
>> index is cached in memory, process will be much more faster.
>>
>> Bottleneck is IO performance. How to improve mysql io performance?
>>
>>
>>
>> Mysql 4.1.21 + linux 2.6.17 + 512M memory + IDE Disk udma5
>>
>> -----------------------
>> CREATE TABLE /*!32312 IF NOT EXISTS*/ `flow` (
>> `id` bigint(20) unsigned NOT NULL auto_increment,
>> `name` varchar(255) NOT NULL default '',
>> `owner` int(10) NOT NULL default '0',
>> `uuid` varchar(36) NOT NULL default '',
>> `length` int(11) default '0',
>> `time` int(11) default '0',
>> PRIMARY KEY (`id`),
>> UNIQUE KEY `id` (`id`),
>> UNIQUE KEY `flow_uuidindex` (`uuid`),
>> KEY `flow_nameindex` (`name`),
>> KEY `flow_fk_owner` (`owner`)
>> ) ENGINE=MyISAM DEFAULT CHARSET=utf8;
>>
>>
>> mysql> explain select * from flow where owner=11251;
>> +----+-------------+-------+------+---------------+------+---------+------+---------+-------------+
>>
>> | id | select_type | table | type | possible_keys | key | key_len |
>> ref | rows | Extra |
>> +----+-------------+-------+------+---------------+------+---------+------+---------+-------------+
>>
>> | 1 | SIMPLE | flow | ALL | NULL | NULL | NULL | NULL | 5122593 |
>> Using where |
>> +----+-------------+-------+------+---------------+------+---------+------+---------+-------------+
>>
>> 1 row in set (0.00 sec)
>>
>> mysql> explain select * from flow where owner=11251;
>> Empty set (18.82 sec)
>>
>>
>> If there is no method to improve, can you advise me a redesign of table?
>> I think there is a way to do if I only use unique key in where.
>>
>> Thanks
>>
>>
>> Owner index was 253285 unique owner, probably means user. On average,
>> one user have 20 flows.
>> mysql> select count(*) from owner;
>> +----------+
>> | count(*) |
>> +----------+
>> | 253285 |
>> +----------+
>> 1 row in set (0.00 sec)
>>
>> Sorry for misspell of gid. It is uuid of the flow. I think I should
>> redesign the table. But how to solve the huge one-multiple
>> relationship.
>>
>> The flow table is the only largest table in the database, the others
>> are relative small just like the owner table size.
>>
>>
>> I have a design．　Create an additinal field in owner table to store the
>> pk(or uuid) of flows owned by this user. Than use a store procedure to
>> update this field called by changing to the flow table. But I think it
>> is a bad design, right ? And also mysql 4.1.x do not support store
>> procedure. I do not want to migrate the whole database, and I am sure
>> there are some incompatible.
>>
>> Thanks...
>>
>


-- 
Jacques Marneweck
http://www.powertrip.co.za/
http://www.powertrip.co.za/blog/

#include <std/disclaimer.h>


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Re: How to speed up query of indexed column with 5M rows?

Reply via email to