Note! IMHO
Creating test table:
-------------------------------
CREATE TABLE `thread` (
`threadid` int(10) unsigned NOT NULL auto_increment,
`site_user_id` varchar(32) NOT NULL default '',
`boardid` tinyint(3) unsigned NOT NULL default '0',
`subject` varchar(200) NOT NULL default '',
`modified` timestamp(14) NOT NULL,
`created` timestamp(14) NOT NULL,
`status` enum('L','P','H','D') NOT NULL default 'L',
PRIMARY KEY (`threadid`),
FULLTEXT KEY `subject` (`subject`),
KEY `boardid` (`boardid`),
KEY `site_user_id` (`site_user_id`),
KEY `created` (`created`),
KEY `status` (`status`)
) TYPE=MyISAM;
Insert data:
---------------------------------
<?php
set_time_limit( 6000 );
error_reporting( E_ALL );
define ( "NUM", '200000' );
mysql_pconnect( 'localhost', '---', '------' );
mysql_selectdb( 'richardboard' );
$name = "Pavel Lobovich, Belarus";
$text = "Smarty is a template engine for PHP. More specifically, it
facilitates a manageable way to separate application logic and content
from its presentation. This is best described in a situation where the
application programmer and the template designer play different roles,
or in most cases are not the same person. For example, let's say you are
creating a web page that is displaying a newspaper article. ";
$count = NUM;
$status = array( 'L', 'P', 'H', 'D' );
while ( $count-- ) {
$sql = 'INSERT INTO `thread` VALUES(NULL, "' . substr( $name, 0,
rand(6,24) ) . '", "' . rand( 1, 100) .'", "' . substr( $text, 0, rand(
1, 255 ) ) . '", NOW(), NOW(), "' . $status[ rand(0,3) ] . '" )';
mysql_query( $sql );
}
?>
Stats:
----------------------
Data: ~27MB
Index: ~34MB
Total: ~62MB
Post action:
----------------------
OPTIMIZE TABLE `thread`;
Query:
----------------------
SELECT * FROM `thread` WHERE `boardid` = "10" AND `status` = "L" ORDER
BY `created` DESC
Explain:
----------------------
thread ref boardid,status boardid 1 const
2182(Hm...)
where used; Using filesort
2182(rows) * 650(average) = 1.4MB
Alteration (1):
----------------------
ALTER TABLE `thread` DROP INDEX `boardid`;
ALTER TABLE `thread` DROP INDEX `status`;
ALTER TABLE `thread` ADD INDEX `new_index` ( `boardid`, `status` );
Query:
----------------------
SELECT * FROM `thread` WHERE `boardid` = "10" AND `status` = "L" ORDER
BY `threadid` DESC
Explain:
----------------------
thread ref new_index new_index 2 const,const
360(Good)
where used; Using filesort
360(rows) * 650(average) = 234KB
Benckmark script:
----------------------
...
function get_timestamp() {
list ($sec, $usec) = explode( ' ', microtime() );
return (float)$sec + (float)$usec;
}
$start = get_timestamp();
$sql = [TEST_QUERY];
$res = mysql_query( $sql );
$end = get_timestamp();
echo $end - $start;
...
Benchmark (Duron 1400, 512MB DDR333 PC2700, HD 5400RPM):
----------------------
SELECT * FROM `thread` WHERE `boardid` = "10" AND `status` = "L" ORDER
BY `created` DESC
0.012056972503662 s
SELECT * FROM `thread` WHERE `boardid` = "10" AND `status` = "L" ORDER
BY `threadid` DESC
0.0099239349365234 s
Alteration (2):
----------------------
ALTER TABLE `thread` CHANGE `threadid` `threadid` INT( 4 ) UNSIGNED NOT
NULL AUTO_INCREMENT
Main goals:
1. Less data size
2. `threadid` is now 32bit value, so comparation of two `threadid`
values is equal "CMP EAX, EBX" Assembler code ( 1 CPU tick ).
Alteration (3)
----------------------
CREATE TABLE thread_subjects (
thread_ref int(4) unsigned NOT NULL default '0',
subject varchar(200) NOT NULL default '',
PRIMARY KEY (thread_ref)
) TYPE=MyISAM;
INSERT INTO `thread_subjects` SELECT `threadid`, `subject` FROM
`thread`;
Query:
----------------------
SELECT * FROM `thread` LEFT JOIN `thread_subject` ON `thread_ref` =
`thread_id` WHERE `boardid` = "10" AND `status` = "L" ORDER BY
`threadid` DESC
Explain:
thread ref new_index new_index 2 const,const 398
where
used; Using filesort
thread_subjects eq_ref PRIMARY PRIMARY 4
thread.threadid 1
Time
----------------------
0.015573978424072, but:
398(rows) * 63(average) = 25KB
I think the filesort will be executed more faster
Alteration (4)
----------------------
Change `site_user_id` FROM VARCHAR(32) TO CHAR(32)
Query:
----------------------
SELECT * FROM `thread` LEFT JOIN `thread_subject` ON `thread_ref` =
`thread_id` WHERE `boardid` = "10" AND `status` = "L" ORDER BY
`threadid` DESC
Time:
----------------------
0.013375043869019 =)
Alteration (5):
----------------------
Move `site_user_id` to `thread_subjects`;
398(rows) * 37(average) = 14KB, sorting is more faster
etc...
Best regards, Pavel
> Hello Pavel,
>
> Tuesday, March 2, 2004, 12:33:17 PM, you wrote:
>
> PL>
> http://www.mysql.com/documentation/mysql/bychapter/manual_SQL_Syntax.html#ALTER_TABLE
> PL> ORDER BY allows you to create the new table with the rows in a specific
> PL> order. Note that the table will not remain in this order after inserts
> PL> and deletes. In some cases, it might make sorting easier for MySQL if
>
> Figured as much, thanks for the reference. I hope one day MySQL will
> produce a manual that isn't one horrendous long document per section.
>
> PL> You need to split the table and move "post messages" to another table if
> PL> you want to get higher perfomance for (2).
>
> With regard to disk seeking, here is my table structure:
>
> CREATE TABLE `thread` (
> `threadid` int(10) unsigned NOT NULL auto_increment,
> `site_user_id` varchar(32) NOT NULL default '',
> `boardid` tinyint(3) unsigned NOT NULL default '0',
> `subject` varchar(200) NOT NULL default '',
> `modified` timestamp(14) NOT NULL,
> `created` timestamp(14) NOT NULL,
> `status` enum('L','P','H','D') NOT NULL default 'L',
>
> PRIMARY KEY (`threadid`),
> FULLTEXT KEY `subject` (`subject`),
> KEY `boardid` (`boardid`),
> KEY `site_user_id` (`site_user_id`),
> KEY `created` (`created`),
> KEY `status` (`status`)
> ) TYPE=MyISAM;
>
> As well as moving the subject field to another table (because it's the
> only non-defined length field), would another way of speeding up the
> disk seek be to turn it from a varchar(200) into a char(200)? I know
> it means a larger table size, but MySQL should then be able to
> calculate exactly where to jump to in the file?
>
> PL> All MySQL functions should be very optimized, but there may be some
> PL> exceptions. BENCHMARK(loop_count,expression) is a great tool to find out
> PL> if this is a problem with your query.
>
> I will look at this now.
>
> I had been toying with the idea of creating a cache table that held
> the 200 most recent threads, pre-sequenced so I just bring back that
> instead of having to query the database unless they go beyond that 200
> limit.
>
> Your comments (and MySQL manual posts) have been very useful, thank
> you.
>
> --
> Best regards,
> Richard Davey
> http://www.phpcommunity.org/wiki/296.html
--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php