I have what I thought was a simple, well-indexed query, but it
turns out that it's acting as a pretty big drag. The one thing
that's clearly a problem (though I'm not sure of the extent of
the problem), I'm not sure how to fix.
There are three tables: citations, subjects, and a many-to-many
table linking these. They look like this (edited to remove
extraneous fields):
CREATE TABLE `citation` (
`id` int(11) NOT NULL auto_increment,
`word` varchar(50) NOT NULL default '',
`last_modified` timestamp(14) NOT NULL,
`deleted` datetime default NULL,
PRIMARY KEY (`id`),
KEY `deleted` (`deleted`),
KEY `word` (`word`)
)
CREATE TABLE `subject` (
`id` int(11) NOT NULL auto_increment,
`name` varchar(50) NOT NULL default '',
`last_modified` timestamp(14) NOT NULL,
PRIMARY KEY (`id`),
KEY `name` (`name`)
)
CREATE TABLE `citation_subject` (
`id` int(11) NOT NULL auto_increment,
`citation_id` int(11) NOT NULL default '0',
`subject_id` int(11) NOT NULL default '0',
PRIMARY KEY (`id`),
KEY `citation_id` (`citation_id`,`subject_id`)
)
A usual query is to get (some number of) the
citations for a given subject, ordering by the
"word" which is stripped of spaces and hyphens.
I don't know of any other way to accomplish
this ordering. The EXPLAIN looks like this:
mysql> EXPLAIN SELECT citation.*, REPLACE(REPLACE(citation.word,' ',''),
'-','') AS stripped_word
-> FROM citation, subject, citation_subject
-> WHERE subject.name = 'History'
-> AND citation_subject.subject_id = subject.id
-> AND citation_subject.citation_id = citation.id
-> AND (citation.deleted IS NULL OR citation.deleted = 0)
-> ORDER BY stripped_word\G
*************************** 1. row ***************************
table: subject
type: ref
possible_keys: PRIMARY,name
key: name
key_len: 50
ref: const
rows: 1
Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
table: citation_subject
type: index
possible_keys: citation_id
key: citation_id
key_len: 8
ref: NULL
rows: 1247
Extra: Using where; Using index
*************************** 3. row ***************************
table: citation
type: eq_ref
possible_keys: PRIMARY,deleted
key: PRIMARY
key_len: 4
ref: citation_subject.citation_id
rows: 1
Extra: Using where
3 rows in set (0.00 sec)
----
The number of records involved is relatively small (a few thousands; in
practice this query would also have a LIMIT statement and would be
preceded by a COUNT(*)), but it's dragging down the application it's
running in. (I have a considerably more complex query that involves
millions of records and twice as many joins, that is faster.) I'm
running this in Perl.
Any suggestions? I'd like to get rid of the whole "temporary" and
"filesort" things, but I'm not sure if that's what matters given
that there's only 1 row being returned there.
Thanks.
Jesse Sheidlower
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]