I'm running a set of queries that look like this:
===============
SET @PUBID = (SELECT pub_id FROM pub WHERE pub_code = 'DC');
DROP TEMPORARY TABLE IF EXISTS feed_new;
CREATE TEMPORARY TABLE feed_new (
new_title VARCHAR(255), INDEX (new_title)
);
INSERT INTO feed_new
VALUES
('UK Investment Bonds 2010'),
('UK Protection 2010'),
('UK Personal Insurance Distribution 2010'),
('UK Private Medical Insurance 2010'),
...
('UK Private Motor Insurance 2010'),
('Wealth Management for Non-Resident Indians 2010'),
('Middle Eastern Cards Database 2010')
;
SELECT
feed_new.new_title AS `New Title FROM Feed`,
prod.prod_pub_prod_id AS `Lib Code FROM DB`,
prod.prod_title AS `Title FROM DB`,
prod.prod_num AS `Prod Num`,
prod.prod_published AS `Published FROM DB`
FROM feed_new JOIN prod
ON LEFT(feed_new.new_title, LENGTH(feed_new.new_title) - 5) =
LEFT(prod.prod_title, LENGTH(prod.prod_title) - 5)
WHERE prod.pub_id = @PUBID AND prod.prod_discont = 0
ORDER BY feed_new.new_title;
============
With a relatively small number of rows in `feed_new`, this can take many
seconds. With 163 rows in `feed_new`, compared against 11234 eligible rows in
prod, it took about 28 seconds. Here's what an EXPLAIN looks like:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: feed_new
type: index
possible_keys: NULL
key: PRIMARY
key_len: 767
ref: NULL
rows: 1
Extra: Using index
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: prod
type: ref
possible_keys: pub_id,pub_id_2
key: pub_id
key_len: 48
ref: const
rows: 11040
Extra: Using where
=========
prod.pub_id is an indexed VARCHAR(15).
If I remove the string functions, I don't get what I want -- but the remaining
query runs in .05 seconds. Here's an EXPLAIN of that one:
===============
us-gii >EXPLAIN
-> SELECT
-> feed_new.new_title AS `New Title FROM Feed`,
-> prod.prod_pub_prod_id AS `Lib Code FROM DB`,
-> prod.prod_title AS `Title FROM DB`,
-> prod.prod_num AS `Prod Num`,
-> prod.prod_published AS `Published FROM DB`
-> FROM feed_new JOIN prod
-> ON feed_new.new_title = prod.prod_title
-> WHERE prod.pub_id = @PUBID AND prod.prod_discont = 0
-> ORDER BY feed_new.new_title\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: feed_new
type: index
possible_keys: PRIMARY
key: PRIMARY
key_len: 767
ref: NULL
rows: 163
Extra: Using index
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: prod
type: ref
possible_keys: pub_id,prod_title,pub_id_2,prod_title_fulltext
key: prod_title
key_len: 768
ref: giiexpr_db.feed_new.new_title
rows: 1
Extra: Using where
================
Obviously the string manipulation is keeping MySQL from using `prod_title` as
a key, but I wouldn't have thought that using `pub_id` instead would be that
horrific.
Does anyone have any suggestions as to how to speed this business up? I can't
get away without some string manipulation, because I'm looking for "near
matches" by ignoring the year at the end of the title.
Regards,
Jerry Schwartz
Global Information Incorporated
195 Farmington Ave.
Farmington, CT 06032
860.674.8796 / FAX: 860.674.8341
E-mail: [email protected]
Web site: www.the-infoshop.com
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[email protected]