>-----Original Message-----
>From: Gavin Towey [mailto:[email protected]]
>Sent: Wednesday, May 26, 2010 7:39 PM
>To: [email protected]; [email protected]
>Subject: RE: Slow query using string functions
>
>Jerry,
>
>Are you sure this is really your explain plan for this query? That's not at
>all what I would expect to see.
>
>Regards,
>Gavin Towey
>
[JS] I'm as sure as I can be. Here's a lot more information:
==========
SET @PUBID = (SELECT pub_id FROM pub WHERE pub_code = 'DC');
DROP TEMPORARY TABLE IF EXISTS feed_new;
CREATE TEMPORARY TABLE feed_new (
new_title VARCHAR(255), INDEX (new_title)
);
INSERT INTO feed_new
VALUES
('Automotive Aftermarket in France - Channel Analysis to 2014'),
('Automotive Aftermarket in Germany - Channel Analysis to 2014'),
('Automotive Aftermarket in Italy - Channel Analysis to 2014'),
... about 900 more rows
SELECT
feed_new.new_title AS `New Title FROM Feed`,
prod.prod_pub_prod_id AS `Lib Code FROM DB`,
prod.prod_title AS `Title FROM DB`,
prod.prod_num AS `Prod Num`,
prod.prod_published AS `Published FROM DB`
FROM feed_new JOIN prod
ON LEFT(feed_new.new_title, LENGTH(feed_new.new_title) - 5) =
LEFT(prod.prod_title, LENGTH(prod.prod_title) - 5)
WHERE prod.pub_id = @PUBID AND prod.prod_discont = 0
ORDER BY feed_new.new_title;
==========
>explain
-> SELECT
-> feed_new.new_title AS `New Title FROM Feed`,
-> prod.prod_pub_prod_id AS `Lib Code FROM DB`,
-> prod.prod_title AS `Title FROM DB`,
-> prod.prod_num AS `Prod Num`,
-> prod.prod_published AS `Published FROM DB`
-> FROM feed_new JOIN prod
-> ON LEFT(feed_new.new_title, LENGTH(feed_new.new_title) - 5) =
LEFT(prod.prod_title, LENGTH(prod.prod_title) - 5)
-> WHERE prod.pub_id = @PUBID AND prod.prod_discont = 0
-> ORDER BY feed_new.new_title
-> \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: feed_new
type: index
possible_keys: NULL
key: new_title
key_len: 768
ref: NULL
rows: 1
Extra: Using index
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: prod
type: ref
possible_keys: pub_id,pub_id_2
key: pub_id
key_len: 48
ref: const
rows: 9817
Extra: Using where
2 rows in set (0.00 sec)
==========
I assume that pub_id is the best key to use to limit the rows pulled from
prod, since it cuts it down to 9817 rows from 110000. I'm guessing that even
though prod_title is a key (MUL), using it in the expression
LEFT(prod.prod_title, LENGTH(prod.prod_title) - 5)
keeps it from being used as a key.
If I replace the expressions in the ON clause with simple matches, I get a
very different result:
>explain
-> SELECT
-> feed_new.new_title AS `New Title FROM Feed`,
-> prod.prod_pub_prod_id AS `Lib Code FROM DB`,
-> prod.prod_title AS `Title FROM DB`,
-> prod.prod_num AS `Prod Num`,
-> prod.prod_published AS `Published FROM DB`
-> FROM feed_new JOIN prod
-> ON feed_new.new_title = prod.prod_title
-> WHERE prod.pub_id = @PUBID AND prod.prod_discont = 0
-> ORDER BY feed_new.new_title
-> \G
*************************** 1. row **************************
id: 1
select_type: SIMPLE
table: feed_new
type: index
possible_keys: new_title
key: new_title
key_len: 768
ref: NULL
rows: 882
Extra: Using index
*************************** 2. row **************************
id: 1
select_type: SIMPLE
table: prod
type: ref
possible_keys: pub_id,prod_title,pub_id_2,prod_title_fulltext
key: prod_title
key_len: 768
ref: giiexpr_db.feed_new.new_title
rows: 1
Extra: Using where
2 rows in set (0.00 sec)
=====
Of course, in this case the number of qualifying rows is vastly different; but
I would think the optimization has to happen before the results of the SELECT
are known. No optimizer is that prescient. :-)
I'm not really surprised by the different optimizations, I'm surprised by the
difference in execution time. Using the function-based expressions in the ON
clause takes 2:03.38 minutes to return 267 rows (the first time, the second
time is a little faster); using a simple comparison with no functions takes
.03 seconds to return 1 row.
I benchmarked the expression I'm using
select benchmark(10000000,left("asbcdkdfsaklfjdakl;fjasdl;fjasdl;fjasd",
length("asbcdkdfsaklfjdakl;fjasdl;fjasdl;fjasd") - 5));
and the result was 2.87 seconds. That's ten million evaluations.
So where is the time going?
Regards,
Jerry Schwartz
Global Information Incorporated
195 Farmington Ave.
Farmington, CT 06032
860.674.8796 / FAX: 860.674.8341
www.the-infoshop.com
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[email protected]