Hi Gavin,Thanks very much, I'll implement as many of your suggestions as possible. The varchar(255)'s are inexcusable and I feel suitably ashamed :) The queries were generated by ActiveRecord (an ORM library for Ruby), although even if I had written them myself they would probably not be much better. Regards, Ciaran Lee
2009/9/24 Gavin Towey <gto...@ffn.com> > Hi Ciaran, > > So I think there's a couple things going on: > > 1. The explain plan for your "slow" query looks wrong, such as mysql is > confused. It's possible your index statistics are incorrect. Try ANALYZE > TABLE on listings and addresses. > > I think a sure way to fix it is to add STRAIGHT_JOIN to force the join > order. That should get rid of the temp table and filesort operations and > give faster results. > > SELECT > STRAIGHT_JOIN > listings.*, addresses.* > FROM > `listings` > JOIN > `addresses` ON `addresses`.addressable_id = `listings`.id > AND > `addresses`.addressable_type = 'Listing' > WHERE (addresses.parent_city_id = 3) > ORDER BY listings.id DESC LIMIT 1 > > > > 2. I need to make some comments about your table design: > > This column is AWFUL: > `addressable_type` varchar(255) DEFAULT NULL, > > Why have field that hold up to 255 characters and put a little string in it > like "Listing?" Why does it matter? Well it makes your indexes > disasterously bloated: > > KEY `index_addresses_on_parent_city_id_and_addressable_type` > (`parent_city_id`,`addressable_type`), > > > If you noticed in the explain plan, that index is 733 bytes *per row*. > Especially using utf8 means each character takes 3 bytes in the index. > That's terrible. That type field should be a foreign key tinyint or at the > very least be a much much shorter varchar field (such as 8 or 10) > > You have lots of those varchar(255) fields, which looks like lazy design -- > by not gathering correct requirements and designing accordingly you will > hurt your database performance, waste disk space and cause yourself all > kinds of future problems. > > 3. Why are you using OUTER JOIN? > > It looks to me like you're using it because you don't know the difference, > since you're not looking for NULL rows or anything. In fact, it looks like > mysql is smart enough to know that you've negated the OUTER JOIN by putting > conditions on the joined tables in the WHERE clause, and convert then to > INNER JOINS. Don't rely on that! Use the correct join type. > > Those queries > Regards, > Gavin Towey > > -----Original Message----- > From: Ciaran Lee [mailto:ciaran....@gmail.com] > Sent: Tuesday, September 22, 2009 1:32 PM > To: mysql@lists.mysql.com > Subject: query optimization question (my struggle against 'using temporary; > using filesort') > > Hi, > > I hope this is the right place to ask a question about query optimization. > > Background: > I have a database which has events, which occur in places (listings). > Places > have addresses, and addresses belong to a city. I can select the latest > event within a particular city very efficiently (less than 2ms), but > selecting the latest listing within a city is REALLY slow (10-20 seconds) > despite being almost a subset of the event query. > > I have been working on this for about a day, and have tried all sorts of > tweaks to the indexes but to no avail. I always seem to end up with 'using > temporary; using filesort' as the 'extra' content in the explain result. If > anyone has a suggestion for what I might do to fix this, I'd really > appreciate it. If not, I could further de-normalize the database for > performance reasons, but I would feel dirty for doing so. > > Here is the fast query (select the latest event within a particular city), > and it's explain. > SELECT > events.*, listings.*, addresses.* > FROM > `events` > LEFT OUTER JOIN > `listings` ON `listings`.id = `events`.listing_id > LEFT OUTER JOIN > `addresses` ON `addresses`.addressable_id = `listings`.id > AND > `addresses`.addressable_type = 'Listing' > WHERE (addresses.parent_city_id = 3) > ORDER BY events.id DESC LIMIT 1 > > > > +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------+---------+------------------------------+------+-------------+ > | id | select_type | table | type | possible_keys > > | key > | key_len | ref | rows | Extra | > > > +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------+---------+------------------------------+------+-------------+ > | 1 | SIMPLE | events | index | index_events_on_listing_id > > | PRIMARY > | 4 | NULL | 1 | | > | 1 | SIMPLE | listings | eq_ref | PRIMARY > > | PRIMARY > | 4 | ratemyarea.events.listing_id | 1 | Using where | > | 1 | SIMPLE | addresses | ref | > > index_addresses_on_parent_city_id_and_addressable_type,index_addresses_on_addressable_type_and_addressable_id,addressable_id_type_city > | index_addresses_on_addressable_type_and_addressable_id | 773 | const, > ratemyarea.listings.id | 1 | Using where | > > > +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------+---------+------------------------------+------+-------------+ > > > Here is the slow query (select the latest listing within a particular > city), > and it's explain > SELECT > listings.*, addresses.* > FROM > `listings` > LEFT OUTER JOIN > `addresses` ON `addresses`.addressable_id = `listings`.id > AND > `addresses`.addressable_type = 'Listing' > WHERE (addresses.parent_city_id = 3) > ORDER BY listings.id DESC LIMIT 1 > > > > +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------+---------+-------------------------------------+-------+----------------------------------------------+ > | id | select_type | table | type | possible_keys > > | key | key_len | ref > | rows | Extra > | > > > +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------+---------+-------------------------------------+-------+----------------------------------------------+ > | 1 | SIMPLE | addresses | ref | > > index_addresses_on_parent_city_id_and_addressable_type,index_addresses_on_addressable_type_and_addressable_id,addressable_id_type_city > | addressable_id_type_city | 773 | const,const > | > 25680 | Using where; Using temporary; Using filesort | > | 1 | SIMPLE | listings | eq_ref | PRIMARY > > | PRIMARY | 4 | > ratemyarea.addresses.addressable_id | 1 | > | > > > +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------+---------+-------------------------------------+-------+----------------------------------------------+ > > > Here is the structure of the tables: > > CREATE TABLE `addresses` ( > `id` int(11) NOT NULL AUTO_INCREMENT, > `postal_code` varchar(255) DEFAULT NULL, > `addressable_type` varchar(255) DEFAULT NULL, > `addressable_id` int(11) DEFAULT NULL, > `parent_city_id` int(11) DEFAULT NULL, > `lat` decimal(15,10) DEFAULT NULL, > `lng` decimal(15,10) DEFAULT NULL, > PRIMARY KEY (`id`), > KEY `index_addresses_on_lat_and_lng` (`lat`,`lng`), > KEY `index_addresses_on_parent_city_id_and_addressable_type` > (`parent_city_id`,`addressable_type`), > KEY `index_addresses_on_addressable_type_and_addressable_id` > (`addressable_type`,`addressable_id`) > ) ENGINE=InnoDB AUTO_INCREMENT=120513 DEFAULT CHARSET=utf8; > > > > # Dump of table cities > # ------------------------------------------------------------ > > CREATE TABLE `cities` ( > `id` int(11) NOT NULL AUTO_INCREMENT, > `name` varchar(255) NOT NULL, > `country_id` int(11) NOT NULL, > `lat` decimal(15,10) NOT NULL, > `lng` decimal(15,10) NOT NULL, > PRIMARY KEY (`id`), > KEY `index_cities_on_name` (`name`), > KEY `index_cities_on_country_id` (`country_id`), > KEY `index_cities_on_lat_and_lng` (`lat`,`lng`) > ) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8; > > > > # Dump of table events > # ------------------------------------------------------------ > > CREATE TABLE `events` ( > `id` int(11) NOT NULL AUTO_INCREMENT, > `listing_id` int(11) NOT NULL, > `description` text, > `title` varchar(150) NOT NULL, > PRIMARY KEY (`id`), > KEY `index_events_on_title` (`title`), > KEY `index_events_on_listing_id` (`listing_id`) > ) ENGINE=InnoDB AUTO_INCREMENT=6359 DEFAULT CHARSET=utf8; > > > > # Dump of table listings > # ------------------------------------------------------------ > > CREATE TABLE `listings` ( > `id` int(11) NOT NULL AUTO_INCREMENT, > `name` varchar(255) DEFAULT NULL, > `latitude` decimal(15,10) DEFAULT NULL, > `longitude` decimal(15,10) DEFAULT NULL, > `description` text, > PRIMARY KEY (`id`), > KEY `index_listings_on_place` (`name`), > KEY `index_listings_on_latitude_and_longitude` (`latitude`,`longitude`) > ) ENGINE=InnoDB AUTO_INCREMENT=109358 DEFAULT CHARSET=utf8; > > > > # Dump of table users > # ------------------------------------------------------------ > > CREATE TABLE `users` ( > `id` int(11) NOT NULL AUTO_INCREMENT, > `email` varchar(255) DEFAULT NULL, > PRIMARY KEY (`id`), > KEY `index_users_on_email` (`email`), > KEY `index_users_on_latitude_and_longitude` (`latitude`,`longitude`) > ) ENGINE=InnoDB AUTO_INCREMENT=1972 DEFAULT CHARSET=utf8; > > The information contained in this transmission may contain privileged and > confidential information. It is intended only for the use of the person(s) > named above. If you are not the intended recipient, you are hereby notified > that any review, dissemination, distribution or duplication of this > communication is strictly prohibited. If you are not the intended recipient, > please contact the sender by reply email and destroy all copies of the > original message. >