Re: Performance Problem on query kind of like a group by

Michael Stassen Thu, 12 Jan 2006 22:44:18 -0800

Jim Tyrrell wrote:

Everyone,


I finally feel let down by mysql after 5 years of
great use.  I break most things in weeks so this is a
heck of a record.  I am sure I am being a dummy on
this, but
am wondering if there is some setting somewhere to
help out a query like this.

Given a table like this:
FeatureID is an autonumber and the group field keeps
things together aka version control.
FeatureID, FeatureGroupID, FeatureName
1,         1,              "Version 1"
2,         1,              "Version 2"
3,         1,              "Version 3"
4,         1,              "Version 4"
...
999,       1,              "Version 999"


Over small sets I write a query like this and get
stuff quickly:
Select FeatureName from Feature F where FeatureID =
(Select max(FeatureID) from Feature where
FeatureGroupID = F.FeatureGroupID)
Giving me Verison 999

In MySQL for 1000 versioned records this takes almost
half a second on my machine.  On my machine for MS SQL
this returns right away basically in no time.  I make
sure cache is not in play by inserting a set of
records and then running the query.  Also the
FeatureGroupID is indexed and so is the featureID.

Is there a setting to make this work in mysql
administration somewhere?
Is there a word or set of words that I can search for?
I have looked high and low and have not been able to
make this work fast enough for me today.

Someone I work with suggested using group by and this
seems to be even worse then what I am doing in the
above query like seconds or longer. hmmmm!!!!

Also note that also can be considered records in the
table such that
1000,        2,              "Version 2 1"
1001,        2,              "Version 2 2"
1002,        2,              "Version 2 2"

The query Select FeatureName from Feature where
FeatureID = (Select max(FeatureID) from Feature)
will not work for me since i want all of the highest
FeatureID'd, FeatureGroupID'd stuff.

I have some settings in my my.cnf like:
join_buffer_size = 2M
read_buffer_size = 2M
sort_buffer_size = 3M
read_rnd_buffer_size = 3m

These have not had any effect.

Also caching is not an option because the first hit in
one of my more complicated use cases takes over 7
seconds.

Someone please help.

karma is yours if you can help me on this.

Thank You
Jim Tyrrell

Speed questions usually come down to having the right indexes on the tablesinvolved, and writing queries so the indexes are properly used. The settings inmy.cnf are rarely the issue.

First, you need to find the max FeatureID for each FeatureGroupID. This will gofastest with a multi-column index on (FeatureGroupID, FeatureID), in that order.Do you have that? {We'd know if you had included the output of "SHOW CREATETABLE Feature".} A multi-column index on (FeatureGroupID, FeatureID) will alsofunction as a single-column index on (FeatureGroupID), so you don't need aseparate index on that column.

Next, you want to find the rows in table Feature whose FeatureGroupID andFeatureID match the results of step one. This is a frequently asked question,with 3 solutions given in the manual<http://dev.mysql.com/doc/refman/4.1/en/example-maximum-column-group-row.html>.

The next question is whether or not mysql is using the index to optimize yourquery. Your query is the subquery solution from the above-referenced manualpage. Unfortunately, mysql doesn't always optimize queries with subqueriesproperly. To find out what mysql is doing, put EXPLAIN in front of your query<http://dev.mysql.com/doc/refman/4.1/en/explain.html>. Post the results if youneed help interpreting them.

If it turns out that the subquery has fooled mysql into not using the index tohelp your query, then the temporary table solution in the manual will almostcertainly be faster:


  # Replace "INT" as needed to match the types of these fields in table Feature
  CREATE TEMPORARY TABLE max_fids (FeatureGroupID INT, FeatureID INT);

  # Avoid anyone changing the data in the middle
  LOCK TABLES Feature READ;

  # Step 1: Get the max FeatureID for each FeatureGroupID
  INSERT INTO max_fids SELECT FeatureGroupID, MAX(FeatureID)
  FROM Feature GROUP BY FeatureGroupID;

  #Step 2: Find the rows in Feature which match the results in Step 1 via a JOIN
  SELECT f.FeatureName,
  FROM max_fids m
  JOIN Feature f
    ON f.FeatureGroupID = m.FeatureGroupID
   AND f.FeatureID = m.FeatureID;

  #clean up
  UNLOCK TABLES;
  DROP TABLE max_fids;

That looks a lot more complex, but as the index on (FeatureGroupID, FeatureID)will almost certainly be used for both steps, it should be very fast.


Michael

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Re: Performance Problem on query kind of like a group by

Reply via email to