Hello All,
 
I've using SQLite 2.8.16 with a music management application, and I'm
trying out why certain queries take a long time and to figure out how
SQLite uses my indexes.  I've got a master "media" table with a couple
of auxiliary tables like "artists", "albums",  and "genres".  Each
table's primary key is ____id (mediaid, artistid, albumid, and genreid),
and the media table contains columns to each of the side tables' primary
key (and an index for each one, too).
 
I found that a particular query  takes forever when run on a particular
database:
SELECT DISTINCT albums.albumid, albums.name_lower FROM artists, albums,
media WHERE albums.media_count > 0 AND media.albumid=albums.albumid AND
media.artistid=artists.artistid ORDER BY artists.name_key,
albums.name_key;
 
Well, the largest table (by far) in these queries is always the media
table, and I found that by moving the media table to the beginning of
the FROM list, it runs almost instantly:
SELECT DISTINCT albums.albumid, albums.name_lower FROM media, artists,
albums WHERE albums.media_count > 0 AND media.albumid=albums.albumid AND
media.artistid=artists.artistid ORDER BY artists.name_key,
albums.name_key;
 
Thus, when "media" is first, SQLite scans each record in the media table
and tests the smaller tables with the NotExists command which uses the
primary key to locate the record immediately.
 
I've been reading up on the virtual machine, op-codes, and performance
tips in the wiki for the past few days, and I've got a few questions
about what was happening:
 
1)  Shouldn't it be faster to iterate through one of the smaller tables
and then use an index on the media table to join with the other small
table?   
 
2)  It's unclear to me what order my FROM tables should be in.  In the
first query, did I accidently choose the absolute worst order for the
FROM list?  
 
So the tables in the FROM clause should appear in the order that tables
are introduced, with special consideration to the first table in the
list be the primary value we are SELECTing?  In the query above, that
would mean "FROM albums, media, artists".  (Both "FROM media, albums,
artists" and "FROM albums, media, artists" run too fast for eye-ball
speed compare.)
 
3)  What role, if any, do the ORDER BY columns play in the FROM order?
(I would assume none since by the time you've selected a row, you have
all of the data for the sorting of that row.)
 
4)  Is there a query optimizer for sqlite?  A program that automagically
discovered which queries were not in optimal form that could propose a
different order or alternate values for certain terms in FROM and WHERE
clauses?  I'm thinking that the input would be a query and an already
existing database (with indexes), and the output would be a (better)
optimized query.  This would be especially useful for SQLite since
everything must be hand-tuned.
 
Thanks!
Aaron

Reply via email to