On 10/8/08, James Pringle <[EMAIL PROTECTED]> wrote: > Hi- > > I am new to sqlite, and am having a puzzling problem. I have read > that adding an INDEX to a sqlite table can never make a SELECT > statement slower. However, in my case, it seems to be making the > select statement an order of magnitude slower, which is not what I > wanted! What could be going on? > > I am calling SQLite from python 2.5 from fink useing the sqlite3 > module on OS X 10.5 with 2Gb of memory. My data base contains 32 > million lines of ocean hydrographic data, stored on disk, with the > table created with the following statement: > > CREATE TABLE hydro (lat REAL, lon REAL, year INTEGER, > month INTEGER, day INTEGER, time REAL, cast_id INTEGER, > depth REAL, T REAL, S REAL, water_depth REAL) > > When I perform a SELECT that returns about 0.6% of the data in the > database (about 200,000 records) > > SELECT lat, lon, depth FROM hydro WHERE depth>= 49.01 AND depth <= 50.99 > > It selects and returns the records in 82 seconds. I wished to improve > on this time, so I created an index with the following command: > > CREATE INDEX hydro_indx ON hydro (depth) > > I then tried the SELECT command again, and read the results into > memory, and it took 717 seconds!?! > > The "depth" field contains many distinct numeric values, with no > specific value matching more then 0.5% of the data base. When I DROP > the index with the command > > DROP INDEX hydro_indx > > The SELECT time returns to about 80 seconds, confirming that it is the > index which is slowing things down. What is going on? I have > repeated and confirmed these timings.
Most likely because it doesn't have much it can optimize (you note above that "no specific value matching more than 0.5% of the data" -- with 32 million rows, that is a lot of variation). Think of it like image compression -- if you have too much variation in your image... imagine the worst case -- every pixel is a different color. You can't really compress it. It is sort of like that (dunno if that is a good analogy). The db goes through the index wasting time. You drop the index and it does its thing anyway. Look through the archives with talk about possible controlling the use of an index in a query... just a few days ago. > > I have listened for disk chatter and monitored the system, and it does > not seem to be thrashing swap, or otherwise becoming unresponsive. > > I have two questions: > > 1) Why is the index making things slower? > 2) How can I make my SELECT statement faster? The primary > selection will be done > on the "depth" and "water_depth" keys. > > I thank you for your time. > > Cheers, > Jamie Pringle > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users