Sure post it data and the website for download.

I'd like to see if it's reproducible on other systems.

Thanks,
Ken



--- On Thu, 10/9/08, James Pringle <[EMAIL PROTECTED]> wrote:

> From: James Pringle <[EMAIL PROTECTED]>
> Subject: Re: [sqlite] Adding index to table makes SELECT much slower. Why?
> To: sqlite-users@sqlite.org
> Date: Thursday, October 9, 2008, 11:16 AM
> Hi all-
> 
>    Thanks for your many suggestions.  I have tried many of
> your suggestions,
> and found the following:
> 
> 
>    1. If I create the database with depth ordered from
> least to greatest,
>    and then create an index, the searches run an order of
> magnitude faster than
>    without an index.  I.e. the search for matching entries
> in the data base
>    concludes in about 9 seconds, instead of the 90 seconds
> it had before.
>    Thanks to "developir" for this suggestion!
>    2. searching via "rowid" makes no difference.
>    3. doing vacuum makes no difference, since this is a
> write once, read
>    many database.
>    4. ulimit on my machine is unlimited, so this was not
> important for me.
>    5. Cache size seems to make no difference -- and the
> SQLite documentation
>    says it should only be important for DELETE's and
> UPDATE's.
>    6. I like the idea of using RTREE, however, I don't
> want each of the
>    students who use this database to have to recompile the
> version of SQLite
>    that comes with their version of python.
> 
> Several of you, in particular Ken, suggested that I run SQL
> queries that
> would provide some output that would be helpful in figuring
> out what was
> going on.    The database is now more than fast enough for
> me, so I would
> not do this on my own.  However, if others on the mailing
> list would like to
> see this output in order to make SQLite better (or just for
> their own
> curiosity), please feel free to let me know, and I can make
> the runs.
> 
> I have a small subset of the data (450Mb) which exhibits
> the same
> behaviour.  The data is public (it is from the National
> Ocean Database), and
> so if anyone wants to see it I would be happy to put it on
> my web server.
> 
> Cheers,
> and thanks to everyone who helped me!
> Jamie Pringle
> 
> On Wed, Oct 8, 2008 at 6:50 PM, James Pringle
> <[EMAIL PROTECTED]> wrote:
> > Hi-
> >
> >  I am new to sqlite, and am having a puzzling problem.
>  I have read
> > that adding an INDEX to a sqlite table can never make
> a SELECT
> > statement slower.  However, in my case, it seems to be
> making the
> > select statement an order of magnitude slower, which
> is not what I
> > wanted!  What could be going on?
> >
> >  I am calling SQLite from python 2.5 from fink useing
> the sqlite3
> > module on OS X 10.5 with 2Gb of memory.  My data base
> contains 32
> > million lines of ocean hydrographic data, stored on
> disk, with the
> > table created with the following statement:
> >
> >    CREATE TABLE hydro (lat REAL, lon REAL, year
> INTEGER,
> >    month INTEGER, day INTEGER, time REAL, cast_id
> INTEGER,
> >    depth REAL, T REAL, S REAL, water_depth REAL)
> >
> > When I perform a SELECT that returns about 0.6% of the
> data in the
> > database (about 200,000 records)
> >
> >    SELECT lat, lon, depth FROM hydro WHERE depth>=
> 49.01 AND depth <=
> 50.99
> >
> > It selects and returns the records in 82 seconds.  I
> wished to improve
> > on this time, so I created an index with the following
> command:
> >
> >    CREATE INDEX hydro_indx ON hydro (depth)
> >
> > I then tried the SELECT command again, and read the
> results into
> > memory, and it took 717 seconds!?!
> >
> > The "depth" field contains many distinct
> numeric values, with no
> > specific value matching more then 0.5% of the data
> base.  When I DROP
> > the index with the command
> >
> >    DROP INDEX hydro_indx
> >
> > The SELECT time returns to about 80 seconds,
> confirming that it is the
> > index which is slowing things down.  What is going on?
>  I have
> > repeated and confirmed these timings.
> >
> > I have listened for disk chatter and monitored the
> system, and it does
> > not seem to be thrashing swap, or otherwise becoming
> unresponsive.
> >
> > I have two questions:
> >
> >    1) Why is the index making things slower?
> >    2) How can I make my SELECT statement faster?  The
> primary
> > selection will be done
> >       on the "depth" and
> "water_depth" keys.
> >
> > I thank you for your time.
> >
> > Cheers,
> > Jamie Pringle
> >
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to