I'm not an SQL guru by any means, so seeing this made a light go on. Does that mean it is generally a good idea to add "limit 1" to a select that you know should only return 1 row? I'm assuming this works because the engine can short-circuit as soon as it finds that first matching row.
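As a minimal sketch of that intuition (the table name `t` and the data are made up for illustration): LIMIT 1 caps the result at one row, so on a plain scan the engine can stop stepping as soon as the first match is found. Note that in the query from this thread the bigger win likely came from the planner choosing a different plan, not just from the early exit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", ((i,) for i in range(100000)))

# Without LIMIT, SQLite scans the whole table even though only one row matches.
# With LIMIT 1, the engine can stop as soon as the first matching row is found.
rows = conn.execute("SELECT x FROM t WHERE x = 54321 LIMIT 1").fetchall()
```

Whether this actually saves time depends on where the matching row sits in the scan order and on the chosen query plan, so it is worth checking with EXPLAIN QUERY PLAN rather than assuming.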
> -----Original Message-----
> From: Dani Va [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, October 31, 2007 8:30 AM
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] Performance problem for a simple select with range
>
> First, thanks, your suggestion worked.
>
> To my surprise, it was enough to add "limit 1" to the original query.
>
> So:
>
> select * from blocks, locations
> where locations.locid = blocks.locid
>   AND ? >= blocks.startIpNum AND ? <= blocks.endIpNum
> limit 1
>
> takes about 1.398e-05 seconds, and
>
> select * from blocks, locations
> where locations.locid = blocks.locid
>   AND ? >= blocks.startIpNum AND ? <= blocks.endIpNum
>
> takes about 3 seconds.
>
> Igor Tandetnik wrote:
> >
> > Dani Valevski <[EMAIL PROTECTED]> wrote:
> >>> I think I have a performance problem for a simple select with range.
> >>>
> >>> My Tables:
> >>> CREATE TABLE locations(
> >>>   locid INTEGER PRIMARY KEY,
> >>>   country TEXT,
> >>>   region TEXT,
> >>>   city TEXT,
> >>>   postalCode TEXT,
> >>>   latitude REAL,
> >>>   longitude REAL,
> >>>   dmaCode INTEGER,
> >>>   areaCode INTEGER)
> >>>
> >>> CREATE TABLE blocks(
> >>>   startIpNum INTEGER,
> >>>   endIpNum INTEGER,
> >>>   locId INTEGER)
> >>>
> >>> My Data:
> >>> http://www.maxmind.com/app/geolitecity
> >>> Blocks table has 2,776,436 rows
> >>> Locations table has 159,488 rows
> >>>
> >>> After inserting the data I run analyze.
> >>>
> >>> My Query:
> >>> select * from blocks, locations
> >>> where locations.locid = blocks.locid
> >>>   AND ? >= blocks.startIpNum AND ? <= blocks.endIpNum
> >>> (replace ? with a number)
> >>>
> >>> Performance issues:
> >>> I use Python's sqlite3 module to run the query.
> >>> With this configuration it takes about 0.6 seconds to complete the
> >>> query. I think this is too slow. I could write a binary tree myself
> >>> and have searches like this take O(log(num_rows)), which is
> >>> 7*something_which_shouldnt_take_too_much. Am I wrong?
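The schema and query above can be reproduced with Python's sqlite3 module; the sketch below uses two hand-made rows rather than the MaxMind dataset, and the IP value is a made-up integer for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE locations(
    locid INTEGER PRIMARY KEY,
    country TEXT, region TEXT, city TEXT, postalCode TEXT,
    latitude REAL, longitude REAL, dmaCode INTEGER, areaCode INTEGER);
CREATE TABLE blocks(
    startIpNum INTEGER, endIpNum INTEGER, locId INTEGER);
""")
conn.execute("INSERT INTO locations(locid, country, city) VALUES (1, 'US', 'Example')")
conn.execute("INSERT INTO blocks VALUES (16777216, 16777471, 1)")

ip = 16777300  # hypothetical IP address already converted to an integer
# The original query from the thread, with "limit 1" and the IP bound twice.
row = conn.execute(
    "SELECT * FROM blocks, locations "
    "WHERE locations.locid = blocks.locId "
    "AND ? >= blocks.startIpNum AND ? <= blocks.endIpNum LIMIT 1",
    (ip, ip)).fetchone()
```

Binding the parameter twice mirrors the `?`/`?` placeholders in the original; `fetchone()` returns the joined row, blocks columns first.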
> >
> > And what would you use as a key for this binary tree? I bet you would
> > utilize additional information that the DB engine doesn't have - that
> > your blocks don't overlap (they don't, right?). Try coming up with a
> > search strategy without making this assumption.
> >
> > Try this: create an index on startIpNum, and run a query like this:
> >
> > select * from blocks, locations
> > where blocks.startIpNum <= ? and blocks.locid = locations.locid
> > order by blocks.startIpNum desc limit 1;
> >
> > This gives you the record with the largest value of startIpNum that is
> > still smaller than the threshold, and should be very fast. It can
> > produce a false positive - make the additional check for (? <=
> > endIpNum) in your application code. Don't put this check into the
> > query though, or you will force it back into O(N) behavior in case your
> > target value doesn't fall within any block after all.
> >
> > Igor Tandetnik
>
> --
> View this message in context: http://www.nabble.com/Performance-problem-for-a-simple-select-with-range-tf4711654.html#a13509241
> Sent from the SQLite mailing list archive at Nabble.com.

-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------
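Igor's suggestion - an index on startIpNum, a descending ORDER BY with LIMIT 1, and the endIpNum check done in application code - can be sketched as follows. The index name, the `lookup` helper, and the tiny dataset are illustrative, not from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE locations(locid INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE blocks(startIpNum INTEGER, endIpNum INTEGER, locId INTEGER);
CREATE INDEX idx_blocks_start ON blocks(startIpNum);
""")
conn.execute("INSERT INTO locations VALUES (1, 'US')")
conn.execute("INSERT INTO blocks VALUES (100, 200, 1)")
conn.execute("INSERT INTO blocks VALUES (300, 400, 1)")

def lookup(ip):
    # Largest startIpNum <= ip; with the index this is one seek plus one step.
    row = conn.execute(
        "SELECT * FROM blocks, locations "
        "WHERE blocks.startIpNum <= ? AND blocks.locId = locations.locid "
        "ORDER BY blocks.startIpNum DESC LIMIT 1", (ip,)).fetchone()
    # The candidate can be a false positive, so check endIpNum here,
    # in application code, rather than in the SQL - putting it in the
    # query would force a scan when the ip falls between blocks.
    if row is not None and ip <= row[1]:
        return row
    return None
```

A value inside a block (e.g. 150) returns its row; a value in the gap between blocks (e.g. 250) finds the candidate (100, 200) but fails the application-side check and returns None, which is exactly the false-positive case Igor describes.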