First of all, thanks; your suggestion worked.

To my surprise, it was enough to add "limit 1" to the original query.

So:

select * from blocks,locations where locations.locid = blocks.locid AND ? >=
blocks.startIpNum AND ? <= blocks.endIpNum limit 1
takes about 1.398e-05 seconds 

and 

select * from blocks,locations where locations.locid = blocks.locid AND ? >=
blocks.startIpNum AND ? <= blocks.endIpNum 
takes about 3 seconds.
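
In case it helps, here is a rough sketch of how the timing can be reproduced with Python's sqlite3 module (the database file name and the test address below are just placeholders):

import sqlite3
import time

conn = sqlite3.connect("geolite.db")   # placeholder path
ip = 3512069738                        # placeholder IPv4 address as an integer

# Drop the "limit 1" at the end to reproduce the slow, full-scan variant.
query = ("select * from blocks, locations "
         "where locations.locid = blocks.locid "
         "and ? >= blocks.startIpNum and ? <= blocks.endIpNum limit 1")

start = time.time()
row = conn.execute(query, (ip, ip)).fetchone()
print(time.time() - start, "seconds:", row)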





Igor Tandetnik wrote:
> 
> Dani Valevski <[EMAIL PROTECTED]> wrote:
>>> I think I have a performance problem for a simple select with range.
>>>
>>> My Tables:
>>> CREATE TABLE locations(
>>>                locid    INTEGER PRIMARY KEY,
>>>                country TEXT,
>>>                region    TEXT,
>>>                city    TEXT,
>>>                postalCode TEXT,
>>>                latitude REAL,
>>>                longitude REAL,
>>>                dmaCode INTEGER,
>>>                areaCode INTEGER)
>>>
>>> CREATE TABLE blocks(
>>>                startIpNum INTEGER,
>>>                endIpNum INTEGER,
>>>                locId INTEGER)
>>>
>>> My Data:
>>> http://www.maxmind.com/app/geolitecity
>>> Blocks table has 2,776,436 rows
>>> Locations table has 159,488 rows
>>>
>>> After inserting the data I run analyze.
>>>
>>> My Query:
>>> select * from blocks,locations where locations.locid = blocks.locid
>>> AND ? >= blocks.startIpNum AND ? <= blocks.endIpNum
>>> (replace ? with a number)
>>>
>>> Performance issues:
>>> I use python's sqlite3 module to run the query.
>>> With this configuration it takes about 0.6 seconds to complete the
>>> query. I
>>> think this is too slow. I could write a binary tree myself and have
>>> searches
>>> like this take, O(log(num_rows)) which is
>>> 7*something_which_shouldnt_take_too_much. Am I wrong?
> 
> And what would you use as a key for this binary tree? I bet you would 
> utilize additional information that the DB engine doesn't have - that 
> your blocks don't overlap (they don't, right?) Try coming up with a 
> search strategy without making this assumption.
> 
> Try this: create an index on startIpNum, and run a query like this:
> 
> select * from blocks, locations
> where blocks.startIpNum <= ? and blocks.locid = locations.locid
> order by blocks.startIpNum desc limit 1;
> 
> This gives you the record with the largest value of startIpNum that is 
> still smaller than the threshold, and should be very fast. It can 
> produce a false positive - make the additional check for (? <= 
> endIpNum) in your application code. Don't put this check into the 
> query though, or you will force it back into O(N) behavior in case your 
> target value doesn't fall within any block after all.
> 
> Igor Tandetnik 
> 
> 
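
For reference, a rough Python sketch of the approach Igor describes above, with the index created up front and the upper-bound check done in application code (the file name is again a placeholder):

import sqlite3

conn = sqlite3.connect("geolite.db")   # placeholder path

# One-time setup: an index on startIpNum lets the ORDER BY ... DESC LIMIT 1
# query walk the index instead of scanning all 2,776,436 rows.
conn.execute("create index if not exists idx_blocks_start on blocks(startIpNum)")

def lookup(ip):
    row = conn.execute(
        "select * from blocks, locations "
        "where blocks.startIpNum <= ? and blocks.locid = locations.locid "
        "order by blocks.startIpNum desc limit 1",
        (ip,)).fetchone()
    # The query can return a false positive, so verify the upper bound here
    # rather than in the SQL; putting it in the query would fall back to
    # O(N) behavior when the address is not inside any block.
    if row is not None and ip <= row[1]:   # row[1] is blocks.endIpNum
        return row
    return None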



