David Thieme wrote:
Scott,
Yes, the SELECT is very simple, but slow. I have tens of thousands of
records and I need the data very fast (embedded realtime system). Some
databases natively support spatial searches, using KD-trees or R-Trees or
Quad-trees to improve the search speed. I found an article that explains
how to implement a custom-spatial search in SQL 2007:
"Using Table Valued Functions in SQL Server
2005 to Implement a Spatial Data Library"
But the solution is very specific to SQL server. I thought there might be
other tricks that might be common for implementing a fast spatial search in
a database that doesn't natively support this feature.
David,
SQLite has no direct support for spatial searches, but you should be
able to get reasonable results for a table with thousands of records
using a couple of indexes on the latitude and longitude of the points,
assuming your range is a reasonably small part of your total search space.
Given a schema like this:
create table pts (
id integer primary key,
lat real,
lng real,
data text
);
You can create two indexes that will speed up the searches for points
within a rectangle.
create index lat_idx on pts(lat);
create index lng_idx on pts(lng);
Now, to do the search you can use a query like this:
select * from pts where id in
(
select id from pts where lat between :min_lat and :max_lat
intersect
select id from pts where lng between :min_lng and :max_lng
);
If you use explain query plan you can see how this will be executed:
sqlite> explain query plan select * from pts where id in
...> (
...> select id from pts where lat between :min_lat and :max_lat
...> intersect
...> select id from pts where lng between :min_lng and :max_lng
...> );
0|0|TABLE pts USING PRIMARY KEY
0|0|TABLE pts WITH INDEX lat_idx
0|0|TABLE pts WITH INDEX lng_idx
Or in all its excruciating detail using explain:
sqlite> explain select * from pts where id in
...> (
...> select id from pts where lat between :min_lat and :max_lat
...> intersect
...> select id from pts where lng between :min_lng and :max_lng
...> );
addr opcode p1 p2 p3
---- -------------- ---------- ----------
---------------------------------
0 Goto 0 78
1 Integer 0 0
2 OpenRead 0 2
3 SetNumColumns 0 4
4 MemLoad 0 0
5 If 0 63
6 MemInt 1 0
7 OpenEphemeral 3 0 keyinfo(1,BINARY)
8 SetNumColumns 3 1
9 OpenEphemeral 4 1 keyinfo(1,BINARY)
10 Integer 0 0
11 OpenRead 6 3 keyinfo(1,BINARY)
12 SetNumColumns 6 2
13 Variable 2 0 :max_lat
14 IsNull -1 29
15 MakeRecord 1 0 e
16 MemStore 2 1
17 Variable 1 0 :min_lat
18 IsNull -1 29
19 MakeRecord 1 0 e
20 MoveGe 6 29
21 MemLoad 2 0
22 IdxGE 6 29 +
23 Column 6 0
24 IsNull 1 28
25 IdxRowid 6 0
26 MakeRecord 1 0
27 IdxInsert 4 0
28 Next 6 21
29 Close 6 0
30 OpenEphemeral 5 1 keyinfo(1,BINARY)
31 Integer 0 0
32 OpenRead 7 4 keyinfo(1,BINARY)
33 SetNumColumns 7 2
34 Variable 4 0 :max_lng
35 IsNull -1 50
36 MakeRecord 1 0 e
37 MemStore 4 1
38 Variable 3 0 :min_lng
39 IsNull -1 50
40 MakeRecord 1 0 e
41 MoveGe 7 50
42 MemLoad 4 0
43 IdxGE 7 50 +
44 Column 7 0
45 IsNull 1 49
46 IdxRowid 7 0
47 MakeRecord 1 0
48 IdxInsert 5 0
49 Next 7 42
50 Close 7 0
51 Rewind 4 61
52 RowKey 4 0
53 NotFound 5 60
54 Column 4 0
55 NotNull -1 58
56 Pop 1 0
57 Goto 0 60
58 MakeRecord 1 0 c
59 IdxInsert 3 0
60 Next 4 52
61 Close 5 0
62 Close 4 0
63 Rewind 3 76
64 Column 3 0
65 IsNull -1 75
66 MustBeInt 1 75
67 NotExists 0 75
68 Rowid 0 0
69 Column 0 1
70 RealAffinity 0 0
71 Column 0 2
72 RealAffinity 0 0
73 Column 0 3
74 Callback 4 0
75 Next 3 64
76 Close 0 0
77 Halt 0 0
78 Transaction 0 0
79 VerifyCookie 0 3
80 Goto 0 1
81 Noop 0 0
Basically it scans through the section of the lat_idx between the lat
limits and saves the ids of the matching records, then it does the same
with the lng_idx. Next it loops through all the saved records in the
latitude range and checks if the same record was found in the longitude
range. If so the record id is saved in a final temp table that hold the
intersection set. Finally, it scans through that temp table and looks up
each matching row in the main points table using the primary key index
and returns the data from that row.
Say you have 10,000 points total and your points are uniformly
distributed. If your search rectangle covers 1% of the width and height
of your total search space, then the first two loops will find 1% of
those records each, or 100 records. The the loop that does the
intersection will look at the 100 records in the horizontal strip that
matches your 1% latitude range and locate the 1 record that is common to
100 records in the vertical strip that matches your longitude range. It
will then return all the data associated with that one record.
Because all the lookups are done using indexes (the intersection
operation lets you use both the latitude and longitude indexes, one in
each of the subselects), and the indexes contain the id field so it does
not have to lookup any records in the points table until the last step,
this query should run fairly quickly.
HTH
Dennis Cote
-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------