Does this type of insert need a begin/commit around it?  And would WAL mode 
make any difference?



And try increasing your cache_size to at least the size of your database file.



cache_size is in pages so the default of 2000 is 2MB.

#pragma cache_size=200000

That'll make it 200MB which is probably more than enough for your example.



The query plan looks correct to my simple eye...

SQLite version 3.7.9 2011-11-01 00:52:41
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> CREATE TABLE tableA (id INT, pos INT);
sqlite> CREATE TABLE tableB (start INT, end INT);
sqlite>
sqlite> CREATE INDEX tableA_idx on tableA (pos asc);
sqlite> CREATE INDEX tableB_idx on tableB (start asc, end asc);
sqlite>
sqlite> CREATE TABLE ids (id INT);
sqlite>
sqlite> explain query plan INSERT INTO ids SELECT id FROM tableA, tableB WHERE 
pos BETWEEN start AND end;
0|0|0|SCAN TABLE tableA (~1000000 rows)
0|1|1|SEARCH TABLE tableB USING COVERING INDEX tableB_idx (start<?) (~125000 
rows)



Michael D. Black

Senior Scientist

Advanced Analytics Directorate

Advanced GEOINT Solutions Operating Unit

Northrop Grumman Information Systems

________________________________
From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Bo Peng [ben....@gmail.com]
Sent: Tuesday, December 27, 2011 9:39 AM
To: General Discussion of SQLite Database
Subject: EXT :[sqlite] Optimizing a query with range comparison.

Dear Sqlite experts,

The schema of my test tables is as follows:

CREATE TABLE tableA (id INT, pos INT);
CREATE TABLE tableB (start INT, end INT);

CREATE INDEX tableA_idx on tableA (pos asc);
CREATE INDEX tableB_idx on tableB (start asc, end asc);

CREATE TABLE ids (id INT);

tableA saves position of about 8 million objects, and table B saves
about 40 thousand ranges. I need to find out all ids in tableA that
falls into one of the ranges in tableB, and insert the results into
table ids. I am using a query

INSERT INTO ids SELECT id FROM tableA, tableB WHERE pos BETWEEN start AND end;

with indexes on tableA.pos and tableB.start, tableB.end (combined),
this query takes hours to execute. Is there anyway to optimize this
query? My understanding is that, if the query takes each pos and
compare it to all ranges, it will be slow. If it takes each range and
get all pos fall into the range, the query will be much faster. I have
tried to 'EXPLAIN' the query but I do not understand the output
because it looks different from what is described in
http://www.sqlite.org/eqp.html. I will appreciate it if someone can
tell me what sqlite is doing for this query.

> explain select id from tableA, tableB where pos between start and end;
0|Trace|0|0|0||00|
1|Goto|0|26|0||00|
2|OpenRead|1|2|0|2|00|
3|OpenRead|0|1446|0|2|00|
4|OpenRead|2|133259|0|keyinfo(1,BINARY)|00|
5|Rewind|1|22|0||00|
6|Column|1|0|1||00|
7|IsNull|1|21|0||00|
8|Affinity|1|1|0|d|00|
9|SeekGe|2|21|1|1|00|
10|Column|1|1|1||00|
11|IsNull|1|21|0||00|
12|Affinity|1|1|0|d|00|
13|IdxGE|2|21|1|1|01|
14|Column|2|0|2||00|
15|IsNull|2|20|0||00|
16|IdxRowid|2|2|0||00|
17|Seek|0|2|0||00|
18|Column|0|0|3||00|
19|ResultRow|3|1|0||00|
20|Next|2|13|0||00|
21|Next|1|6|0||01|
22|Close|1|0|0||00|
23|Close|0|0|0||00|
24|Close|2|0|0||00|
25|Halt|0|0|0||00|
26|Transaction|0|0|0||00|
27|VerifyCookie|0|6|0||00|
28|TableLock|0|2|0|tableB|00|
29|TableLock|0|1446|0|tableA|00|
30|Goto|0|2|0||00|

Many thanks in advance,
Bo
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to