Re: [sqlite] rtree performance problems?
I'll have to look again. I was sure I had mucked with estimatedCost and found that it wasn't doing the trick for what I meant to do - but somehow the context has swapped out of my short-term memory, so I'll have to recreate it before I can say why this was the case. -scott On Wed, Sep 10, 2008 at 1:43 PM, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > > see sqlite3_index_info.estimatedCost > > http://www.sqlite.org/cvstrac/chngview?cn=5649 > >> AFAICT, when you have a join where one table has a good index, the >> virtual table cannot signal that it has an even better index. I could >> not follow the index-selection logic well enough to have any >> suggestions how to improve things. In the fts case, this breaks the >> query because the MATCH operator _only_ works when used to access the >> index. >> >> -scott > > -- > View this message in context: > http://www.nabble.com/rtree-performance-problems--tp19035092p19422728.html > Sent from the SQLite mailing list archive at Nabble.com. > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] rtree performance problems?
see sqlite3_index_info.estimatedCost http://www.sqlite.org/cvstrac/chngview?cn=5649 > AFAICT, when you have a join where one table has a good index, the > virtual table cannot signal that it has an even better index. I could > not follow the index-selection logic well enough to have any > suggestions how to improve things. In the fts case, this breaks the > query because the MATCH operator _only_ works when used to access the > index. > > -scott -- View this message in context: http://www.nabble.com/rtree-performance-problems--tp19035092p19422728.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] rtree performance problems?
[Sorry for the blast from the past.] I think this class of problem does also happen on fts, there was a thread on August 7 on sqlite-dev about it. Unfortunately, I don't see any open-access web-mirrors of that list to refer to, but here's a members-only ref: http://sqlite.org:8080/cgi-bin/mailman/private/sqlite-dev/2008-August/000330.html AFAICT, when you have a join where one table has a good index, the virtual table cannot signal that it has an even better index. I could not follow the index-selection logic well enough to have any suggestions how to improve things. In the fts case, this breaks the query because the MATCH operator _only_ works when used to access the index. -scott On Tue, Aug 19, 2008 at 12:31 PM, Hartwig Wiesmann <[EMAIL PROTECTED]> wrote: > > Am 18.08.2008 um 21:32 schrieb Dennis Cote: > >> Thomas Sailer wrote: >>> >>> Interestingly, the original query is extremely compute-bound, there >>> is >>> almost no disk activity! >>> >>> Looking at the output opcodes from the queries, I can't see any >>> significant difference. Though I have to admit I'm by far no expert >>> in >>> vmdb opcodes... >>> >> >> You can use "explain query plan " to get some clues as to >> what is happening, but the vdbe code shows it best. >> >> The original query is doing a full table scan through all 2.6 million >> records in the main table, and for each record it uses the rtree to >> locate the 20 records inside the rectangle set by your limits, then it >> compares the id of each of these records to see if it matches the id >> of >> the main table record. For each match it dumps all the data in both >> records. >> >> The second query use the rtree index to locate the 20 records in the >> limit rectangle, and then uses the btree index to locate those 20 >> records quickly, and then dumps the data for that record. >> >> HTH >> Dennis Cote >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > Hello, > > I had an identical experience a few days ago and posted it under a > different topic. Because SQLite is not doing well as mentioned in the > example of the documentation I suggest to change the documentation > using the proposed solution. I also had to find it out the hard way > and I think by modifying the documentation there are at least a few > people less who have to go that way. > > BTW: I suppose the same problem occurs when using fts? > > Hartwig > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] rtree performance problems?
Am 18.08.2008 um 21:32 schrieb Dennis Cote: > Thomas Sailer wrote: >> >> Interestingly, the original query is extremely compute-bound, there >> is >> almost no disk activity! >> >> Looking at the output opcodes from the queries, I can't see any >> significant difference. Though I have to admit I'm by far no expert >> in >> vmdb opcodes... >> > > You can use "explain query plan " to get some clues as to > what is happening, but the vdbe code shows it best. > > The original query is doing a full table scan through all 2.6 million > records in the main table, and for each record it uses the rtree to > locate the 20 records inside the rectangle set by your limits, then it > compares the id of each of these records to see if it matches the id > of > the main table record. For each match it dumps all the data in both > records. > > The second query use the rtree index to locate the 20 records in the > limit rectangle, and then uses the btree index to locate those 20 > records quickly, and then dumps the data for that record. > > HTH > Dennis Cote > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > Hello, I had an identical experience a few days ago and posted it under a different topic. Because SQLite is not doing well as mentioned in the example of the documentation I suggest to change the documentation using the proposed solution. I also had to find it out the hard way and I think by modifying the documentation there are at least a few people less who have to go that way. BTW: I suppose the same problem occurs when using fts? Hartwig ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] rtree performance problems?
Thomas Sailer wrote: > > Interestingly, the original query is extremely compute-bound, there is > almost no disk activity! > > Looking at the output opcodes from the queries, I can't see any > significant difference. Though I have to admit I'm by far no expert in > vmdb opcodes... > You can use "explain query plan " to get some clues as to what is happening, but the vdbe code shows it best. The original query is doing a full table scan through all 2.6 million records in the main table, and for each record it uses the rtree to locate the 20 records inside the rectangle set by your limits, then it compares the id of each of these records to see if it matches the id of the main table record. For each match it dumps all the data in both records. The second query use the rtree index to locate the 20 records in the limit rectangle, and then uses the btree index to locate those 20 records quickly, and then dumps the data for that record. HTH Dennis Cote ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] rtree performance problems?
On Mon, 2008-08-18 at 11:16 -0600, Dennis Cote wrote: > Does this query run faster? > > select * from mapelements > where ID in > ( > select ID from mapelements_rtree > where mapelements_rtree.NELAT>=7900 > and mapelements_rtree.SWLAT<=8000 > and mapelements_rtree.NELON>=7900 > and mapelements_rtree.SWLON<=8000 > ); Yes; this query terminates in less than a second. Technically it's not quite the same thing (doesn't return the rtree columns), but it's what I want :-) Interestingly, the original query is extremely compute-bound, there is almost no disk activity! Looking at the output opcodes from the queries, I can't see any significant difference. Though I have to admit I'm by far no expert in vmdb opcodes... sqlite> explain select * from mapelements,mapelements_rtree where mapelements_rtree.NELAT>=7900 and mapelements_rtree.SWLAT<=8000 and mapelements_rtree.NELON>=7900 and mapelements_rtree.SWLON<=8000 and mapelements.ID=+mapelements_rtree.ID; 0|Trace|0|0|0|explain select * from mapelements,mapelements_rtree where mapelements_rtree.NELAT>=7900 and mapelements_rtree.SWLAT<=8000 and mapelements_rtree.NELON>=7900 and mapelements_rtree.SWLON<=8000 and mapelements.ID=+mapelements_rtree.ID;|00| 1|Integer|7900|1|0||00| 2|Integer|8000|2|0||00| 3|Integer|7900|3|0||00| 4|Integer|8000|4|0||00| 5|Goto|0|42|0||00| 6|SetNumColumns|0|11|0||00| 7|OpenRead|0|2|0||00| 8|VOpen|1|0|0|vtab:870B6C0:184B20|00| 9|Rewind|0|39|0||00| 10|SCopy|1|7|0||00| 11|SCopy|2|8|0||00| 12|SCopy|3|9|0||00| 13|SCopy|4|10|0||00| 14|Integer|2|5|0||00| 15|Integer|4|6|0||00| 16|VFilter|1|38|5|DbBaDdBc|00| 17|Rowid|0|11|0||00| 18|VColumn|1|0|12||00| 19|Ne|12|37|11|collseq(BINARY)|6c| 20|Rowid|0|13|0||00| 21|Column|0|1|14||00| 22|Column|0|2|15||00| 23|Column|0|3|16||00| 24|Column|0|4|17||00| 25|Column|0|5|18||00| 26|Column|0|6|19||00| 27|Column|0|7|20||00| 28|Column|0|8|21||00| 29|Column|0|9|22||00| 30|Column|0|10|23||00| 31|VColumn|1|0|24||00| 32|VColumn|1|1|25||00| 33|VColumn|1|2|26||00| 34|VColumn|1|3|27||00| 35|VColumn|1|4|28||00| 36|ResultRow|13|16|0||00| 37|VNext|1|17|0||00| 38|Next|0|10|0||00| 39|Close|0|0|0||00| 40|Close|1|0|0||00| 41|Halt|0|0|0||00| 42|Transaction|0|0|0||00| 43|VerifyCookie|0|17|0||00| 44|TableLock|0|2|0|mapelements|00| 45|Goto|0|6|0||00| sqlite> explain select * from mapelements where ID in (select ID from mapelements_rtree where mapelements_rtree.NELAT>=7900 and mapelements_rtree.SWLAT<=8000 and mapelements_rtree.NELON>=7900 and mapelements_rtree.SWLON<=8000); 0|Trace|0|0|0|explain select * from mapelements where ID in (select ID from mapelements_rtree where mapelements_rtree.NELAT>=7900 and mapelements_rtree.SWLAT<=8000 and mapelements_rtree.NELON>=7900 and mapelements_rtree.SWLON<=8000);|00| 1|Goto|0|44|0||00| 2|SetNumColumns|0|11|0||00| 3|OpenRead|0|2|0||00| 4|If|2|24|0||00| 5|Integer|1|2|0||00| 6|OpenEphemeral|3|1|0|keyinfo(1,BINARY)|00| 7|Integer|7900|3|0||00| 8|Integer|8000|4|0||00| 9|Integer|7900|5|0||00| 10|Integer|8000|6|0||00| 11|VOpen|1|0|0|vtab:80BA750:184B20|00| 12|SCopy|3|9|0||00| 13|SCopy|4|10|0||00| 14|SCopy|5|11|0||00| 15|SCopy|6|12|0||00| 16|Integer|2|7|0||00| 17|Integer|4|8|0||00| 18|VFilter|1|23|7|DbBaDdBc|00| 19|VColumn|1|0|13||00| 20|MakeRecord|13|1|14|c|00| 21|IdxInsert|3|14|0||00| 22|VNext|1|19|0||00| 23|Close|1|0|0||00| 24|Rewind|3|42|0||00| 25|Column|3|0|1||00| 26|IsNull|1|41|0||00| 27|MustBeInt|1|41|0||00| 28|NotExists|0|41|1||00| 29|Rowid|0|15|0||00| 30|Column|0|1|16||00| 31|Column|0|2|17||00| 32|Column|0|3|18||00| 33|Column|0|4|19||00| 34|Column|0|5|20||00| 35|Column|0|6|21||00| 36|Column|0|7|22||00| 37|Column|0|8|23||00| 38|Column|0|9|24||00| 39|Column|0|10|25||00| 40|ResultRow|15|11|0||00| 41|Next|3|25|0||00| 42|Close|0|0|0||00| 43|Halt|0|0|0||00| 44|Transaction|0|0|0||00| 45|VerifyCookie|0|17|0||00| 46|TableLock|0|2|0|mapelements|00| 47|Goto|0|2|0||00| Thanks, Tom ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] rtree performance problems?
Thomas Sailer wrote: > > The following query is very quick, it returns 20 rows within a small fraction > of a second: > select * from mapelements_rtree where NELAT>=7900 and SWLAT<=8000 and > NELON>=7900 and SWLON<=8000; > > The following query, however, takes a long time (almost half a minute): > select * from mapelements,mapelements_rtree where > mapelements_rtree.NELAT>=7900 and mapelements_rtree.SWLAT<=8000 and > mapelements_rtree.NELON>=7900 and mapelements_rtree.SWLON<=8000 and > mapelements.ID=mapelements_rtree.ID; > > This is basically how the rtree documentation suggests to perform > selects. Why is this query so slow, and what can I do to fix/workaround > this? > Does this query run faster? select * from mapelements where ID in ( select ID from mapelements_rtree where mapelements_rtree.NELAT>=7900 and mapelements_rtree.SWLAT<=8000 and mapelements_rtree.NELON>=7900 and mapelements_rtree.SWLON<=8000 ); Dennis Cote ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] rtree performance problems?
Hello! В сообщении от Monday 18 August 2008 20:21:04 Thomas Sailer написал(а): > The following query is very quick, it returns 20 rows within a small > fraction of a second: select * from mapelements_rtree where NELAT>=7900 > and SWLAT<=8000 and NELON>=7900 and SWLON<=8000; > > The following query, however, takes a long time (almost half a minute): > select * from mapelements,mapelements_rtree where > mapelements_rtree.NELAT>=7900 and mapelements_rtree.SWLAT<=8000 and > mapelements_rtree.NELON>=7900 and mapelements_rtree.SWLON<=8000 and > mapelements.ID=mapelements_rtree.ID; Try the query: select * from mapelements where ID in (select ID from mapelements_rtree where NELAT>=7900 and SWLAT<=8000 and NELON>=7900 and SWLON<=8000); Best regards, Alexey. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] rtree performance problems?
On Mon, 2008-08-18 at 12:28 -0400, D. Richard Hipp wrote: > Please try changing the last term as shown below (add a "+" before the > r-tree ID column): > > mapelements.ID = +mapelements_rtree.ID Thanks a lot for your quick answer! Unfortunately, it didn't help. What's interesting is that the first three rows are returned quickly (sub 1 second), but then there's a very long pause until the remaining rows are returned. Thanks, Tom ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] rtree performance problems?
On Aug 18, 2008, at 12:21 PM, Thomas Sailer wrote: > > The following query, however, takes a long time (almost half a > minute): > select * from mapelements,mapelements_rtree where > mapelements_rtree.NELAT>=7900 and > mapelements_rtree.SWLAT<=8000 and > mapelements_rtree.NELON>=7900 and > mapelements_rtree.SWLON<=8000 and > mapelements.ID=mapelements_rtree.ID; Please try changing the last term as shown below (add a "+" before the r-tree ID column): mapelements.ID = +mapelements_rtree.ID Let us know what happens to performance. D. Richard Hipp [EMAIL PROTECTED] ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] rtree performance problems?
I have a database with the following schema: CREATE TABLE mapelements (ID INTEGER PRIMARY KEY NOT NULL, TYPECODE INTEGER,NAME TEXT,LAT INTEGER,LON INTEGER,SWLAT INTEGER,SWLON INTEGER,NELAT INTEGER,NELON INTEGER,POLY BLOB,TILE INTEGER); CREATE VIRTUAL TABLE mapelements_rtree USING rtree(ID,SWLAT,NELAT,SWLON,NELON); CREATE TABLE "mapelements_rtree_node"(nodeno INTEGER PRIMARY KEY, data BLOB); CREATE TABLE "mapelements_rtree_parent"(nodeno INTEGER PRIMARY KEY, parentnode INTEGER); CREATE TABLE "mapelements_rtree_rowid"(rowid INTEGER PRIMARY KEY, nodeno INTEGER); CREATE INDEX mapelements_bbox ON mapelements(SWLAT,NELAT,SWLON,NELON); CREATE INDEX mapelements_latlon ON mapelements(LAT,LON); CREATE INDEX mapelements_name ON mapelements(NAME COLLATE NOCASE); CREATE INDEX mapelements_nelat ON mapelements(NELAT); CREATE INDEX mapelements_nelon ON mapelements(NELON); CREATE INDEX mapelements_swlat ON mapelements(SWLAT); CREATE INDEX mapelements_swlon ON mapelements(SWLON); CREATE INDEX mapelements_tile ON mapelements(TILE); The database contains 2653693 rows, and is approximately 830MBytes big without the rtree index, and approximately 970MBytes with the rtree index. I have created the rtree index with the following script: CREATE VIRTUAL TABLE mapelements_rtree USING rtree(ID,SWLAT,NELAT,SWLON,NELON); INSERT INTO mapelements_rtree SELECT ID,SWLAT,NELAT,SWLON,NELON FROM mapelements; analyze; vacuum; The following query is very quick, it returns 20 rows within a small fraction of a second: select * from mapelements_rtree where NELAT>=7900 and SWLAT<=8000 and NELON>=7900 and SWLON<=8000; The following query, however, takes a long time (almost half a minute): select * from mapelements,mapelements_rtree where mapelements_rtree.NELAT>=7900 and mapelements_rtree.SWLAT<=8000 and mapelements_rtree.NELON>=7900 and mapelements_rtree.SWLON<=8000 and mapelements.ID=mapelements_rtree.ID; This is basically how the rtree documentation suggests to perform selects. Why is this query so slow, and what can I do to fix/workaround this? Thanks, Tom ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users