Re: [sqlite] rtree performance problems?

2008-09-10 Thread Scott Hess
I'll have to look again.  I was sure I had mucked with estimatedCost
and found that it wasn't doing the trick for what I meant to do - but
somehow the context has swapped out of my short-term memory, so I'll
have to recreate it before I can say why this was the case.

-scott


On Wed, Sep 10, 2008 at 1:43 PM, [EMAIL PROTECTED]
<[EMAIL PROTECTED]> wrote:
>
> see sqlite3_index_info.estimatedCost
>
> http://www.sqlite.org/cvstrac/chngview?cn=5649
>
>> AFAICT, when you have a join where one table has a good index, the
>> virtual table cannot signal that it has an even better index.  I could
>> not follow the index-selection logic well enough to have any
>> suggestions how to improve things.  In the fts case, this breaks the
>> query because the MATCH operator _only_ works when used to access the
>> index.
>>
>> -scott
>
> --
> View this message in context: 
> http://www.nabble.com/rtree-performance-problems--tp19035092p19422728.html
> Sent from the SQLite mailing list archive at Nabble.com.
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] rtree performance problems?

2008-09-10 Thread [EMAIL PROTECTED]

see sqlite3_index_info.estimatedCost

http://www.sqlite.org/cvstrac/chngview?cn=5649

> AFAICT, when you have a join where one table has a good index, the
> virtual table cannot signal that it has an even better index.  I could
> not follow the index-selection logic well enough to have any
> suggestions how to improve things.  In the fts case, this breaks the
> query because the MATCH operator _only_ works when used to access the
> index.
> 
> -scott

-- 
View this message in context: 
http://www.nabble.com/rtree-performance-problems--tp19035092p19422728.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] rtree performance problems?

2008-09-10 Thread Scott Hess
[Sorry for the blast from the past.]

I think this class of problem does also happen on fts, there was a
thread on August 7 on sqlite-dev about it.  Unfortunately, I don't see
any open-access web-mirrors of that list to refer to, but here's a
members-only ref:
http://sqlite.org:8080/cgi-bin/mailman/private/sqlite-dev/2008-August/000330.html

AFAICT, when you have a join where one table has a good index, the
virtual table cannot signal that it has an even better index.  I could
not follow the index-selection logic well enough to have any
suggestions how to improve things.  In the fts case, this breaks the
query because the MATCH operator _only_ works when used to access the
index.

-scott


On Tue, Aug 19, 2008 at 12:31 PM, Hartwig Wiesmann
<[EMAIL PROTECTED]> wrote:
>
> Am 18.08.2008 um 21:32 schrieb Dennis Cote:
>
>> Thomas Sailer wrote:
>>>
>>> Interestingly, the original query is extremely compute-bound, there
>>> is
>>> almost no disk activity!
>>>
>>> Looking at the output opcodes from the queries, I can't see any
>>> significant difference. Though I have to admit I'm by far no expert
>>> in
>>> vmdb opcodes...
>>>
>>
>> You can use "explain query plan " to get some clues as to
>> what is happening, but the vdbe code shows it best.
>>
>> The original query is doing a full table scan through all 2.6 million
>> records in the main table, and for each record it uses the rtree to
>> locate the 20 records inside the rectangle set by your limits, then it
>> compares the id of each of these records to see if it matches the id
>> of
>> the main table record. For each match it dumps all the data in both
>> records.
>>
>> The second query use the rtree index to locate the 20 records in the
>> limit rectangle, and then uses the btree index to locate those 20
>> records quickly, and then dumps the data for that record.
>>
>> HTH
>> Dennis Cote
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
> Hello,
>
> I had an identical experience a few days ago and posted it under a
> different topic. Because SQLite is not doing well as mentioned in the
> example of the documentation I suggest to change the documentation
> using the proposed solution. I also had to find it out the hard way
> and I think by modifying the documentation there are at least a few
> people less who have to go that way.
>
> BTW: I suppose the same problem occurs when using fts?
>
> Hartwig
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] rtree performance problems?

2008-08-19 Thread Hartwig Wiesmann

Am 18.08.2008 um 21:32 schrieb Dennis Cote:

> Thomas Sailer wrote:
>>
>> Interestingly, the original query is extremely compute-bound, there  
>> is
>> almost no disk activity!
>>
>> Looking at the output opcodes from the queries, I can't see any
>> significant difference. Though I have to admit I'm by far no expert  
>> in
>> vmdb opcodes...
>>
>
> You can use "explain query plan " to get some clues as to
> what is happening, but the vdbe code shows it best.
>
> The original query is doing a full table scan through all 2.6 million
> records in the main table, and for each record it uses the rtree to
> locate the 20 records inside the rectangle set by your limits, then it
> compares the id of each of these records to see if it matches the id  
> of
> the main table record. For each match it dumps all the data in both
> records.
>
> The second query use the rtree index to locate the 20 records in the
> limit rectangle, and then uses the btree index to locate those 20
> records quickly, and then dumps the data for that record.
>
> HTH
> Dennis Cote
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
Hello,

I had an identical experience a few days ago and posted it under a  
different topic. Because SQLite is not doing well as mentioned in the  
example of the documentation I suggest to change the documentation  
using the proposed solution. I also had to find it out the hard way  
and I think by modifying the documentation there are at least a few  
people less who have to go that way.

BTW: I suppose the same problem occurs when using fts?

Hartwig


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] rtree performance problems?

2008-08-18 Thread Dennis Cote
Thomas Sailer wrote:
> 
> Interestingly, the original query is extremely compute-bound, there is
> almost no disk activity!
> 
> Looking at the output opcodes from the queries, I can't see any
> significant difference. Though I have to admit I'm by far no expert in
> vmdb opcodes...
> 

You can use "explain query plan " to get some clues as to 
what is happening, but the vdbe code shows it best.

The original query is doing a full table scan through all 2.6 million 
records in the main table, and for each record it uses the rtree to 
locate the 20 records inside the rectangle set by your limits, then it 
compares the id of each of these records to see if it matches the id of 
the main table record. For each match it dumps all the data in both 
records.

The second query use the rtree index to locate the 20 records in the 
limit rectangle, and then uses the btree index to locate those 20 
records quickly, and then dumps the data for that record.

HTH
Dennis Cote
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] rtree performance problems?

2008-08-18 Thread Thomas Sailer
On Mon, 2008-08-18 at 11:16 -0600, Dennis Cote wrote:

> Does this query run faster?
> 
> select * from mapelements
> where ID in
>  (
>  select ID from mapelements_rtree
>  where mapelements_rtree.NELAT>=7900
>  and mapelements_rtree.SWLAT<=8000
>  and mapelements_rtree.NELON>=7900
>  and mapelements_rtree.SWLON<=8000
>  );

Yes; this query terminates in less than a second. Technically it's not
quite the same thing (doesn't return the rtree columns), but it's what I
want :-)

Interestingly, the original query is extremely compute-bound, there is
almost no disk activity!

Looking at the output opcodes from the queries, I can't see any
significant difference. Though I have to admit I'm by far no expert in
vmdb opcodes...

sqlite> explain select * from mapelements,mapelements_rtree where 
mapelements_rtree.NELAT>=7900 and mapelements_rtree.SWLAT<=8000 and 
mapelements_rtree.NELON>=7900 and mapelements_rtree.SWLON<=8000 and 
mapelements.ID=+mapelements_rtree.ID;
0|Trace|0|0|0|explain select * from mapelements,mapelements_rtree where 
mapelements_rtree.NELAT>=7900 and mapelements_rtree.SWLAT<=8000 and 
mapelements_rtree.NELON>=7900 and mapelements_rtree.SWLON<=8000 and 
mapelements.ID=+mapelements_rtree.ID;|00|
1|Integer|7900|1|0||00|
2|Integer|8000|2|0||00|
3|Integer|7900|3|0||00|
4|Integer|8000|4|0||00|
5|Goto|0|42|0||00|
6|SetNumColumns|0|11|0||00|
7|OpenRead|0|2|0||00|
8|VOpen|1|0|0|vtab:870B6C0:184B20|00|
9|Rewind|0|39|0||00|
10|SCopy|1|7|0||00|
11|SCopy|2|8|0||00|
12|SCopy|3|9|0||00|
13|SCopy|4|10|0||00|
14|Integer|2|5|0||00|
15|Integer|4|6|0||00|
16|VFilter|1|38|5|DbBaDdBc|00|
17|Rowid|0|11|0||00|
18|VColumn|1|0|12||00|
19|Ne|12|37|11|collseq(BINARY)|6c|
20|Rowid|0|13|0||00|
21|Column|0|1|14||00|
22|Column|0|2|15||00|
23|Column|0|3|16||00|
24|Column|0|4|17||00|
25|Column|0|5|18||00|
26|Column|0|6|19||00|
27|Column|0|7|20||00|
28|Column|0|8|21||00|
29|Column|0|9|22||00|
30|Column|0|10|23||00|
31|VColumn|1|0|24||00|
32|VColumn|1|1|25||00|
33|VColumn|1|2|26||00|
34|VColumn|1|3|27||00|
35|VColumn|1|4|28||00|
36|ResultRow|13|16|0||00|
37|VNext|1|17|0||00|
38|Next|0|10|0||00|
39|Close|0|0|0||00|
40|Close|1|0|0||00|
41|Halt|0|0|0||00|
42|Transaction|0|0|0||00|
43|VerifyCookie|0|17|0||00|
44|TableLock|0|2|0|mapelements|00|
45|Goto|0|6|0||00|

sqlite> explain select * from mapelements where ID in (select ID from 
mapelements_rtree where mapelements_rtree.NELAT>=7900 and 
mapelements_rtree.SWLAT<=8000 and mapelements_rtree.NELON>=7900 and 
mapelements_rtree.SWLON<=8000);
0|Trace|0|0|0|explain select * from mapelements where ID in (select ID from 
mapelements_rtree where mapelements_rtree.NELAT>=7900 and 
mapelements_rtree.SWLAT<=8000 and mapelements_rtree.NELON>=7900 and 
mapelements_rtree.SWLON<=8000);|00|
1|Goto|0|44|0||00|
2|SetNumColumns|0|11|0||00|
3|OpenRead|0|2|0||00|
4|If|2|24|0||00|
5|Integer|1|2|0||00|
6|OpenEphemeral|3|1|0|keyinfo(1,BINARY)|00|
7|Integer|7900|3|0||00|
8|Integer|8000|4|0||00|
9|Integer|7900|5|0||00|
10|Integer|8000|6|0||00|
11|VOpen|1|0|0|vtab:80BA750:184B20|00|
12|SCopy|3|9|0||00|
13|SCopy|4|10|0||00|
14|SCopy|5|11|0||00|
15|SCopy|6|12|0||00|
16|Integer|2|7|0||00|
17|Integer|4|8|0||00|
18|VFilter|1|23|7|DbBaDdBc|00|
19|VColumn|1|0|13||00|
20|MakeRecord|13|1|14|c|00|
21|IdxInsert|3|14|0||00|
22|VNext|1|19|0||00|
23|Close|1|0|0||00|
24|Rewind|3|42|0||00|
25|Column|3|0|1||00|
26|IsNull|1|41|0||00|
27|MustBeInt|1|41|0||00|
28|NotExists|0|41|1||00|
29|Rowid|0|15|0||00|
30|Column|0|1|16||00|
31|Column|0|2|17||00|
32|Column|0|3|18||00|
33|Column|0|4|19||00|
34|Column|0|5|20||00|
35|Column|0|6|21||00|
36|Column|0|7|22||00|
37|Column|0|8|23||00|
38|Column|0|9|24||00|
39|Column|0|10|25||00|
40|ResultRow|15|11|0||00|
41|Next|3|25|0||00|
42|Close|0|0|0||00|
43|Halt|0|0|0||00|
44|Transaction|0|0|0||00|
45|VerifyCookie|0|17|0||00|
46|TableLock|0|2|0|mapelements|00|
47|Goto|0|2|0||00|

Thanks,
Tom


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] rtree performance problems?

2008-08-18 Thread Dennis Cote
Thomas Sailer wrote:
> 
> The following query is very quick, it returns 20 rows within a small fraction 
> of a second:
> select * from mapelements_rtree where NELAT>=7900 and SWLAT<=8000 and 
> NELON>=7900 and SWLON<=8000;
> 
> The following query, however, takes a long time (almost half a minute):
> select * from mapelements,mapelements_rtree where 
> mapelements_rtree.NELAT>=7900 and mapelements_rtree.SWLAT<=8000 and 
> mapelements_rtree.NELON>=7900 and mapelements_rtree.SWLON<=8000 and 
> mapelements.ID=mapelements_rtree.ID;
> 
> This is basically how the rtree documentation suggests to perform
> selects. Why is this query so slow, and what can I do to fix/workaround
> this?
> 

Does this query run faster?

select * from mapelements
where ID in
 (
 select ID from mapelements_rtree
 where mapelements_rtree.NELAT>=7900
 and mapelements_rtree.SWLAT<=8000
 and mapelements_rtree.NELON>=7900
 and mapelements_rtree.SWLON<=8000
 );

Dennis Cote
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] rtree performance problems?

2008-08-18 Thread Alexey Pechnikov
Hello!

В сообщении от Monday 18 August 2008 20:21:04 Thomas Sailer написал(а):
> The following query is very quick, it returns 20 rows within a small
> fraction of a second: select * from mapelements_rtree where NELAT>=7900
> and SWLAT<=8000 and NELON>=7900 and SWLON<=8000;
>
> The following query, however, takes a long time (almost half a minute):
> select * from mapelements,mapelements_rtree where
> mapelements_rtree.NELAT>=7900 and mapelements_rtree.SWLAT<=8000 and
> mapelements_rtree.NELON>=7900 and mapelements_rtree.SWLON<=8000 and
> mapelements.ID=mapelements_rtree.ID;

Try the query:

select * from mapelements where ID in (select ID from mapelements_rtree where 
NELAT>=7900 and SWLAT<=8000 and NELON>=7900 and SWLON<=8000);


Best regards, Alexey.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] rtree performance problems?

2008-08-18 Thread Thomas Sailer
On Mon, 2008-08-18 at 12:28 -0400, D. Richard Hipp wrote:

> Please try changing the last term as shown below (add a "+" before the  
> r-tree ID column):
> 
>   mapelements.ID = +mapelements_rtree.ID

Thanks a lot for your quick answer! Unfortunately, it didn't help.

What's interesting is that the first three rows are returned quickly
(sub 1 second), but then there's a very long pause until the remaining
rows are returned.

Thanks,
Tom


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] rtree performance problems?

2008-08-18 Thread D. Richard Hipp

On Aug 18, 2008, at 12:21 PM, Thomas Sailer wrote:
>
> The following query, however, takes a long time (almost half a  
> minute):
> select * from mapelements,mapelements_rtree where  
> mapelements_rtree.NELAT>=7900 and  
> mapelements_rtree.SWLAT<=8000 and  
> mapelements_rtree.NELON>=7900 and  
> mapelements_rtree.SWLON<=8000 and  
> mapelements.ID=mapelements_rtree.ID;


Please try changing the last term as shown below (add a "+" before the  
r-tree ID column):

  mapelements.ID = +mapelements_rtree.ID

Let us know what happens to performance.


D. Richard Hipp
[EMAIL PROTECTED]



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] rtree performance problems?

2008-08-18 Thread Thomas Sailer
I have a database with the following schema:

CREATE TABLE mapelements (ID INTEGER PRIMARY KEY NOT NULL, TYPECODE 
INTEGER,NAME TEXT,LAT INTEGER,LON INTEGER,SWLAT INTEGER,SWLON INTEGER,NELAT 
INTEGER,NELON INTEGER,POLY BLOB,TILE INTEGER);
CREATE VIRTUAL TABLE mapelements_rtree USING rtree(ID,SWLAT,NELAT,SWLON,NELON);
CREATE TABLE "mapelements_rtree_node"(nodeno INTEGER PRIMARY KEY, data BLOB);
CREATE TABLE "mapelements_rtree_parent"(nodeno INTEGER PRIMARY KEY, parentnode 
INTEGER);
CREATE TABLE "mapelements_rtree_rowid"(rowid INTEGER PRIMARY KEY, nodeno 
INTEGER);
CREATE INDEX mapelements_bbox ON mapelements(SWLAT,NELAT,SWLON,NELON);
CREATE INDEX mapelements_latlon ON mapelements(LAT,LON);
CREATE INDEX mapelements_name ON mapelements(NAME COLLATE NOCASE);
CREATE INDEX mapelements_nelat ON mapelements(NELAT);
CREATE INDEX mapelements_nelon ON mapelements(NELON);
CREATE INDEX mapelements_swlat ON mapelements(SWLAT);
CREATE INDEX mapelements_swlon ON mapelements(SWLON);
CREATE INDEX mapelements_tile ON mapelements(TILE);

The database contains 2653693 rows, and is approximately 830MBytes big
without the rtree index, and approximately 970MBytes with the rtree
index.

I have created the rtree index with the following script:
CREATE VIRTUAL TABLE mapelements_rtree USING rtree(ID,SWLAT,NELAT,SWLON,NELON);
INSERT INTO mapelements_rtree SELECT ID,SWLAT,NELAT,SWLON,NELON FROM 
mapelements;
analyze;
vacuum;

The following query is very quick, it returns 20 rows within a small fraction 
of a second:
select * from mapelements_rtree where NELAT>=7900 and SWLAT<=8000 and 
NELON>=7900 and SWLON<=8000;

The following query, however, takes a long time (almost half a minute):
select * from mapelements,mapelements_rtree where 
mapelements_rtree.NELAT>=7900 and mapelements_rtree.SWLAT<=8000 and 
mapelements_rtree.NELON>=7900 and mapelements_rtree.SWLON<=8000 and 
mapelements.ID=mapelements_rtree.ID;

This is basically how the rtree documentation suggests to perform
selects. Why is this query so slow, and what can I do to fix/workaround
this?

Thanks,
Tom


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users