Re: [sqlite] What is the most efficient way to get the close by numbers?

2010-08-22 Thread Peng Yu
On Sat, Aug 21, 2010 at 10:45 PM, ve3meo  wrote:
> "Eric Smith" ...
>> I haven't used it myself, but I'm pretty sure this is what the R*tree
>> module was designed for:
>
> I have not used it either but was intrigued by your suggestion. Looking into
> it, my sense was that it would be advantageous for a 2-dimension or more
> search and I think this was borne out by my experiments, albeit with my very
> limited knowledge and experience. I hasten to add that maybe my queries were
> not the best designed.
>
> I created an R-tree virtual table with the minimum number of columns: id,
> xmin, xmax and populated it with the rowid from Peng's table A and
> position-10 in xmin and position+10 in xmax. Thus any value lying between
> xmax and xmin is no more than a distance of 10 from the point pointed to in
> Table A by id. Indeed, a simple select on a single value between xmax and
> xmin returned rows from the virtual table faster than the correspondingly
> simple select on the real table with an index on position (something like
> 6ms vs. 10 ms).
>
> However, on the 10,000 row test table on which I reported earlier that Jim's
> 'between' query was fastest at ~2.3s, the best I could get by working an
> R-tree virtual table into the mix was ~30s, and that with an index on Name
> or Name+Position. Without the index... ~60s, not much better than Peng's
> original indexless query and way more complicated.
>
> I think the penalty is in the extra JOIN required - 3 tables instead of 2 -
> with the speed advantage on the 'between' constraint being swamped by the
> volume of intermediate rows.

I don't quite understand why there are 3 tables with R-tree. Would you
please show me what query you used? Thank you very much.

-- 
Regards,
Peng
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] What is the most efficient way to get the close by numbers?

2010-08-21 Thread ve3meo
"Eric Smith" ...
> I haven't used it myself, but I'm pretty sure this is what the R*tree
> module was designed for:

I have not used it either but was intrigued by your suggestion. Looking into 
it, my sense was that it would be advantageous for a 2-dimension or more 
search and I think this was borne out by my experiments, albeit with my very 
limited knowledge and experience. I hasten to add that maybe my queries were 
not the best designed.

I created an R-tree virtual table with the minimum number of columns: id, 
xmin, xmax and populated it with the rowid from Peng's table A and 
position-10 in xmin and position+10 in xmax. Thus any value lying between 
xmax and xmin is no more than a distance of 10 from the point pointed to in 
Table A by id. Indeed, a simple select on a single value between xmax and 
xmin returned rows from the virtual table faster than the correspondingly 
simple select on the real table with an index on position (something like 
6ms vs. 10 ms).

However, on the 10,000 row test table on which I reported earlier that Jim's 
'between' query was fastest at ~2.3s, the best I could get by working an 
R-tree virtual table into the mix was ~30s, and that with an index on Name 
or Name+Position. Without the index... ~60s, not much better than Peng's 
original indexless query and way more complicated.

I think the penalty is in the extra JOIN required - 3 tables instead of 2 - 
with the speed advantage on the 'between' constraint being swamped by the 
volume of intermediate rows.

Tom 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] What is the most efficient way to get the close by numbers?

2010-08-21 Thread Peng Yu
Thank you! I don't find which sqlite document describes 'between ...
and ...'. Would you please point to me where it is documented?

On Fri, Aug 20, 2010 at 6:06 PM, Jim Morris  wrote:
>  If there is an index on (name, position) the a where like below might
> use it.
>
> A1.name=A2.name and A2.position between( A1.position - 10, A1.position + 10 )
>
>
> On 8/20/2010 3:54 PM, Peng Yu wrote:
>> Hi,
>>
>> I have the following code to search for neighboring positions
>> (distance<=10). But it is slow for large data set. I'm wondering what
>> is the most efficient query for such a search. Note that I don't
>> create an index, as I'm not sure what index to create on table A.
>>
>> $ cat main.sql
>> #!/usr/bin/env bash
>>
>> rm -f main.db
>> sqlite3 main.db<>
>> create table A (name text, position integer);
>> insert into A values('a', 1);
>> insert into A values('a', 5);
>> insert into A values('a', 21);
>> insert into A values('b', 3);
>> insert into A values('b', 15);
>> insert into A values('b', 19);
>>
>> .mode column
>> .headers on
>> .echo on
>> select * from A as A1, A as A2 where A1.name=A2.name and
>> abs(A1.position - A2.position)<= 10 and A1.position != A2.position;
>>
>> EOF
>>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
Regards,
Peng
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] What is the most efficient way to get the close by numbers?

2010-08-20 Thread Eric Smith
Peng Yu wrote:

> I have the following code to search for neighboring positions
> (distance <=10). But it is slow for large data set. I'm wondering what
> is the most efficient query for such a search. Note that I don't
> create an index, as I'm not sure what index to create on table A.

I haven't used it myself, but I'm pretty sure this is what the R*tree 
module was designed for:

http://www.sqlite.org/rtree.html

--
Eric A. Smith

Drill for oil? You mean drill into the ground to try and find oil? You're
crazy.
-- Drillers who Edwin L. Drake tried to enlist to his project 
   to drill for oil in 1859.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] What is the most efficient way to get the close by numbers?

2010-08-20 Thread Simon Slavin

On 20 Aug 2010, at 11:54pm, Peng Yu wrote:

> select * from A as A1, A as A2 where A1.name=A2.name and
> abs(A1.position - A2.position) <= 10 and A1.position != A2.position;

If you're doing this a lot, work out which chunks of 20 each point is in.  You 
only need to compare a point with points in two chunks of 20: the one below and 
the one above.  So you can reject almost all the points immediately.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] What is the most efficient way to get the close by numbers?

2010-08-20 Thread Jim Morris
  If there is an index on (name, position) the a where like below might 
use it.

A1.name=A2.name and A2.position between( A1.position - 10, A1.position + 10 )


On 8/20/2010 3:54 PM, Peng Yu wrote:
> Hi,
>
> I have the following code to search for neighboring positions
> (distance<=10). But it is slow for large data set. I'm wondering what
> is the most efficient query for such a search. Note that I don't
> create an index, as I'm not sure what index to create on table A.
>
> $ cat main.sql
> #!/usr/bin/env bash
>
> rm -f main.db
> sqlite3 main.db<
> create table A (name text, position integer);
> insert into A values('a', 1);
> insert into A values('a', 5);
> insert into A values('a', 21);
> insert into A values('b', 3);
> insert into A values('b', 15);
> insert into A values('b', 19);
>
> .mode column
> .headers on
> .echo on
> select * from A as A1, A as A2 where A1.name=A2.name and
> abs(A1.position - A2.position)<= 10 and A1.position != A2.position;
>
> EOF
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] What is the most efficient way to get the close by numbers?

2010-08-20 Thread Peng Yu
Hi,

I have the following code to search for neighboring positions
(distance <=10). But it is slow for large data set. I'm wondering what
is the most efficient query for such a search. Note that I don't
create an index, as I'm not sure what index to create on table A.

$ cat main.sql
#!/usr/bin/env bash

rm -f main.db
sqlite3 main.db