Re: [sqlite] table format for most efficient query

2013-07-05 Thread RSmith

Sorry - thats what I mean - the table (as redefined by simon)

create table lookup (rowindex int, rangestart int, rangeend int)

values in rowindex are not unique so I cannot make this column a primary
key so i am not sure whether RTrees would be any help in this scenario


Hi Paul,
Firstly, the RTRee will work spectacularly well for your case, but you are in no way bound by the rowid or primary keys to conform 
to your row id, but if your row ids are not unique, then you need to simply use a different key for that function.

For instance, 2 tables, one RTree table witht he standard Primary key + start + 
end range columns,
Second your data table with Key, rowindex, field3, field4, etc. starting with a Key which should be a primary key or (best be 
INTEGER PRIMARY KEY) or a foreign key on the Primary Key of the RTree table (I wont go into foreign key relationships here, you can 
read it on the SQLite web pages, it's only an option, you can very well simply refer the id's in Select queries).
The second column being your rowindex which may very well be non-unique and void of all other restrictions even, followed by the 
rest of however many data columns are needed.


Think of the RTree table as a sort of ranged-Index to your data table. So you 
can do queries like:
SELECT * from myData AS D JOIN rtreeTable AS R ON (R.PrimKey=D.Key) WHERE 
(Range-select-values);

This will return all the data in your data table (which may very well again re-include the actual ranged values if space is not a 
concern) according to the lookup index IDs supplied from the RTree efficient search algorithm.


This select above can take many other forms but as long as you are looking for a range specifier, the Rtree table will be much 
faster on the lookup, and it returns Key alues from an Index, which is no effort to simply look up for the data table. Also, never 
be afraid of using an additional index, the more ways SQLite can find your data, the better it should be at picking the fastest.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] table format for most efficient query

2013-07-05 Thread Igor Tandetnik

On 7/5/2013 10:59 AM, Paul Sanderson wrote:

Sorry - thats what I mean - the table (as redefined by simon)

create table lookup (rowindex int, rangestart int, rangeend int)

values in rowindex are not unique so I cannot make this column a primary
key so i am not sure whether RTrees would be any help in this scenario


create virtual table lookup_tree using rtree_i32(rangeid, rangestart, 
rangeend);

create table lookup(rangeid integer primary key, rowindex int);

select * from
  master m join lookup l on (m.rowindex = l.rowindex)
  join lookup_tree lt on (l.rangeid=lt.rangeid)
where :x >= rangestart and and :x < rangeend;

--
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] table format for most efficient query

2013-07-05 Thread Paul Sanderson
Sorry - thats what I mean - the table (as redefined by simon)

create table lookup (rowindex int, rangestart int, rangeend int)

values in rowindex are not unique so I cannot make this column a primary
key so i am not sure whether RTrees would be any help in this scenario


On 5 July 2013 14:39, Igor Tandetnik  wrote:

> On 7/5/2013 9:13 AM, Paul Sanderson wrote:
>
>> my primary key would not be unique :(
>>
>
> That's an oxymoron - primary key is unique, by definition.
>
> --
> Igor Tandetnik
>
> __**_
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-**bin/mailman/listinfo/sqlite-**users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] table format for most efficient query

2013-07-05 Thread Igor Tandetnik

On 7/5/2013 9:13 AM, Paul Sanderson wrote:

my primary key would not be unique :(


That's an oxymoron - primary key is unique, by definition.
--
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] table format for most efficient query

2013-07-05 Thread Paul Sanderson
Having read up on RTrees I may have a problem - my primary key would not be
unique :(


On 5 July 2013 12:05, Paul Sanderson  wrote:

> Thanks for that Igor - I had RTree in the back of my mine but couldn't
> remember what or where I had read about it.
>
> before I compile a dll (or someone provides a link to a 32 bit dl with
> RTree enabled) is RTree likely to be useful in the following scenario?
>
> Each look up will usually, but not always, be successful - i.e. about 90%
> of the time a search will result in about 4 rows being returned the other
> 10% of the time nothing will be found
>
> Just sitting down to look at Simons suggestions.
>
>
>
>
> On 5 July 2013 01:47, Igor Tandetnik  wrote:
>
>> On 7/4/2013 5:29 PM, Paul Sanderson wrote:
>>
>>> select * from master as m, lookup as l where x >= start and and x < end
>>> and
>>> m.index = l.index
>>>
>>
>> You might want to look at the RTree module:
>>
>> http://www.sqlite.org/rtree.**html 
>>
>> It's specifically designed to implement such range queries efficiently.
>> --
>> Igor Tandetnik
>>
>>
>> __**_
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-**bin/mailman/listinfo/sqlite-**users
>>
>
>
>
>
>


-- 
Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?168-Reconnoitre - VSC
processing made easy
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] table format for most efficient query

2013-07-05 Thread Paul Sanderson
Thanks for that Igor - I had RTree in the back of my mine but couldn't
remember what or where I had read about it.

before I compile a dll (or someone provides a link to a 32 bit dl with
RTree enabled) is RTree likely to be useful in the following scenario?

Each look up will usually, but not always, be successful - i.e. about 90%
of the time a search will result in about 4 rows being returned the other
10% of the time nothing will be found

Just sitting down to look at Simons suggestions.



On 5 July 2013 01:47, Igor Tandetnik  wrote:

> On 7/4/2013 5:29 PM, Paul Sanderson wrote:
>
>> select * from master as m, lookup as l where x >= start and and x < end
>> and
>> m.index = l.index
>>
>
> You might want to look at the RTree module:
>
> http://www.sqlite.org/rtree.**html 
>
> It's specifically designed to implement such range queries efficiently.
> --
> Igor Tandetnik
>
>
> __**_
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-**bin/mailman/listinfo/sqlite-**users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] table format for most efficient query

2013-07-04 Thread Igor Tandetnik

On 7/4/2013 5:29 PM, Paul Sanderson wrote:

select * from master as m, lookup as l where x >= start and and x < end and
m.index = l.index


You might want to look at the RTree module:

http://www.sqlite.org/rtree.html

It's specifically designed to implement such range queries efficiently.
--
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] table format for most efficient query

2013-07-04 Thread Kees Nuyt
On Thu, 4 Jul 2013 22:52:26 +0100, Simon Slavin 
wrote:

>
> I assume you missed a comma:
>
> create table lookup (index int, start int, end int)

indeed

> But actually it’s a bad idea to use the words
> 'index' and 'end’ for columns because they're
> used as reserved words in SQL. 

I agree.

> So try something like
> create table lookup (rowindex int, rangestart int, rangeend int)

Even better:

create table lookup (
rowindex INTEGER PRIMARY KEY
,   rangestart int
,   rangeend int
);

This way, rowindex aliases the internal ROWID column, saving an integer
column. Also, JOIN performance on rowindex will be better as it removes
one level of indirection.

-- 
Groet, Cordialement, Pozdrawiam, Regards,

Kees Nuyt

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] table format for most efficient query

2013-07-04 Thread Paul Sanderson
Thanks Simon - i'll have a play. tomorrow




On 4 July 2013 22:52, Simon Slavin  wrote:

>
> On 4 Jul 2013, at 10:29pm, Paul Sanderson 
> wrote:
>
> > create table lookup (index int, start int end int)
>
> I assume you missed a comma:
>
> create table lookup (index int, start int, end int)
>
> But actually it’s a bad idea to use the words 'index' and 'end’ for
> columns because they're used as reserved words in SQL.  So try something
> like
>
> create table lookup (rowindex int, rangestart int, rangeend int)
>
> > The takle will be joined on a second table via the index column
> >
> > the table is likely to have a few million rows and I will be doing many
> > thousands of lookups consequtively. My current lookups are of the form
> >
> > select * from master as m, lookup as l where x >= start and and x < end
> and
> > m.index = l.index
>
> Presumably
>
> select * from master as m, lookup as l where m.x >= l.rangestart and m.x <
> l.rangeend and m.rowindex = l.rowindex
>
> > i sthere an efficient way of doing this - 30,000 lookups on two table
> seach
> > with about 1 million rows is taking about 30 minutes.
> >
> > Can I structure my lookup table differently?
>
> Don’t do a * unless you actually need * of both tables.  List the columns
> you are actually going to use.
>
> > would indexes help in anyway?
>
> SELECT * FROM master AS l JOIN lookup AS l ON master.rowindex =
> lookup.rowindex WHERE m.x >= l.rangestart AND m.x < l.rangeend
>
> Good indexes for that would be
>
> CREATE INDEX masterRX ON master (rowindex, x)
> CREATE INDEX lookupRR ON lookup (rowindex, rangestart)
>
> You can speed things up even more by doing an ANALYSE once you have data
> in the tables.  This lets SQLite figure out that your lookup table is huge
> and tune its query plans accordingly.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] table format for most efficient query

2013-07-04 Thread Simon Slavin

On 4 Jul 2013, at 10:29pm, Paul Sanderson  wrote:

> create table lookup (index int, start int end int)

I assume you missed a comma:

create table lookup (index int, start int, end int)

But actually it’s a bad idea to use the words 'index' and 'end’ for columns 
because they're used as reserved words in SQL.  So try something like

create table lookup (rowindex int, rangestart int, rangeend int)

> The takle will be joined on a second table via the index column
> 
> the table is likely to have a few million rows and I will be doing many
> thousands of lookups consequtively. My current lookups are of the form
> 
> select * from master as m, lookup as l where x >= start and and x < end and
> m.index = l.index

Presumably

select * from master as m, lookup as l where m.x >= l.rangestart and m.x < 
l.rangeend and m.rowindex = l.rowindex

> i sthere an efficient way of doing this - 30,000 lookups on two table seach
> with about 1 million rows is taking about 30 minutes.
> 
> Can I structure my lookup table differently?

Don’t do a * unless you actually need * of both tables.  List the columns you 
are actually going to use.

> would indexes help in anyway?

SELECT * FROM master AS l JOIN lookup AS l ON master.rowindex = lookup.rowindex 
WHERE m.x >= l.rangestart AND m.x < l.rangeend

Good indexes for that would be

CREATE INDEX masterRX ON master (rowindex, x)
CREATE INDEX lookupRR ON lookup (rowindex, rangestart)

You can speed things up even more by doing an ANALYSE once you have data in the 
tables.  This lets SQLite figure out that your lookup table is huge and tune 
its query plans accordingly.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users