Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-29 Thread curmudgeon
*I’m now wondering if you omit the WHERE & ORDER BY and run the following 

EXPLAIN QUERY PLAN 
SELECT BaseTbl.RowID 
FROM BaseTbl 
left join Tbl1 on comparison_1 
left join Tbl2 on comparison_2 
. 
. 
left join Tbln on comparison_n 

then if it returns more than 1 row then this implies there’s a 1 to many 
relationship in the query and we can’t proceed.  (At this stage I’m by no 
means sure of this). 
*

The above isn't guaranteed to work. Consider the following

CREATE TABLE t1(a INTEGER PRIMARY KEY, b INTEGER);
CREATE TABLE t2(x INTEGER PRIMARY KEY, y INTEGER);
CREATE INDEX ib ON t1(b);
EXPLAIN QUERY PLAN SELECT t1.RowID FROM t1 LEFT JOIN t2 ON x = b;
0|0|0|SCAN TABLE t1

Everything above is as expected, t2.x is primary key lookup for t1.b so
trailing left join is dropped. Suppose though we made t2.y the lookup for
t1.b 

CREATE UNIQUE INDEX iy ON t2(y);
EXPLAIN QUERY PLAN SELECT t1.RowID FROM t1 LEFT JOIN t2 ON y = b;
0|0|0|SCAN TABLE t1
0|1|1|SEARCH TABLE t2 USING COVERING INDEX iy (y=?)

As you can see what at first sight looks like a redundant trailing left join
ISN’T dropped. If however we had defined t2 with

CREATE TABLE t2(x INTEGER PRIMARY KEY, y INTEGER NOT NULL);

Then we would get

EXPLAIN QUERY PLAN SELECT t1.RowID FROM t1 LEFT JOIN t2 ON y = b;
0|0|0|SCAN TABLE t1

Therefore, the suggested formula would fail on any non premium key lookup
index where the index field(s) were not declared as NOT NULL.

Strangely (or maybe not)

EXPLAIN QUERY PLAN SELECT t1.RowID FROM t1 LEFT JOIN t2 ON y = b *WHERE y IS
NOT NULL*;

doesn't resolve the problem.




--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-29 Thread curmudgeon
E.Pasma wrote

>> What about changing the remaining inner join to left join
>
>> Select BaseTbl.RowID
>> from BaseTbl
>> left join Tbl_2 on Tbl2.Y = BaseTbl.Y
>> where BaseTbl.Col=?
>
>> and see if the SQLiter optimizer now leaves Tbl_2 out from the query
>> plan.  It will only do that if it is not a 1-to-n join.

I replied

> If Tbl_2 isn’t involved in the columns, where or order by then  
> changing it to left join will mean it will definitely be left out so  
> I don’t get what you mean E.Pasma.

I’m talking bollocks again. The left join will only be omitted if it has a 1
to 1 relationship with BaseTbl so E.Pasma is correct.


I’m now wondering if you omit the WHERE & ORDER BY and run the following

EXPLAIN QUERY PLAN
SELECT BaseTbl.RowID 
FROM BaseTbl 
left join Tbl1 on comparison_1 
left join Tbl2 on comparison_2 
. 
. 
left join Tbln on comparison_n

then if it returns more than 1 row then this implies there’s a 1 to many
relationship in the query and we can’t proceed.  (At this stage I’m by no
means sure of this).

Otherwise the RowSQL becomes

SELECT BaseTbl.RowID 
FROM BaseTbl 
jointype_1 Tbl1 on comparison_1 
jointype_2 Tbl2 on comparison_2 
. 
. 
jointype_n Tbln on comparison_n
WHERE ...
ORDER BY ...

And we leave the SQLite optimiser to filter out any left joins that aren’t
required.






--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-28 Thread x
>What about changing the remaining inner join to left join

>Select BaseTbl.RowID
>from BaseTbl
>left join Tbl_2 on Tbl2.Y = BaseTbl.Y
>where BaseTbl.Col=?

>and see if the SQLiter optimizer now leaves Tbl_2 out from the query
>plan.  It will only do that if it is not a 1-to-n join.

If Tbl_2 isn’t involved in the columns, where or order by then changing it to 
left join will mean it will definitely be left out so I don’t get what you mean 
E.Pasma.


I’ve now realised what caused my earlier confusion regarding inner joins. I was 
getting mixed up with foreign keys which do offer a way of finding redundant 
inner joins.

If a Tbl (which is not involved in the query columns, where or order by) is 
inner joined to BaseTbl on all the columns of a unique index and it turns out 
there is a foreign key matching that join then the Tbl can be left out if the 
BaseTbl columns are defined as NOT NULL. If they are allowed to be NULL then it 
can still be left out provided the NOT NULL condition(s) is added to the where.

Using the earlier example

Select BaseTbl.RowID
from BaseTbl
left join Tbl_1 on Tbl_1.X = BaseTbl.X
Inner join Tbl_2 on Tbl2.Y = BaseTbl.Y
left join Tbl_3 on Tbl3.Z = BaseTbl.Z
where BaseTbl.Col=?

If BaseTbl has FOREIGN KEY(Y) REFERENCES Tbl_2(Y) then

If BaseTbl.Y is defined as NOT NULL the query can be reduced to

Select BaseTbl.RowID
from BaseTbl
where BaseTbl.Col=?

If BaseTbl.Y allows NULLs then the query can be reduced to

Select BaseTbl.RowID
from BaseTbl
where BaseTbl.Col=? and BaseTbl.Y IS NOT NULL

Could someone confirm I’ve got the above right?

If I’m correct, I wonder if the optimiser takes (or could take) this into 
account. As lookup tables are probably the main use of foreign keys you’d think 
there’d be no shortage of such joins. That said, I suppose if you’re aware of 
what you want the optimiser to do it wouldn’t offer any advantage over left 
joining and adding the appropriate NOT NULL constraint(s) to the WHERE.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-27 Thread E.Pasma


Op 27 nov 2017, om 20:51 heeft x het volgende geschreven:


So if I build a view that includes look-ups in other tables, the
optimizer may skip these at places where not selected. However only  
if

the look-ups are written as outer joins. Then it may be good practice
allways doing that. For instance:



create view vtrack as
select  trackname, artistname
from track
left join artist ON trackartist=artistid -- note left join

;
~
~

Yeah, I was thinking always to use left joins when it doesn’t make  
any difference. I’m using a C++ wrapper for SQLite I wrote myself.  
It contains a QueryGrid type that automatically splits the SQL into  
RowSQL and ColSQL as described earlier. It is still possible to  
build a vector of BaseTbl.RowIDs when the original query contains  
inner joins provided all columns of the unique index used in the  
join are satisfied (see ***Example). If it’s unable to do that (due  
to 1 to many joins)  it computes ‘select count(*) from .’ and  
sets ColSQL to ‘select ColList from ... limit ?1, ?2’ so that it  
still fits in with the ‘fetch a range of data’ modus operandi and  
also allows a record counter and vertical scrollbar positioning.


***Example

select ColList
from BaseTbl
left join Tbl_1 on Tbl_1.X = BaseTbl.X
Inner join Tbl_2 on Tbl2.Y = BaseTbl.Y
left join Tbl_3 on Tbl3.Z = BaseTbl.Z
where BaseTbl.Col=?

Provided Tbl_2 has unique index on Y the RowID list could be had from

Select BaseTbl.RowID
from BaseTbl
left join Tbl_1 on Tbl_1.X = BaseTbl.X
Inner join Tbl_2 on Tbl2.Y = BaseTbl.Y
left join Tbl_3 on Tbl3.Z = BaseTbl.Z
where BaseTbl.Col=?

which the SQLite query optimiser would presumably reduce to

Select BaseTbl.RowID
from BaseTbl
Inner join Tbl_2 on Tbl2.Y = BaseTbl.Y
where BaseTbl.Col=?

I’ve still to write code to check the query contains no joins that  
are ‘1 to many’. I don’t suppose there’s an easy way of determining  
this from explain or explain query plan? At the moment all i can  
think of is to scrape the index name from the explain query plan  
Detail column, check the number of variables involved (i.e. number  
of ? marks) and compare with the number of columns in the index.



What about changing the remaining inner join to left join

Select BaseTbl.RowID
from BaseTbl
left join Tbl_2 on Tbl2.Y = BaseTbl.Y
where BaseTbl.Col=?

and see if the SQLiter optimizer now leaves Tbl_2 out from the query  
plan.  It will only do that if it is not a 1-to-n join.
But that leaves you with the change that the join is 1-to-0 so to say.  
I understood that was a show stopper.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-27 Thread x
>So if I build a view that includes look-ups in other tables, the
>optimizer may skip these at places where not selected. However only if
>the look-ups are written as outer joins. Then it may be good practice
>allways doing that. For instance:

>create view vtrack as
>select  trackname, artistname
>from track
>left join artist ON trackartist=artistid -- note left join
;
~
~

Yeah, I was thinking always to use left joins when it doesn’t make any 
difference. I’m using a C++ wrapper for SQLite I wrote myself. It contains a 
QueryGrid type that automatically splits the SQL into RowSQL and ColSQL as 
described earlier. It is still possible to build a vector of BaseTbl.RowIDs 
when the original query contains inner joins provided all columns of the unique 
index used in the join are satisfied (see ***Example). If it’s unable to do 
that (due to 1 to many joins)  it computes ‘select count(*) from .’ and 
sets ColSQL to ‘select ColList from ... limit ?1, ?2’ so that it still fits in 
with the ‘fetch a range of data’ modus operandi and also allows a record 
counter and vertical scrollbar positioning.

***Example

select ColList
from BaseTbl
left join Tbl_1 on Tbl_1.X = BaseTbl.X
Inner join Tbl_2 on Tbl2.Y = BaseTbl.Y
left join Tbl_3 on Tbl3.Z = BaseTbl.Z
where BaseTbl.Col=?

Provided Tbl_2 has unique index on Y the RowID list could be had from

Select BaseTbl.RowID
from BaseTbl
left join Tbl_1 on Tbl_1.X = BaseTbl.X
Inner join Tbl_2 on Tbl2.Y = BaseTbl.Y
left join Tbl_3 on Tbl3.Z = BaseTbl.Z
where BaseTbl.Col=?

which the SQLite query optimiser would presumably reduce to

Select BaseTbl.RowID
from BaseTbl
Inner join Tbl_2 on Tbl2.Y = BaseTbl.Y
where BaseTbl.Col=?

I’ve still to write code to check the query contains no joins that are ‘1 to 
many’. I don’t suppose there’s an easy way of determining this from explain or 
explain query plan? At the moment all i can think of is to scrape the index 
name from the explain query plan Detail column, check the number of variables 
involved (i.e. number of ? marks) and compare with the number of columns in the 
index.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-27 Thread E.Pasma
So if I build a view that includes look-ups in other tables, the  
optimizer may skip these at places where not selected. However only if  
the look-ups are written as outer joins. Then it may be good practice  
allways doing that. For instance:


create view vtrack as
select  trackname, artistname
from track
left join artist ON trackartist=artistid -- note left join
;
~
~

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-27 Thread x
>Thanks to you, this topic has inspired a useful change or changes.

Wow. Glad I could help.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-27 Thread E.Pasma

x wrote:


From: E.Pasma<mailto:pasm...@concepts.nl>
Sent: 26 November 2017 17:30
To: SQLite mailing list<mailto:sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Getting an advance list of RowIDs for a query  
result set



If step 3 is xxx-ed and only left-joins remain to be considered then
the SQLite3 engine is likely to fullfill  this optimization in its
next release.
I tested this on the current  (2017-11-17) pre-release snapshot. As
far as I see any outer joined table may be discarded from the query
plan if only rowid from the base table is selected. Using SELECT
DISTINCT applies this also to one-to-many joins.


Thanks E. Pasma. Fixing the left join optimisation will be enough  
for me and will allow me to delete a fair amount of difficult code.  
My confusion over the inner joins arose out of trying to find a way  
of ensuring the BaseTbl RowIDs wouldn’t contain any duplicates.  
Somewhere along the line I forgot SQLite would have to still check  
the record existed in the inner joined table.



Thanks to you, this topic has inspired a useful change or changes.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-27 Thread x


From: E.Pasma<mailto:pasm...@concepts.nl>
Sent: 26 November 2017 17:30
To: SQLite mailing list<mailto:sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Getting an advance list of RowIDs for a query result set

>If step 3 is xxx-ed and only left-joins remain to be considered then
>the SQLite3 engine is likely to fullfill  this optimization in its
>next release.
>I tested this on the current  (2017-11-17) pre-release snapshot. As
>far as I see any outer joined table may be discarded from the query
>plan if only rowid from the base table is selected. Using SELECT
>DISTINCT applies this also to one-to-many joins.

Thanks E. Pasma. Fixing the left join optimisation will be enough for me and 
will allow me to delete a fair amount of difficult code. My confusion over the 
inner joins arose out of trying to find a way of ensuring the BaseTbl RowIDs 
wouldn’t contain any duplicates. Somewhere along the line I forgot SQLite would 
have to still check the record existed in the inner joined table.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-26 Thread E.Pasma

x wrote:


I proceed as follows


 1.  Omit a table join from the SQL and try preparing it.
 2.  If it prepares OK then the table isn’t involved in the WHERE or  
ORDER BY.
 3.  If it’s joined to the BaseTbl by an integer primary key or  
FULLY joined by a unique index then the table is redundant. By FULLY  
I mean ALL fields of the index are included in the join as otherwise  
the BaseTbl’s relationship with it is of a ONE TO MANY nature which  
means the BaseTbl RowID won’t uniquely identify a row of the  
original query. [Two things I’m unsure about are a) how nulls affect  
unique index  joins and b) how to deal with tables that aren’t  
directly linked to the BaseTbl (i.e. they’re linked via an  
intermediate table)].

 4.  If the table is needed reintroduce it into the SQL.

Do this in turn for each of the joins.

If step 3 is xxx-ed and only left-joins remain to be considered then  
the SQLite3 engine is likely to fullfill  this optimization in its  
next release.
I tested this on the current  (2017-11-17) pre-release snapshot. As  
far as I see any outer joined table may be discarded from the query  
plan if only rowid from the base table is selected. Using SELECT  
DISTINCT applies this also to one-to-many joins.



I’m wondering if the second of those trunk changes is in any way  
related to what I’m trying to do. The above is a bit long winded and  
not easy to code so it would be great if the SQLite query optimizer  
did it all for me.



It looks relevant, but I did not test that.

To me the time taken to grab a grid page of data is negligible if  
you know where to look for it on disc. I tend therefore to time  
queries by how fast I can get all the BaseTbl RowIDs into a vector.  
The biggest table in my database has 2.4 million rows and yet wait  
cursors are a very rare sight.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-26 Thread x
>  If it’s joined to the BaseTbl by an integer primary key or FULLY joined by a 
> unique index then the table is redundant.

I’m talking  there. If it’s an inner join SQLite needs to check the record 
exists in the joined table.

Sorry about that, back to left joins.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-26 Thread x
Thanks Keith.

This one

Update the omit-table-from-left-join optimization so that it can omit tables 
from the middle of the join as well as the end.

Deals with the case I mentioned in my last post but this one is also interesting

Fix a problem preventing the planner from identifying scans that visit at most 
one row in cases where that property is guaranteed by a unique, not-null, 
non-IPK column that is the leftmost in its table

The way I was going to tackle the problem of redundant tables was as follows.

Given

SELECT ColA, ColB, ColC, 
FROM BaseTbl
jointype_1 Tbl1 on comparison_1
jointype_2 Tbl2 on comparison_2
.
.
jointype_n Tbln on comparison_n
WHERE 
ORDER BY 

I want to find the redundant tables in the following query

SELECT BaseTbl.RowID
FROM BaseTbl
jointype_1 Tbl1 on comparison_1
jointype_2 Tbl2 on comparison_2
.
.
jointype_n Tbln on comparison_n
WHERE 
ORDER BY 

I proceed as follows


  1.  Omit a table join from the SQL and try preparing it.
  2.  If it prepares OK then the table isn’t involved in the WHERE or ORDER BY.
  3.  If it’s joined to the BaseTbl by an integer primary key or FULLY joined 
by a unique index then the table is redundant. By FULLY I mean ALL fields of 
the index are included in the join as otherwise the BaseTbl’s relationship with 
it is of a ONE TO MANY nature which means the BaseTbl RowID won’t uniquely 
identify a row of the original query. [Two things I’m unsure about are a) how 
nulls affect unique index  joins and b) how to deal with tables that aren’t 
directly linked to the BaseTbl (i.e. they’re linked via an intermediate table)].
  4.  If the table is needed reintroduce it into the SQL.

Do this in turn for each of the joins.

I’m wondering if the second of those trunk changes is in any way related to 
what I’m trying to do. The above is a bit long winded and not easy to code so 
it would be great if the SQLite query optimizer did it all for me.

To me the time taken to grab a grid page of data is negligible if you know 
where to look for it on disc. I tend therefore to time queries by how fast I 
can get all the BaseTbl RowIDs into a vector. The biggest table in my database 
has 2.4 million rows and yet wait cursors are a very rare sight.

Tom


From: Keith Medcalf<mailto:kmedc...@dessus.com>
Sent: 25 November 2017 18:15
To: SQLite mailing list<mailto:sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Getting an advance list of RowIDs for a query result set


>This is fixed in the current head of trunk.  Although the implementation may 
>change, it will appear in the next release.

https://www.sqlite.org/src/timeline?n=50



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-25 Thread Keith Medcalf

This is fixed in the current head of trunk.  Although the implementation may 
change, it will appear in the next release.

https://www.sqlite.org/src/timeline?n=50

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of x
>Sent: Saturday, 25 November, 2017 10:08
>To: SQLite mailing list
>Subject: Re: [sqlite] Getting an advance list of RowIDs for a query
>result set
>
>Simon, I’ve no users. I’ve been teaching myself c++ (and SQLite)
>during a lengthy illness so this is just experimentation for me.
>
>
>
>This subject touches on a previous question of mine you were involved
>in regarding redundant tables. The following may jog your memory
>
>
>
>create table TblA(A integer primary key, B int, C int);
>
>create table TblB(B integer primary key, BX int);
>
>create table TblC(C integer primary key, CX int);
>
>
>
>explain query plan select A from TblA left join TblB using (B) left
>join TblC using (C) where BX=?;
>
>0|0|0|SCAN TABLE TblA
>
>0|1|1|SEARCH TABLE TblB USING INTEGER PRIMARY KEY (rowid=?)
>
>
>
>explain query plan select A from TblA left join TblB using (B) left
>join TblC using (C) where CX=?;
>
>0|0|0|SCAN TABLE TblA
>
>0|1|1|SEARCH TABLE TblB USING INTEGER PRIMARY KEY (rowid=?)
>
>0|2|2|SEARCH TABLE TblC USING INTEGER PRIMARY KEY (rowid=?)
>
>
>
>In the first explain SQLite drops the trailing redundant table but in
>the second explain it doesn’t drop the middle redundant table.
>
>
>
>As TblB is included in the second query it must surely run slower
>than if it were omitted.
>
>
>
>I’m not complaining about the SQLite optimiser failing to spot the
>redundancy as it’s got to deal with a variety of queries far removed
>from my narrow experiment. Checking for such redundancies would
>likely slow down prepares and, when it comes down to it, anyone
>including TblB in the second query is only getting what they asked
>for.
>
>
>
>I do think though that it’s possible to write code to remove these
>redundancies so as to get the vector of RowIDs as fast as possible.
>So far I’ve been splitting SQL into ‘RowSQL’ (returns RowIDs involved
>in correct order) and ‘ColSQL’ (returns columns requested in original
>SQL for the requested range as shown in my second post) but I’ve only
>been doing it visually via knowledge of the tables. What I’m trying
>to do is write a function to automatically ‘split’ the sql into
>RowSQL and ColSQL. I’ll make another post later showing where I’m at
>with that.
>
>
>From: sqlite-users  on
>behalf of Simon Slavin 
>Sent: Saturday, November 25, 2017 1:26:00 PM
>To: SQLite mailing list
>Subject: Re: [sqlite] Getting an advance list of RowIDs for a query
>result set
>
>
>
>On 25 Nov 2017, at 1:15pm, curmudgeon  wrote:
>
>> Given a select where a 'base table' is attached to lookup tables
>> how can I determine which of the lookup tables can be removed from
>the table
>> such that
>>
>> select BaseTbl.RowID from ... where ... order by ...
>>
>> will find the set of records that represents the original query.
>
>That optimization could be done at the level of the SQL engine.  You
>wouldn’t want to do it inside your own code since that would make
>your code extremely complicated.  So just execute the query without
>trying to optimize it and see what happens.  Does it run fast enough
>for your users ?
>
>Simon.
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-25 Thread x
Simon, I’ve no users. I’ve been teaching myself c++ (and SQLite) during a 
lengthy illness so this is just experimentation for me.



This subject touches on a previous question of mine you were involved in 
regarding redundant tables. The following may jog your memory



create table TblA(A integer primary key, B int, C int);

create table TblB(B integer primary key, BX int);

create table TblC(C integer primary key, CX int);



explain query plan select A from TblA left join TblB using (B) left join TblC 
using (C) where BX=?;

0|0|0|SCAN TABLE TblA

0|1|1|SEARCH TABLE TblB USING INTEGER PRIMARY KEY (rowid=?)



explain query plan select A from TblA left join TblB using (B) left join TblC 
using (C) where CX=?;

0|0|0|SCAN TABLE TblA

0|1|1|SEARCH TABLE TblB USING INTEGER PRIMARY KEY (rowid=?)

0|2|2|SEARCH TABLE TblC USING INTEGER PRIMARY KEY (rowid=?)



In the first explain SQLite drops the trailing redundant table but in the 
second explain it doesn’t drop the middle redundant table.



As TblB is included in the second query it must surely run slower than if it 
were omitted.



I’m not complaining about the SQLite optimiser failing to spot the redundancy 
as it’s got to deal with a variety of queries far removed from my narrow 
experiment. Checking for such redundancies would likely slow down prepares and, 
when it comes down to it, anyone including TblB in the second query is only 
getting what they asked for.



I do think though that it’s possible to write code to remove these redundancies 
so as to get the vector of RowIDs as fast as possible. So far I’ve been 
splitting SQL into ‘RowSQL’ (returns RowIDs involved in correct order) and 
‘ColSQL’ (returns columns requested in original SQL for the requested range as 
shown in my second post) but I’ve only been doing it visually via knowledge of 
the tables. What I’m trying to do is write a function to automatically ‘split’ 
the sql into RowSQL and ColSQL. I’ll make another post later showing where I’m 
at with that.


From: sqlite-users  on behalf of 
Simon Slavin 
Sent: Saturday, November 25, 2017 1:26:00 PM
To: SQLite mailing list
Subject: Re: [sqlite] Getting an advance list of RowIDs for a query result set



On 25 Nov 2017, at 1:15pm, curmudgeon  wrote:

> Given a select where a 'base table' is attached to lookup tables
> how can I determine which of the lookup tables can be removed from the table
> such that
>
> select BaseTbl.RowID from ... where ... order by ...
>
> will find the set of records that represents the original query.

That optimization could be done at the level of the SQL engine.  You wouldn’t 
want to do it inside your own code since that would make your code extremely 
complicated.  So just execute the query without trying to optimize it and see 
what happens.  Does it run fast enough for your users ?

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-25 Thread Simon Slavin


On 25 Nov 2017, at 1:15pm, curmudgeon  wrote:

> Given a select where a 'base table' is attached to lookup tables
> how can I determine which of the lookup tables can be removed from the table
> such that
> 
> select BaseTbl.RowID from ... where ... order by ...
> 
> will find the set of records that represents the original query.

That optimization could be done at the level of the SQL engine.  You wouldn’t 
want to do it inside your own code since that would make your code extremely 
complicated.  So just execute the query without trying to optimize it and see 
what happens.  Does it run fast enough for your users ?

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-25 Thread curmudgeon
Sorry, in last post

select * from (select Value from carray(ID+?1, ?2, 'int64'))
inner join AwfyBigTbl on AwfyBigTbl.RowID = _Value;

by setting ?1 = TopRecNo and ?2 = n.

should read

select * from (select Value from carray(*?1*, ?2, 'int64'))
inner join AwfyBigTbl on AwfyBigTbl.RowID = _Value;

by setting ?1 = *ID + TopRecNo* and ?2 = n.




--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting an advance list of RowIDs for a query result set

2017-11-25 Thread curmudgeon
A trivial example of what I'm trying to do. Given

select * from AwfyBigTbl where ACol=?;

I'd run the query

select RowID from AwfyBigTbl where ACol=?;

step through the records and store the values in a std::vector
called ID.
I could then retrieve n records starting at TopRecNo (0 based) with the
query

select * from (select Value from carray(ID+?1, ?2, 'int64'))
inner join AwfyBigTbl on AwfyBigTbl.RowID = _Value;

by setting ?1 = TopRecNo and ?2 = n.

I realise I could get the same result with

select * from AwfyBigTbl where ACol=? limit n offset TopRecNo;

but the first way will be faster and, as the record count - ID.size() - is
known , it would allow a
record counter to be shown, vertical scrollbar to be sized, last known
record to be relocated etc.

Anyway, what I'm trying to do is find a way of doing the same thing for more
complicated
queries. Given a select where a 'base table' is attached to lookup tables
how can I determine which of the lookup tables can be removed from the table
such that

select BaseTbl.RowID from ... where ... order by ...

will find the set of records that represents the original query.

Hope that's clearer.





--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Getting an advance list of RowIDs for a query result set

2017-11-24 Thread x
For a complex query you can often get a list of the base table RowIDs very 
quickly with a simple query and then use an array of those values (along with 
the carray virtual table) to retrieve sections of data from the complex query 
almost instantly. I've been doing this for a while but would love to find a way 
of doing it automatically. The problem is as follows

Suppose I have the following query

SELECT ColA, ColB, ColC, 
FROM BaseTbl
jointype_1 Tbl1 on comparison_1
jointype_2 Tbl2 on comparison_2
.
.
jointype_n Tbln on comparison_n
WHERE 
ORDER BY 

and I want a list of the BaseTbl's rowid's for all records returned by the 
above query in
the order defined by ORDER BY. I also want it in the fastest possible way. In 
other words I
want

SELECT BaseTbl.RowID
FROM BaseTbl
jointype_1 Tbl1 on comparison_1
jointype_2 Tbl2 on comparison_2
.
.
jointype_n Tbln on comparison_n
WHERE 
ORDER BY 

with all redundant tables removed.

The question is, how to determine what tables are redundant?

Obviously any table involved in the WHERE or ORDER BY has to be retained aa 
does any table
acting as an intermediate to join those tables back to the BaseTbl.

Of the other tables I'm thinking I can discard table_i provided it's joined to 
the BaseTbl by a table_i unique index and comparison_i covers all columns of 
that index. It's fairly obvious if jointype_i == left join but I'm thinking it 
works also for inner join. Can anyone give me a counter example?

Anyone think of other ways?

I'm also wondering if the result produced by 'explain query plan' could 
simplify the coding although I know use of that isn't recommended as the 
structure of it is not guaranteed.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users