Re: [sqlite] Query Planner GROUP BY and HAVING clauses optimization ?

2020-01-15 Thread Jean-Baptiste Gardette

Thank you, Keith, for the detailed explanation.
I had misunderstood the two replies as contradicting each other, but that is not the case.

Thank you again
Jean-Baptiste
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query Planner GROUP BY and HAVING clauses optimization ?

2020-01-15 Thread Keith Medcalf

On Wednesday, 15 January, 2020 02:06, Jean-Baptiste Gardette 
 wrote:

> Just to be sure, is it unsafe to write a non-aggregate SELECT with GROUP
> BY and HAVING clauses (without sub-SELECT) for the sole purpose
> explained before (even if the approach is debatable)?

Presently, yes it is.

>I understand 2 different answers here :

>- "No, this kind of query can't be rewritten by the optimizer for the
>technical reasons (VDBE, index etc)"

This is presently the case.  Current versions of the query planner will not 
optimize these queries but will instead execute them as written.  That is, 
statements of the form "select ... from ... where ... group by ... having ... 
order by ... limit ... offset ..." will not convert the "group by" into an 
"order by" or the "having" into a "where".  The provers required to allow this 
sort of transformation do not exist at present.

>- "Yes it is unsafe, a future version of SQLite may optimize differently
>this kind of query"

Yes.  A future version of SQLite may indeed process the query by turning the 
"group by" into an "order by" (if there isn't one, or just ignoring it if there 
is an order by, or perhaps merging them) and moving the "having" to a "where" 
condition.  Doing this would require that the optimizer recognize that the 
group by expression can only result in single row groups and that neither the 
select list nor the having expression contain aggregate functions.  There is 
almost nothing to be gained from this optimization, however, so it is highly 
unlikely that such provers would be written in order to implement this 
particular optimization.
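
To make that concrete, here is a minimal sketch (using the t1 table from the 
original question) of the statement as it is executed today and the rewrite 
such provers would have to justify; the second form is purely illustrative, 
the planner does not produce it:

-- executed as written: grouping pass, then HAVING applied per group
SELECT * FROM t1 GROUP BY a HAVING b > 1;

-- hypothetical rewrite, valid only when every group is a single row and
-- neither the select list nor the HAVING expression uses an aggregate
SELECT * FROM t1 WHERE b > 1 ORDER BY a;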

Contrast this with recent optimizations that have been added, for example, the 
LEFT JOIN optimization which downgrades an outer join into an inner join if it 
can be proved that the overall result will be the same (the extra candidates 
generated by the outer join will be removed from the result set by a where 
clause, so going to all the bother of adding and processing them in the first 
place serves no purpose), or that leaf tables which are not referenced in the 
select list are removed from the query since generating those results merely 
incurs a cost to no effect.  These optimizations can have a significant impact 
on performance.
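
For illustration, a minimal sketch of the kind of statement the LEFT JOIN 
optimization applies to; the orders/customers tables are invented for the 
example and are not from this thread:

-- The WHERE clause discards the NULL-extended rows an unmatched outer join
-- would add, so the outer join can safely be treated as an inner join.
SELECT o.id, c.name
  FROM orders o LEFT JOIN customers c ON c.id = o.customer_id
 WHERE c.name = 'Smith';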

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume. 



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query Planner GROUP BY and HAVING clauses optimization ?

2020-01-15 Thread Jean-Baptiste Gardette
Just to be sure, is it unsafe to write a non-aggregate SELECT with GROUP 
BY and HAVING clauses (without sub-SELECT)
for the sole purpose explained before (even if the approach is 
debatable)?


I understand 2 different answers here :

- "No, this kind of query can't be rewritten by the optimizer for the 
technical reasons (VDBE, index etc)"


- "Yes it is unsafe, a future version of SQLite may optimize differently 
this kind of query"


I can't find anything related to my problem in the doc 
https://sqlite.org/optoverview.html

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query Planner GROUP BY and HAVING clauses optimization ?

2020-01-14 Thread Richard Hipp
On 1/14/20, Keith Medcalf  wrote:
>
> I seem to recall something about "expensive" conditions that will be forced
> to be run on only as few surviving candidate rows as possible, but my
> recollection is vague (they say the memory is the second thing to go --
> strange I can't remember the first).
>
> Anyway, Richard may be able to help here.

Maybe you are thinking of SQLITE_ENABLE_SORTER_REFERENCES.
https://www.sqlite.org/compile.html#enable_sorter_references


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query Planner GROUP BY and HAVING clauses optimization ?

2020-01-14 Thread Keith Medcalf

On Tuesday, 14 January, 2020 09:23, Simon Slavin  wrote:

>Would it be possible to phrase your SELECT as a SELECT with a sub-SELECT
>?  Have the sub-SELECT figure out which rows you want in which order,
>then use a SELECT to apply your UDF to them ?  It is guaranteed that the
>sub-SELECT is processed before the SELECT.

you mean if you do something like this:

select *
  from (
        select *
          from ...
         where ...
       )
 where ...

?

If the query flattener is operating, the optimizer will push the outer where 
condition into the inner query, after which there is no guarantee that the 
conditions in the outer query will be processed *after* the conditions in the 
inner query, especially not if the pushed terms are estimated to reduce the 
number of candidates "quicker" by executing them sooner.
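
As a made-up illustration of that push-down (t, x, y and expensive_udf are 
invented names, not from the original query): once flattening happens the two 
statements below are effectively the same, so the planner is again free to 
order the terms as it sees fit:

-- nested form, hoping the inner WHERE runs first
SELECT * FROM (SELECT * FROM t WHERE x > 0) WHERE expensive_udf(y) <> 0;

-- what the flattener effectively produces
SELECT * FROM t WHERE x > 0 AND expensive_udf(y) <> 0;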

I seem to recall something about "expensive" conditions that will be forced to 
be run on only as few surviving candidate rows as possible, but my recollection 
is vague (they say the memory is the second thing to go -- strange I can't 
remember the first).

Anyway, Richard may be able to help here.

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query Planner GROUP BY and HAVING clauses optimization ?

2020-01-14 Thread Simon Slavin
On 14 Jan 2020, at 4:14pm, Jean-Baptiste Gardette  wrote:

> The reason I asked this is that I have a query in which one condition 
> filtering the recordset involves
> a UDF, and this UDF needs to be processed after all table filters have been 
> applied

You cannot guarantee this.  And even if you find a solution that works now, the 
optimizer in a future version of SQLite might work differently.

Would it be possible to phrase your SELECT as a SELECT with a sub-SELECT ?  
Have the sub-SELECT figure out which rows you want in which order, then use a 
SELECT to apply your UDF to them ?  It is guaranteed that the sub-SELECT is 
processed before the SELECT.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query Planner GROUP BY and HAVING clauses optimization ?

2020-01-14 Thread Jean-Baptiste Gardette

Thank you Dominic and Keith for your replies

The reason I asked this is that I have a query in which one condition 
filtering the recordset involves
a UDF, and this UDF needs to be processed after all table filters have 
been applied


Illustration:

additional table:

CREATE TABLE t2 (
a TEXT PRIMARY KEY,
b INTEGER,
c REAL);

UDF:
proc CallUDF {key} {
    global myDict
    if {[dict exists $myDict $key]} {
        dict get $myDict $key   ;# returns the value stored in myDict
    } else {
        error "no entry in myDict for $key"   ;# error message added for illustration
    }
}
db function CallUDF -deterministic CallUDF

If I do:

SELECT *
FROM t1 CROSS JOIN t2
WHERE t1.b > 10
  AND t2.a = t1.a and t2.c < 0
  AND CallUDF(t2.a) <> 0

Even though "CROSS JOIN" garanties that t2 is traversed after t1 has 
been filtered on "t1.b > 10",
it seems there is no garanty when t2 is traversed, that the filter 
CallUDF(t2.a) <> 0 is evaluated *after*

filters "t2.b = t1.b and t2.c < 0".

In this case the only solution we found is:

SELECT *
FROM t1, t2
WHERE t1.b > 10
  AND t2.a = t1.a and t2.c < 0
GROUP BY t2.a
HAVING CallUDF(t2.a) <> 0

Thanks again
Jean-Baptiste
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query Planner GROUP BY and HAVING clauses optimization ?

2020-01-14 Thread Keith Medcalf

On Tuesday, 14 January, 2020 06:58, Jean-Baptiste Gardette 
 wrote:

>Consider the following example:

>CREATE TABLE t1 (
>a TEXT PRIMARY KEY,
>b INTEGER);

>SELECT *
>FROM t1
>GROUP BY a
>HAVING b > 1;

>Will the GROUP BY clause be suppressed and the HAVING clause be rewritten as a
>WHERE clause by the optimizer?

No.  The VDBE code will be generated as you have specified.  The "group by a" 
will be used to select the index to be used to access the table data, and at 
the end of the processing of each group of records, the "having b > 1" will be 
applied to the group to determine whether the group is output.

The PRIMARY KEY constraint on "a" does not make "a" unique, since "a" is not 
constrained NOT NULL, so there may be multiple records in the same group.

Even if "a" is constrained NOT NULL the query is still processed as a group 
by/having even though each group can only consist of one record, and only in 
that case is the query equivalent to "select * from t1 where b > 1 
order by a".

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query Planner GROUP BY and HAVING clauses optimization ?

2020-01-14 Thread Dominique Devienne
On Tue, Jan 14, 2020 at 2:57 PM Jean-Baptiste Gardette
 wrote:
> SELECT * FROM t1 GROUP BY a HAVING b > 1;
>
> Will the GROUP BY clause be suppressed and the HAVING clause be rewritten as a WHERE 
> clause by the optimizer?

My question would be why you wouldn't write it as a WHERE clause in
the first place :)
Sorry, OT, I know. --DD
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query Planner GROUP BY and HAVING clauses optimization ?

2020-01-14 Thread Jean-Baptiste Gardette

Hi,
Consider the following example:

CREATE TABLE t1 (
a TEXT PRIMARY KEY,
b INTEGER);

SELECT *
FROM t1
GROUP BY a
HAVING b > 1;

Will the GROUP BY clause be suppressed and the HAVING clause be rewritten as a 
WHERE clause by the optimizer?


Jean-Baptiste
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query regarding Sqlite 3.30.1

2019-12-19 Thread Richard Hipp
On 12/19/19, Uthra Ganesan  wrote:
> Hi
>
> We are currently using SQLite 3.28.0. Can you please let us know when we can
> expect 3.30.1 in the Maven central repository?
>

Please ask whoever controls the "Maven central repository".  They
will make that determination, not us.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query regarding Sqlite 3.30.1

2019-12-19 Thread Uthra Ganesan
Hi

We are currently using SQLite 3.28.0. Can you please let us know when we can 
expect 3.30.1 in the Maven central repository?

Thanks
Uthra
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query freezing on latest SQLite 3.30.1 build

2019-12-05 Thread David Raymond
The mailing list strips off all attachments, so you'll have to provide another 
place to get that.

Alternatively could you post the schema here as text, along with the explain 
query plan output from the slow version and from a fast version?
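
In case it helps, a minimal sketch of how that plan output can be captured in 
the sqlite3 shell; the statement itself is the one from the report, unchanged:

.eqp on
-- with .eqp on the shell prints the plan before running each statement;
-- alternatively, prefix the statement explicitly:
EXPLAIN QUERY PLAN
SELECT COUNT(*) FROM
(SELECT
 src.member_id src_id,
 dst.member_id dst_id,
 asg.contribution_percentage
FROM
 mdl_assignments asg
INNER JOIN
 mdl_member_instances src ON asg.source_mbi_id = src.id
INNER JOIN
 mdl_member_instances dst ON asg.destination_mbi_id = dst.id
WHERE
 src.period_scenario_id = 1 AND dst.period_scenario_id = 1) T;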

-Original Message-
From: sqlite-users  On Behalf Of 
Clovis Ribeiro,MyABCM
Sent: Thursday, December 5, 2019 2:04 PM
To: sqlite-users@mailinglists.sqlite.org
Subject: [sqlite] Query freezing on latest SQLite 3.30.1 build

Folks,

The following query, when executed against the attached database using SQLite 
3.30.1 (running on Windows OS) will take forever to execute. If we remove all 
columns from both tables that are not actually used in the query, it is 
executed in milliseconds.

SELECT COUNT(*) FROM
(SELECT
 src.member_id src_id,
 dst.member_id dst_id,
 asg.contribution_percentage
FROM
 mdl_assignments asg
INNER JOIN
 mdl_member_instances src ON asg.source_mbi_id = src.id
INNER JOIN
 mdl_member_instances dst ON asg.destination_mbi_id = dst.id
WHERE
 src.period_scenario_id = 1 AND dst.period_scenario_id = 1) T

When testing the same query with older versions of SQLite we used in the past 
(more than 6 years ago), the query also executed in milliseconds.

We have executed several different tests but could not figure out why this 
query hangs on the latest version of SQLite but runs fast in older versions or 
when we remove columns from the tables in the database.

Hope this can help you improve SQLite.

Thanks

Clovis Ribeiro
MyABCM

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query freezing on latest SQLite 3.30.1 build

2019-12-05 Thread Clovis Ribeiro,MyABCM
Folks,

The following query, when executed against the attached database using SQLite 
3.30.1 (running on Windows OS) will take forever to execute. If we remove all 
columns from both tables that are not actually used in the query, it is 
executed in milliseconds.

SELECT COUNT(*) FROM
(SELECT
 src.member_id src_id,
 dst.member_id dst_id,
 asg.contribution_percentage
FROM
 mdl_assignments asg
INNER JOIN
 mdl_member_instances src ON asg.source_mbi_id = src.id
INNER JOIN
 mdl_member_instances dst ON asg.destination_mbi_id = dst.id
WHERE
 src.period_scenario_id = 1 AND dst.period_scenario_id = 1) T

When testing the same query with older versions of SQLite we used in the past 
(more than 6 years ago), the query also executed in milliseconds.

We have executed several different tests but could not figure out why this 
query hangs on the latest version of SQLite but runs fast in older versions or 
when we remove columns from the tables in the database.

Hope this can help you improve SQLite.

Thanks

Clovis Ribeiro
MyABCM

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query for Many to Many

2019-09-06 Thread Simon Slavin
On 6 Sep 2019, at 7:36am, Rowan Worth  wrote:

> I was surprised when this behaved differently in other SQL engines. eg. in
> SQLite you can write:
> 
> SELECT col1, col2 FROM table1, table2 USING (commonId)

But please don't, for the reason you gave.  Not only is it ambiguous but 
different SQL engines interpret it differently.  SQLite uses PostgreSQL as a 
model for many things but here they diverge.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query for Many to Many

2019-09-06 Thread Rowan Worth
On Tue, 3 Sep 2019 at 22:17, Keith Medcalf  wrote:

> And the "," in the list of tables may be replaced by the word JOIN.  It is
> merely an alternate spelling.
>

I was surprised when this behaved differently in other SQL engines. eg. in
SQLite you can write:

SELECT col1, col2 FROM table1, table2 USING (commonId)

But in e.g. postgres it must be written using "table1 JOIN table2" rather
than the comma, because postgres treats "table1, table2" as "table1 JOIN
table2 ON TRUE", resulting in a conflict with the USING clause.

-Rowan
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query planner regression with FTS4: slower path is chosen

2019-09-04 Thread Dan Kennedy


On 2/9/62 16:57, Paul wrote:

It has been a while without a response, so I am just bumping this message.


19 July 2019, 14:21:27, by "Paul" :


I have a test case where the regression can be observed in queries that
use JOINs with FTS4 tables, somewhere in between 3.22.0 and 3.23.0.
For some reason the planner decides to search the non-FTS table first and then
scan the whole FTS table. Version 3.22.0 is the last unaffected version, while
the issue is still present in HEAD.

Probably it has something to do with the fact that, according to EXPLAIN,
the new version of the planner ignores the LEFT join and considers it just a JOIN.



Suspect that that is the change. The LEFT JOIN is equivalent to a 
regular join in this case due to the "bar = 1" term in the WHERE clause.


Running ANALYZE after the index is created in the example script causes 
SQLite to pick a better plan.


Or, changing the LEFT JOIN to CROSS JOIN works to force SQLite to pick 
the plan you want.
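
As a concrete sketch of that CROSS JOIN rewrite, applied to the slow query 
from the test case below (untested here; CROSS JOIN only constrains the join 
order, the result is the same):

-- force 'search' to be the outer loop so the MATCH runs first
SELECT COUNT() FROM search CROSS JOIN foo
  ON s_docid = docid
  WHERE bar = 1 AND search MATCH 'test*';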


FTS5 does a little better with the query, but only because it runs 
faster - it still picks the slow plan. There might be room to improve 
this in FTS5, but probably not for FTS3/4, which are now focused very 
much on backwards compatibility.


Dan.





At least it feels that way, anyway.

Test case:


CREATE VIRTUAL TABLE search USING FTS4(text);

WITH RECURSIVE
  cnt(x) AS (
 SELECT 1
 UNION ALL
 SELECT x+1 FROM cnt
  LIMIT 2000
  )
INSERT INTO search(docid, text) SELECT x, 'test' || x FROM cnt;

CREATE TABLE foo(s_docid integer primary key, bar integer);

WITH RECURSIVE
  cnt(x) AS (
 SELECT 1
 UNION ALL
 SELECT x+1 FROM cnt
  LIMIT 2000
  )
INSERT INTO foo(s_docid, bar) SELECT x, 1 FROM cnt;

.timer on

-- Fast
SELECT COUNT() FROM search LEFT JOIN foo
  ON s_docid = docid
  WHERE bar = 1 AND search MATCH 'test*';

-- Fast
SELECT COUNT() FROM foo
  WHERE bar = 1
  AND s_docid IN (
SELECT docid FROM search WHERE search MATCH 'test*'
  );

-- Create index, as some real-life queries use searches by `bar`
CREATE INDEX foo_bar_idx ON foo (bar);

-- Slow
SELECT COUNT() FROM search LEFT JOIN foo
  ON s_docid = docid
  WHERE bar = 1 AND search MATCH 'test*';

-- As fast as before (current workaround)
SELECT COUNT() FROM foo
  WHERE bar = 1
  AND s_docid IN (
SELECT docid FROM search WHERE search MATCH 'test*'
  );


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query for Many to Many

2019-09-03 Thread Keith Medcalf

On Tuesday, 3 September, 2019 07:19, dboland9  wrote:

>Thanks for the info.  As I understand things, this is an implicit join.  

The syntax for a SELECT statement goes like this:

SELECT <the columns I want>
FROM <the tables I want them from>
WHERE <the row conditions I want applied>
GROUP BY <how I want the rows grouped>
HAVING <the group conditions I want applied>
ORDER BY <the order I want the results in>

>I did try it, and it works just fine.  However, in my reading it appears that
>implicit joins are discouraged, and will only do left joins.

Implicit joins (whatever that means) are not discouraged.  And the "," in the 
list of tables may be replaced by the word JOIN.  It is merely an alternate 
spelling.  And the word ON is merely an alternate spelling of AND (plus some 
parentheses -- you put parentheses around the existing WHERE clause, put 
parentheses around the ON clause, change the word ON to AND, and tack the 
result onto the end of the (now parenthesized) WHERE clause).

Using "commas" instead of "JOIN" there is no way to specify a particular join 
type, so "," always means "INNER JOIN" -- so if you need to specify a JOIN type 
other than a plain projection (inner) you must use the "table <JOIN type> 
table" specification.  Except for OUTER JOIN operations, the ON clause is 
merely a WHERE condition (in outer joins the ON clause binds the table that is 
being outer joined; in the olden days one used the special operator *= or =* 
or *=* to indicate outer joins in the WHERE clause).  The ON clause does not 
even need to contain references to tables that have already been mentioned, 
since it is merely syntactic sugar.  You can even have one without using the 
word JOIN at all, as in "SELECT a,b,c FROM x,y,z ON x.a = y.b and y.g = z.c 
WHERE z.c = 5 or x.a = 3", which translates to "SELECT a,b,c FROM x,y,z WHERE 
(z.c = 5 or x.a = 3) and x.a = y.b and y.g = z.c"

You will notice that all the things in the SELECT statement are defined as "I 
want".  This is what makes SQL a declarative language -- you declare what you 
want, and it figures out how to go get it.

>Also, it looks like you are using aliasing to shorten the query strings -
>true or false?

Yes.  The tables are aliased and the AS keyword is omitted.
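
For instance, a minimal sketch of the same FROM clause written with and 
without the optional AS keyword (using the tables from this thread):

-- explicit AS
select a.author_lname, b.book_title
  from author_books as ab, author as a, books as b
 where a.author_id = ab.author_id and b.book_isbn = ab.book_isbn;

-- identical meaning with AS omitted
select a.author_lname, b.book_title
  from author_books ab, author a, books b
 where a.author_id = ab.author_id and b.book_isbn = ab.book_isbn;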

>‐‐‐ Original Message ‐‐‐
>On Tuesday, September 3, 2019 7:32 AM, John G 
>wrote:
>
>> Or without the added calories (syntactic sugar) :
>>
>> select a.*, b.*
>> from author_books ab, author a, books b
>> where a.author_id = ab.author_id
>> and b.book_isbn = ab.book_isbn
>>
>> On Tue, 27 Aug 2019 at 15:52, David Raymond david.raym...@tomtom.com
>> wrote:
>>
>> > It does support natural joins. <insert "subject to future changes" comments here>
>> > USING needs parenthesis around the column list: ...using
>> > (author_id)...using (book_isbn)...
>> > -Original Message-
>> > From: sqlite-users sqlite-users-boun...@mailinglists.sqlite.org On
>> > Behalf Of Dominique Devienne
>> > Sent: Tuesday, August 27, 2019 10:08 AM
>> > To: SQLite mailing list sqlite-users@mailinglists.sqlite.org
>> > Subject: Re: [sqlite] Query for Many to Many
>> > On Tue, Aug 27, 2019 at 4:00 PM Dominique Devienne ddevie...@gmail.com
>> > wrote:
>> >
>> > > select author.*, books.*
>> > > from author_books
>> > > join author on author.author_id = author_books.author_id
>> > > join books on books.book_isbn = author_books.book_isbn
>> >
>> > Which can also be written:
>> > select author.*, books.*
>> > from author_books
>> > join author using author_id
>> > join books using book_isbn
>> > Or even:
>> > select author.*, books.*
>> > from author_books
>> > natural join author
>> > natural join books
>> > All of the above untested of course :).
>> > Not even sure SQLite supports natural join or not (I'd guess it does).
>--DD
>> > https://stackoverflow.com/questions/45531762/sql-join-using-vs-join-on
>> > https://stackoverflow.com/questions/8696383/difference-between-natural-join-and-inner-join
>> >
>> > sqlite-users mailing list
>> > sqlite-users@mailinglists.sqlite.org
>> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>> >
>> > sqlite-users mailing list
>> > sqlite-users@mailinglists.sqlite.org
>> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query for Many to Many

2019-09-03 Thread dboland9
Thanks for the info.  As I understand things, this is an implicit join.  I did 
try it, and it works just fine.  However, in my reading it appears that 
implicit joins are discouraged, and will only do left joins.

Also, it looks like you are using aliasing to shorten the query strings - true 
or false?

Dave,


Sent with ProtonMail Secure Email.

‐‐‐ Original Message ‐‐‐
On Tuesday, September 3, 2019 7:32 AM, John G  wrote:

> Or without the added calories (syntactic sugar) :
>
> select a.*, b.*
> from author_books ab, author a, books b
> where a.author_id = ab.author_id
> and b.book_isbn = ab.book_isbn
>
> On Tue, 27 Aug 2019 at 15:52, David Raymond david.raym...@tomtom.com
> wrote:
>
> > It does support natural joins. <insert "subject to future changes" comments here>
> > USING needs parenthesis around the column list: ...using
> > (author_id)...using (book_isbn)...
> > -Original Message-
> > From: sqlite-users sqlite-users-boun...@mailinglists.sqlite.org On
> > Behalf Of Dominique Devienne
> > Sent: Tuesday, August 27, 2019 10:08 AM
> > To: SQLite mailing list sqlite-users@mailinglists.sqlite.org
> > Subject: Re: [sqlite] Query for Many to Many
> > On Tue, Aug 27, 2019 at 4:00 PM Dominique Devienne ddevie...@gmail.com
> > wrote:
> >
> > > select author.*, books.*
> > > from author_books
> > > join author on author.author_id = author_books.author_id
> > > join books on books.book_isbn = author_books.book_isbn
> >
> > Which can also be written:
> > select author.*, books.*
> > from author_books
> > join author using author_id
> > join books using book_isbn
> > Or even:
> > select author.*, books.*
> > from author_books
> > natural join author
> > natural join books
> > All of the above untested of course :).
> > Not even sure SQLite supports natural join or not (I'd guess it does). --DD
> > https://stackoverflow.com/questions/45531762/sql-join-using-vs-join-on
> > https://stackoverflow.com/questions/8696383/difference-between-natural-join-and-inner-join
> >
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query for Many to Many

2019-09-03 Thread John G
Or without the added calories (syntactic sugar) :

select a.*, b.*
from author_books ab, author a, books b
where  a.author_id  = ab.author_id
  and  b.book_isbn = ab.book_isbn

On Tue, 27 Aug 2019 at 15:52, David Raymond 
wrote:

> It does support natural joins. <insert "subject to future changes" comments here>
>
> USING needs parenthesis around the column list: ...using
> (author_id)...using (book_isbn)...
>
>
> -Original Message-
> From: sqlite-users  On
> Behalf Of Dominique Devienne
> Sent: Tuesday, August 27, 2019 10:08 AM
> To: SQLite mailing list 
> Subject: Re: [sqlite] Query for Many to Many
>
> On Tue, Aug 27, 2019 at 4:00 PM Dominique Devienne 
> wrote:
>
> > select author.*, books.*
> >   from author_books
> >   join author on author.author_id  = author_books.author_id
> >   join books  on books.book_isbn   = author_books.book_isbn
> >
>
> Which can also be written:
>
> select author.*, books.*
>   from author_books
>   join author using author_id
>   join books  using book_isbn
>
> Or even:
>
> select author.*, books.*
>   from author_books
>   natural join author
>   natural join books
>
> All of the above untested of course :).
> Not even sure SQLite supports natural join or not (I'd guess it does). --DD
>
> https://stackoverflow.com/questions/45531762/sql-join-using-vs-join-on
>
> https://stackoverflow.com/questions/8696383/difference-between-natural-join-and-inner-join
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query planner regression with FTS4: slower path is chosen

2019-09-02 Thread Paul
It has been a while without a response, so I am just bumping this message.


19 July 2019, 14:21:27, by "Paul" :

> I have a test case where the regression can be observed in queries that
> use JOINs with FTS4 tables, somewhere in between 3.22.0 and 3.23.0.
> For some reason the planner decides to search the non-FTS table first and then
> scan the whole FTS table. Version 3.22.0 is the last unaffected version, while
> the issue is still present in HEAD.
> 
> Probably it has something to do with the fact that, according to EXPLAIN,
> the new version of the planner ignores the LEFT join and considers it just a JOIN.
> At least it feels that way, anyway.
> 
> Test case:
> 
> 
> CREATE VIRTUAL TABLE search USING FTS4(text);
> 
> WITH RECURSIVE
>  cnt(x) AS (
> SELECT 1
> UNION ALL
> SELECT x+1 FROM cnt
>  LIMIT 2000
>  )
> INSERT INTO search(docid, text) SELECT x, 'test' || x FROM cnt;
> 
> CREATE TABLE foo(s_docid integer primary key, bar integer);
> 
> WITH RECURSIVE
>  cnt(x) AS (
> SELECT 1
> UNION ALL
> SELECT x+1 FROM cnt
>  LIMIT 2000
>  )
> INSERT INTO foo(s_docid, bar) SELECT x, 1 FROM cnt;
> 
> .timer on
> 
> -- Fast
> SELECT COUNT() FROM search LEFT JOIN foo  
>  ON s_docid = docid 
>  WHERE bar = 1 AND search MATCH 'test*';
> 
> -- Fast
> SELECT COUNT() FROM foo 
>  WHERE bar = 1 
>  AND s_docid IN (
>SELECT docid FROM search WHERE search MATCH 'test*'
>  );
> 
> -- Create index, as some real-life queries use searches by `bar`
> CREATE INDEX foo_bar_idx ON foo (bar);
> 
> -- Slow
> SELECT COUNT() FROM search LEFT JOIN foo
>  ON s_docid = docid 
>  WHERE bar = 1 AND search MATCH 'test*';
> 
> -- As fast as before (current workaround)
> SELECT COUNT() FROM foo 
>  WHERE bar = 1 
>  AND s_docid IN (
>SELECT docid FROM search WHERE search MATCH 'test*'
>  );
> 
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query for Many to Many

2019-08-27 Thread David Raymond
It does support natural joins. <insert "subject to future changes" comments here>

USING needs parenthesis around the column list: ...using (author_id)...using 
(book_isbn)...


-Original Message-
From: sqlite-users  On Behalf Of 
Dominique Devienne
Sent: Tuesday, August 27, 2019 10:08 AM
To: SQLite mailing list 
Subject: Re: [sqlite] Query for Many to Many

On Tue, Aug 27, 2019 at 4:00 PM Dominique Devienne 
wrote:

> select author.*, books.*
>   from author_books
>   join author on author.author_id  = author_books.author_id
>   join books  on books.book_isbn   = author_books.book_isbn
>

Which can also be written:

select author.*, books.*
  from author_books
  join author using author_id
  join books  using book_isbn

Or even:

select author.*, books.*
  from author_books
  natural join author
  natural join books

All of the above untested of course :).
Not even sure SQLite supports natural join or not (I'd guess it does). --DD

https://stackoverflow.com/questions/45531762/sql-join-using-vs-join-on
https://stackoverflow.com/questions/8696383/difference-between-natural-join-and-inner-join
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query for Many to Many

2019-08-27 Thread Dominique Devienne
On Tue, Aug 27, 2019 at 4:00 PM Dominique Devienne 
wrote:

> select author.*, books.*
>   from author_books
>   join author on author.author_id  = author_books.author_id
>   join books  on books.book_isbn   = author_books.book_isbn
>

Which can also be written:

select author.*, books.*
  from author_books
  join author using author_id
  join books  using book_isbn

Or even:

select author.*, books.*
  from author_books
  natural join author
  natural join books

All of the above untested of course :).
Not even sure SQLite supports natural join or not (I'd guess it does). --DD

https://stackoverflow.com/questions/45531762/sql-join-using-vs-join-on
https://stackoverflow.com/questions/8696383/difference-between-natural-join-and-inner-join
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query for Many to Many

2019-08-27 Thread Dominique Devienne
On Tue, Aug 27, 2019 at 3:38 PM dboland9  wrote:

> I need some help writing some queries for a MTM relationship.  The example
> tables are:
>
> author table      books table        author_books table
> author_id PK      book_isbn PK       a_b_id PK
> author_fname      book_title         author_id FK
> author_lname      book_pub_date      book_isbn FK
> author_minit
>
>
> Listings desired:
> book_isbn   book_title  book_pub_date   author
> --++--+---
>
> author     book_isbn    Book_title
> +-+
>
> Would appreciate the query (inner join - that I do know), and why so I can
> learn something from them.  Please keep them simple (no alias or other
> shortcuts) so I can easily follow what you are doing.  Thanks in advance.
>

Well, that's all you need, inner join, just two of them. Nothing difficult
here IMHO. Or I'm missing something. --DD


> I assume the query will be something like:
>   SELECT
> books.book_isbn, books.book_title, books.book_pub_date,
> author.author_fname, author.author_minit,
> author.author_lname
>   FROM books
>   JOIN
> author_books ON (something )


select author.*, books.*
  from author_books
  join author on author.author_id  = author_books.author_id
  join books  on books.book_isbn   = author_books.book_isbn
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query for Many to Many

2019-08-27 Thread David Raymond
The basic query is going to be the below

select stuff
from

books
inner join author_books
on author_books.book_isbn = books.book_isbn
inner join author
on author_books.author_id = author.author_id

where things;



-Original Message-
From: sqlite-users  On Behalf Of 
dboland9
Sent: Tuesday, August 27, 2019 9:38 AM
To: sqlite-users@mailinglists.sqlite.org
Subject: [sqlite] Query for Many to Many

All,

I need some help writing some queries for a MTM relationship.  The example 
tables are:

author table      books table        author_books table
author_id PK      book_isbn PK       a_b_id PK
author_fname      book_title         author_id FK
author_lname      book_pub_date      book_isbn FK
author_minit


Listings desired:
    book_isbn   book_title  book_pub_date   author
    --++--+---
   
    author book_isbn    Book_title
    +-+

Would appreciate the query (inner join - that I do know), and why so I can 
learn something from them.  Please keep them simple (no alias or other 
shortcuts) so I can easily follow what you are doing.  Thanks in advance.

I assume the query will be something like:
  SELECT
books.book_isbn, books.book_title, books.book_pub_date,
author.author_fname, author.author_minit,
author.author_lname
  FROM books
  JOIN
author_books ON (something )

Dave,



Sent with ProtonMail Secure Email.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query for Many to Many

2019-08-27 Thread dboland9
All,

I need some help writing some queries for a MTM relationship.  The example 
tables are:

author table      books table        author_books table
author_id PK      book_isbn PK       a_b_id PK
author_fname      book_title         author_id FK
author_lname      book_pub_date      book_isbn FK
author_minit


Listings desired:
    book_isbn   book_title  book_pub_date   author
    --++--+---
   
    author book_isbn    Book_title
    +-+

Would appreciate the query (inner join - that I do know), and why so I can 
learn something from them.  Please keep them simple (no alias or other 
shortcuts) so I can easily follow what you are doing.  Thanks in advance.

I assume the query will be something like:
  SELECT
books.book_isbn, books.book_title, books.book_pub_date,
author.author_fname, author.author_minit,
author.author_lname
  FROM books
  JOIN
author_books ON (something )

Dave,



Sent with ProtonMail Secure Email.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query Builder Access

2019-07-21 Thread Simon Slavin
On 20 Jul 2019, at 7:33pm, Revere Perkins  wrote:

> Is the SQL Query Builder only available in the licensed version? 

Sorry.  SQLite is always completely free, and there's no Query Builder included 
as part of SQLite.  You must be asking about a third-party product.  And there 
are lots of SQL Query Builder products so we don't know which one.

You might do better looking at the web pages for whatever product you're 
talking about.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query Builder Access

2019-07-21 Thread Revere Perkins
Hi,
Is the SQL Query Builder only available in the licensed version? If it is
available in the free version, how does one access it? The documentation
says something like 'go to the Query Builder page', but I don't find
mention of it under any menu or via an icon.
Thanks,
Revere
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query planner regression with FTS4: slower path is chosen

2019-07-19 Thread Paul
I have a test case where the regression can be observed in queries that
use JOINs with FTS4 tables, somewhere in between 3.22.0 and 3.23.0.
For some reason the planner decides to search the non-FTS table first and then
scan the whole FTS table. Version 3.22.0 is the last unaffected version, while
the issue is still present in HEAD.

Probably it has something to do with the fact that, according to EXPLAIN,
the new version of the planner ignores the LEFT join and considers it just a JOIN.
At least it feels that way, anyway.

Test case:


CREATE VIRTUAL TABLE search USING FTS4(text);

WITH RECURSIVE
 cnt(x) AS (
SELECT 1
UNION ALL
SELECT x+1 FROM cnt
 LIMIT 2000
 )
INSERT INTO search(docid, text) SELECT x, 'test' || x FROM cnt;

CREATE TABLE foo(s_docid integer primary key, bar integer);

WITH RECURSIVE
 cnt(x) AS (
SELECT 1
UNION ALL
SELECT x+1 FROM cnt
 LIMIT 2000
 )
INSERT INTO foo(s_docid, bar) SELECT x, 1 FROM cnt;

.timer on

-- Fast
SELECT COUNT() FROM search LEFT JOIN foo  
 ON s_docid = docid 
 WHERE bar = 1 AND search MATCH 'test*';

-- Fast
SELECT COUNT() FROM foo 
 WHERE bar = 1 
 AND s_docid IN (
   SELECT docid FROM search WHERE search MATCH 'test*'
 );

-- Create index, as some real-life queries use searches by `bar`
CREATE INDEX foo_bar_idx ON foo (bar);

-- Slow
SELECT COUNT() FROM search LEFT JOIN foo
 ON s_docid = docid 
 WHERE bar = 1 AND search MATCH 'test*';

-- As fast as before (current workaround)
SELECT COUNT() FROM foo 
 WHERE bar = 1 
 AND s_docid IN (
   SELECT docid FROM search WHERE search MATCH 'test*'
 );



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query planner: Covering index not chosen over primary key

2019-05-03 Thread Jen Pollock
Thank you for the suggestion!

The actual schema & query are a good deal more complicated, and I'm not
looking for general optimization help with them right now.

Jen Pollock

On Fri, May 03, 2019 at 10:11:04PM +0100, Simon Slavin wrote:
> On 3 May 2019, at 9:34pm, Jen Pollock  wrote:
> 
> >  SELECT filename
> >  FROM images
> >JOIN embedded_files ON images.file_id == embedded_files.id
> >  WHERE type == 'png';
> 
> Try this:
> 
> CREATE INDEX images_type_file_id ON images(type, file_id);
> ANALYZE;
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query planner: Covering index not chosen over primary key

2019-05-03 Thread Simon Slavin
On 3 May 2019, at 9:34pm, Jen Pollock  wrote:

>  SELECT filename
>  FROM images
>JOIN embedded_files ON images.file_id == embedded_files.id
>  WHERE type == 'png';

Try this:

CREATE INDEX images_type_file_id ON images(type, file_id);
ANALYZE;
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query planner: Covering index not chosen over primary key

2019-05-03 Thread Richard Hipp
On 5/3/19, Jen Pollock  wrote:
> I assume the problem here is that the primary key is usually a weird
> thing to index. I can definitely work around this, but I thought it
> might be worth reporting as something that could perhaps be improved in
> the query planner.

Thank you.  I have your request.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query planner: Covering index not chosen over primary key

2019-05-03 Thread Jen Pollock
I have a database with a schema roughly like this:

  CREATE TABLE embedded_files(
id INTEGER PRIMARY KEY,
filename TEXT,
data BLOB
  );
  CREATE TABLE images(
id INTEGER PRIMARY KEY,
file_id INTEGER,
type TEXT,
FOREIGN KEY(file_id) REFERENCES embedded_files(id)
  );

The following query is slow:

  SELECT filename
  FROM images
JOIN embedded_files ON images.file_id == embedded_files.id
  WHERE type == 'png';

Part of the problem is that many of the values in embedded_files.data
are quite large. I tried to improve the query's performance by creating a
covering index:

  CREATE INDEX embedded_files_id_filename ON embedded_files(id, filename);

However, the query planner won't use this index unless I force it to
with INDEXED BY. Forcing it to use the index does speed up the query.
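
For completeness, a minimal sketch of the forced form being described, using 
the tables and the index defined above:

  SELECT filename
  FROM images
    JOIN embedded_files INDEXED BY embedded_files_id_filename
      ON images.file_id == embedded_files.id
  WHERE type == 'png';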

I assume the problem here is that the primary key is usually a weird
thing to index. I can definitely work around this, but I thought it
might be worth reporting as something that could perhaps be improved in
the query planner.

Jen Pollock

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query Regression IN and Virtual Tables - followup

2019-04-05 Thread Dan Kennedy


On 5/4/62 16:44, Hick Gunter wrote:

I patched my SQLite 3.24 code to include the fix from the ticket

<   if( rc==SQLITE_OK && (mBest = (pNew->prereq & ~mPrereq))!=0 ){
---
> // from SQLite bugfix
>  if( rc==SQLITE_OK && ((mBest = (pNew->prereq & ~mPrereq))!=0 || bIn) ){

and changed the xBestIndex return value to be lower if the equality constraint 
from IN is not usable

The generated code as reported is invalid (instruction 16 with the init of R6 
is not shown)



So, after applying the patch to 3.24 you executed the EXPLAIN statement 
shown below in the shell tool and it mysteriously omitted instruction 16 
from the output?


Are there any other problems? Does the SQL statement return the correct 
results if you execute it without the EXPLAIN?


Dan.






explain select lsn from atx_txlog where period_no between 7300 and 7313 and 
event_Type in (140001,180001);

addr  opcode         p1      p2    p3    p4             p5  comment
----  -------------  ------  ----  ----  -------------  --  -------------
0     Init           0       16    0                    00  Start at 16
1     VOpen          0       0     0     vtab:B90B5000
2     Explain        2       0     0     SCAN TABLE atx_txlog VIRTUAL TABLE INDEX 1:  00
3     Integer        7300    3     0                    00  r[3]=7300
4     Integer        7313    4     0                    00  r[4]=7313
5     Integer        1       1     0                    00  r[1]=1
6     Integer        2       2     0                    00  r[2]=2
7     VFilter        0       15    1                    00  iplan=r[1] zplan=''
8       Noop         0       0     0                    00  begin IN expr
9       VColumn      0       15    5                    00  r[5]=vcolumn(15); atx_txlog.event_type
10      Eq           5       12    6     (BINARY)       43  if r[6]==r[5] goto 12
11      Ne           5       14    7     (BINARY)       53  if r[7]!=r[5] goto 14; end IN expr
12      VColumn      0       21    8                    00  r[8]=vcolumn(21); atx_txlog.lsn
13      ResultRow    8       1     0                    00  output=r[8]
14    VNext          0       8     0                    00
15    Halt           0       0     0                    00
17    Integer        180001  7     0                    00  r[7]=180001
18    Goto           0       1     0                    00

-Ursprüngliche Nachricht-
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von Dan Kennedy
Gesendet: Freitag, 29. März 2019 14:30
An: sqlite-users@mailinglists.sqlite.org
Betreff: [EXTERNAL] Re: [sqlite] Query Regression IN and Virtual Tables


On 29/3/62 14:32, Hick Gunter wrote:

When upgrading from 3.7.14.1 to 3.24 I noticed the following problem

Given a virtual table like

CREATE VIRTUAL TABLE vt ( key1 INTEGER, key2 INTEGER, key3 INTEGER,
attr1 INTEGER,...);

whose xBestIndex function simulates (in unsupported syntax)

CREATE VIRTUAL INDEX vt_key ON vt (key1, key2, key3);

but also handles simple comparisons internally, the query

SELECT * FROM vt WHERE key1 = <value> AND key2 BETWEEN <low> AND
<high> AND attr1 IN (<list>);


Thanks for reporting this.

What is supposed to happen in this scenario is that xBestIndex() is invoked 
once with all 4 constraints marked as usable. The IN(...) is represented as an 
SQLITE_INDEX_CONSTRAINT_EQ constraint. If the
xBestIndex() implementation elects to use the IN(...) operator, then
xBestIndex() is invoked a second time, this time with the IN(...) marked as not 
usable. SQLite evaluates both plans, considering the cost estimates provided by 
the virtual table implementation and its own estimate of the cardinality of the 
IN(...) operator. And chooses the most efficient plan overall.

There was a bug preventing the second call to xBestIndex() from being made in 
some circumstances - including for your query. Now fixed here:

https://sqlite.org/src/info/f5752517f590b37b

So if you upgrade to trunk, or else apply the patch linked above to 3.27.2, and 
the virtual table implementation provides relatively accurate cost estimates, 
SQLite should make an intelligent decision about which plan to use.

Dan.



SQLite 3.7.14.1 asks for (key1,'='), (key2,'>='), (key2,'<=') and
xBestIndex accepts all 3 constraints yielding query plan

- materialize IN <list> as anonymous ephemeral table
- search vt using index 1 (key1 = ? AND key2 >= ? AND key2 <= ?)
- retrieve column attr1
- search anonymous ephemeral table

i.e. perform a single partial table scan on vt and check attr1


SQLite3.24 asks for ...,(attr1,'=') and xBestIndex accepts all 4
constraints yielding

- materialize IN (<list>) as anonymous ephemeral table
- scan anonymous ephemeral table
- search vt using index 1 (key1 = ? AND key2 >= ? AND key2 <= ? AND
attr1 = ?)

i.e. perform a partial table scan of vt FOR EACH attr1, which is
slower by the cardinality of the IN list

Fortunately, CTEs come to the rescue:

WITH attrs (attr1) AS (VALUES <(l

[sqlite] Query Regression IN and Virtual Tables - followup

2019-04-05 Thread Hick Gunter
I patched my SQLite 3.24 code to include the fix from the ticket

<   if( rc==SQLITE_OK && (mBest = (pNew->prereq & ~mPrereq))!=0 ){
---
> // from SQLite bugfix
>  if( rc==SQLITE_OK && ((mBest = (pNew->prereq & ~mPrereq))!=0 || bIn) ){

and changed the xBestIndex return value to be lower if the equality constraint 
from IN is not usable

The generated code as reported is invalid (instruction 16 with the init of R6 
is not shown)

explain select lsn from atx_txlog where period_no between 7300 and 7313 and 
event_Type in (140001,180001);

addr  opcode         p1      p2    p3    p4             p5  comment
----  -------------  ------  ----  ----  -------------  --  -------------
0     Init           0       16    0                    00  Start at 16
1     VOpen          0       0     0     vtab:B90B5000
2     Explain        2       0     0     SCAN TABLE atx_txlog VIRTUAL TABLE INDEX 1:  00
3     Integer        7300    3     0                    00  r[3]=7300
4     Integer        7313    4     0                    00  r[4]=7313
5     Integer        1       1     0                    00  r[1]=1
6     Integer        2       2     0                    00  r[2]=2
7     VFilter        0       15    1                    00  iplan=r[1] zplan=''
8       Noop         0       0     0                    00  begin IN expr
9       VColumn      0       15    5                    00  r[5]=vcolumn(15); atx_txlog.event_type
10      Eq           5       12    6     (BINARY)       43  if r[6]==r[5] goto 12
11      Ne           5       14    7     (BINARY)       53  if r[7]!=r[5] goto 14; end IN expr
12      VColumn      0       21    8                    00  r[8]=vcolumn(21); atx_txlog.lsn
13      ResultRow    8       1     0                    00  output=r[8]
14    VNext          0       8     0                    00
15    Halt           0       0     0                    00
17    Integer        180001  7     0                    00  r[7]=180001
18    Goto           0       1     0                    00

-Ursprüngliche Nachricht-
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von Dan Kennedy
Gesendet: Freitag, 29. März 2019 14:30
An: sqlite-users@mailinglists.sqlite.org
Betreff: [EXTERNAL] Re: [sqlite] Query Regression IN and Virtual Tables


On 29/3/62 14:32, Hick Gunter wrote:
> When upgrading from 3.7.14.1 to 3.24 I noticed the following problem
>
> Given a virtual table like
>
> CREATE VIRTUAL TABLE vt ( key1 INTEGER, key2 INTEGER, key3 INTEGER,
> attr1 INTEGER,...);
>
> whose xBestIndex function simulates (in unsupported syntax)
>
> CREATE VIRTUAL INDEX vt_key ON vt (key1, key2, key3);
>
> but also handles simple comparisons internally, the query
>
> SELECT * FROM vt WHERE key1 = <value> AND key2 BETWEEN <low> AND
> <high> AND attr1 IN (<list>);


Thanks for reporting this.

What is supposed to happen in this scenario is that xBestIndex() is invoked 
once with all 4 constraints marked as usable. The IN(...) is represented as an 
SQLITE_INDEX_CONSTRAINT_EQ constraint. If the
xBestIndex() implementation elects to use the IN(...) operator, then
xBestIndex() is invoked a second time, this time with the IN(...) marked as not 
usable. SQLite evaluates both plans, considering the cost estimates provided by 
the virtual table implementation and its own estimate of the cardinality of the 
IN(...) operator. And chooses the most efficient plan overall.

There was a bug preventing the second call to xBestIndex() from being made in 
some circumstances - including for your query. Now fixed here:

   https://sqlite.org/src/info/f5752517f590b37b

So if you upgrade to trunk, or else apply the patch linked above to 3.27.2, and 
the virtual table implementation provides relatively accurate cost estimates, 
SQLite should make an intelligent decision about which plan to use.

Dan.


>
> SQLite 3.7.14.1 asks for (key1,'='), (key2,'>='), (key2,'<=') and
> xBestIndex accepts all 3 constraints yielding query plan
>
> - materialize IN <list> as anonymous ephemeral table
> - search vt using index 1 (key1 = ? AND key2 >= ? AND key2 <= ?)
>- retrieve column attr1
>- search anonymous ephemeral table
>
> i.e. perform a single partial table scan on vt and check attr1
>
>
> SQLite3.24 asks for ...,(attr1,'=') and xBestIndex accepts all 4
> constraints yielding
>
> - materialize IN (<list>) as anonymous ephemeral table
> - scan anonymous ephemeral table
>- search vt using index 1 (key1 = ? AND key2 >= ? AND key2 <= ? AND
> attr1 = ?)
>
> i.e. perform a partial table scan of vt FOR EACH attr1, which is
> slower by the cardinality of the IN list
>
> Fortunately, CTEs come to the rescue:
>
> WITH attrs (attr1) AS (VALUES <(list)>) SELECT * FROM vt CROSS JOIN
> attrs a ON (a.attr1 = vt.attr1) WHERE key1 = <value> AND key2 BETWEEN
> <low> AND <high>
>
> This prevents SQLite 3.

Re: [sqlite] Query Regression IN and Virtual Tables

2019-03-29 Thread Dan Kennedy


On 29/3/62 14:32, Hick Gunter wrote:

When upgrading from 3.7.14.1 to 3.24 I noticed the following problem

Given a virtual table like

CREATE VIRTUAL TABLE vt ( key1 INTEGER, key2 INTEGER, key3 INTEGER, attr1 
INTEGER,...);

whose xBestIndex function simulates (in unsupported syntax)

CREATE VIRTUAL INDEX vt_key ON vt (key1, key2, key3);

but also handles simple comparisons internally, the query

SELECT * FROM vt WHERE key1 = <value> AND key2 BETWEEN <low> AND <high> AND attr1 IN 
(<list>);



Thanks for reporting this.

What is supposed to happen in this scenario is that xBestIndex() is 
invoked once with all 4 constraints marked as usable. The IN(...) is 
represented as an SQLITE_INDEX_CONSTRAINT_EQ constraint. If the 
xBestIndex() implementation elects to use the IN(...) operator, then 
xBestIndex() is invoked a second time, this time with the IN(...) marked 
as not usable. SQLite evaluates both plans, considering the cost 
estimates provided by the virtual table implementation and its own 
estimate of the cardinality of the IN(...) operator. And chooses the 
most efficient plan overall.


There was a bug preventing the second call to xBestIndex() from being 
made in some circumstances - including for your query. Now fixed here:


  https://sqlite.org/src/info/f5752517f590b37b

So if you upgrade to trunk, or else apply the patch linked above to 
3.27.2, and the virtual table implementation provides relatively 
accurate cost estimates, SQLite should make an intelligent decision 
about which plan to use.


Dan.




SQLite 3.7.14.1 asks for (key1,'='), (key2,'>='), (key2,'<=') and xBestIndex 
accepts all 3 constraints yielding query plan

- materialize IN <list> as anonymous ephemeral table
- search vt using index 1 (key1 = ? AND key2 >= ? AND key2 <= ?)
   - retrieve column attr1
   - search anonymous ephemeral table

i.e. perform a single partial table scan on vt and check attr1


SQLite3.24 asks for ...,(attr1,'=') and xBestIndex accepts all 4 constraints 
yielding

- materialize IN (<list>) as anonymous ephemeral table
- scan anonymous ephemeral table
   - search vt using index 1 (key1 = ? AND key2 >= ? AND key2 <= ? AND attr1 = 
?)

i.e. perform a partial table scan of vt FOR EACH attr1, which is slower by the 
cardinality of the IN list

Fortunately, CTEs come to the rescue:

WITH attrs (attr1) AS (VALUES <(list)>) SELECT * FROM vt CROSS JOIN attrs a ON (a.attr1 = 
vt.attr1) WHERE key1 = <value> AND key2 BETWEEN <low> AND <high>

This prevents SQLite 3.24 from adding the last constraint, yielding

- materialize IN (<(list)>) as epehemeral table attrs
- search vt using index 1 (key1 = ? AND key2 >= ? AND key2 <= ?)
   - retrieve column attr1
   - scan ephemeral table attrs

The only issue is that the previously generated ephemeral table was implemented 
as a covering index (allowing the sequence IfNull, Affinity, NotFound) whereas 
the named ephemeral table is implemented as a table (requiring a full table 
scan of the ephemeral table, even though at most 1 row can match)

Optimisation opportunity:

32    Rewind         1     40    0                  00
33      Column       1     0     10                 00  r[10]=events.event_type
34      VColumn      0     15    11                 00  r[11]=vcolumn(15); atx_txlog.event_type
35      Ne           11    38    10    (BINARY)     53  if r[10]!=r[11] goto 38
36      VColumn      0     6     12                 00  r[12]=vcolumn(6); atx_txlog.sync_offset
37      ResultRow    12    1     0                  00  output=r[12]
38    Next           1     33    0                  01

Could IMHO be rewritten as

32    VColumn        0     15    11                 00  r[11]=vcolumn(15); atx_txlog.event_type
33    Rewind         1     40    0                  00
34      Column       1     0     10                 00  r[10]=events.event_type
35      Ne           11    38    10    (BINARY)     53  if r[10]!=r[11] goto 38
36      VColumn      0     6     12                 00  r[12]=vcolumn(6); atx_txlog.sync_offset
37      ResultRow    12    1     0                  00  output=r[12]
38    Next           1     33    0                  01


___
  Gunter Hick | Software Engineer | Scientific Games International GmbH | 
Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 
1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query Regression IN and Virtual Tables

2019-03-29 Thread Hick Gunter
When upgrading from 3.7.14.1 to 3.24 I noticed the following problem

Given a virtual table like

CREATE VIRTUAL TABLE vt ( key1 INTEGER, key2 INTEGER, key3 INTEGER, attr1 
INTEGER,...);

whose xBestIndex function simulates (in unsupported syntax)

CREATE VIRTUAL INDEX vt_key ON vt (key1, key2, key3);

but also handles simple comparisons internally, the query

SELECT * FROM vt WHERE key1 = <value> AND key2 BETWEEN <low> AND <high> AND 
attr1 IN (<list>);

SQLite 3.7.14.1 asks for (key1,'='), (key2,'>='), (key2,'<=') and xBestIndex 
accepts all 3 constraints yielding query plan

- materialize IN <list> as anonymous ephemeral table
- search vt using index 1 (key1 = ? AND key2 >= ? AND key2 <= ?)
  - retrieve column attr1
  - search anonymous ephemeral table

i.e. perform a single partial table scan on vt and check attr1


SQLite3.24 asks for ...,(attr1,'=') and xBestIndex accepts all 4 constraints 
yielding

- materialize IN (<list>) as anonymous ephemeral table
- scan anonymous ephemeral table
  - search vt using index 1 (key1 = ? AND key2 >= ? AND key2 <= ? AND attr1 = ?)

i.e. perform a partial table scan of vt FOR EACH attr1, which is slower by the 
cardinality of the IN list

Fortunately, CTEs come to the rescue:

WITH attrs (attr1) AS (VALUES <(list)>) SELECT * FROM vt CROSS JOIN attrs a ON 
(a.attr1 = vt.attr1) WHERE key1 = <value> AND key2 BETWEEN <low> AND <high>

This prevents SQLite 3.24 from adding the last constraint, yielding

- materialize IN (<(list)>) as epehemeral table attrs
- search vt using index 1 (key1 = ? AND key2 >= ? AND key2 <= ?)
  - retrieve column attr1
  - scan ephemeral table attrs

The only issue is that the previously generated ephemeral table was implemented 
as a covering index (allowing the sequence IfNull, Affinity, NotFound) whereas 
the named ephemeral table is implemented as a table (requiring a full table 
scan of the ephemeral table, even though at most 1 row can match)

Optimisation opportunity:

32    Rewind         1     40    0                  00
33      Column       1     0     10                 00  r[10]=events.event_type
34      VColumn      0     15    11                 00  r[11]=vcolumn(15); atx_txlog.event_type
35      Ne           11    38    10    (BINARY)     53  if r[10]!=r[11] goto 38
36      VColumn      0     6     12                 00  r[12]=vcolumn(6); atx_txlog.sync_offset
37      ResultRow    12    1     0                  00  output=r[12]
38    Next           1     33    0                  01

Could IMHO be rewritten as

32  VColumn        0   15   11             00  r[11]=vcolumn(15); atx_txlog.event_type
33  Rewind         1   40   0              00
34    Column       1   0    10             00  r[10]=events.event_type
35    Ne           11  38   10  (BINARY)   53  if r[10]!=r[11] goto 38
36    VColumn      0   6    12             00  r[12]=vcolumn(6); atx_txlog.sync_offset
37    ResultRow    12  1    0              00  output=r[12]
38  Next           1   33   0              01


___
 Gunter Hick | Software Engineer | Scientific Games International GmbH | 
Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 
1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query planner: Scanning subqueries vs using automatic covering index

2019-03-18 Thread niklas
Thanks to all who have replied, very informative! :)

This is just a database for own personal use so it's not a big deal in any
way, mainly trying to get a better understanding of how Sqlite works here.

I'll note that the sql queries are not static inside my application but they
are generated dynamically from command line arguments. 
Basically any column can be added to the SELECT, WHERE and ORDER BY clauses
at runtime, and some columns will be generated from sub-queries (via JOIN:s)
as shown in first post.

As the correlated sub-queries will be executed twice if used in the WHERE
clause it seems that using JOIN:s is preferable for my use cases.

Some further experimentation shows that using LEFT JOIN for the subqueries
instead of INNER JOIN will always make the query planner do the "right"
thing, i.e. use automatic indexes instead of table scans. Regardless of
ANALYZE information being present or not. 

So that is maybe a better work-around than removing the ANALYZE tables. LEFT
JOIN and INNER JOIN will always return the same results in this case as all
books will (or should) have dates, genres and authors, and if they do not
then I probably want LEFT JOIN semantics anyway to better notice it. I
currently use INNER JOIN to give the query planner more freedom in selecting
query plans. 
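
For reference, this is roughly what the LEFT JOIN variant looks like
(abbreviated to just the author column, adapted from the query in my first
post):

SELECT BookID, Title, "Author(s)"
FROM Books
LEFT JOIN (SELECT BookID,
                  group_concat(ltrim("First Name"||' '||"Last Name"),', ') AS 'Author(s)'
           FROM Books JOIN AuthorBooks USING(BookID) JOIN Authors USING(AuthorID)
           GROUP BY BookID) USING(BookID)
ORDER BY BookID;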

Still would be interesting to know why Sqlite went for plain table scans in
the initial case. Seems that using automatic indexes will always be faster
(N*logN vs N*N), so why not always use them when possible? According to the
documentation Sqlite assumes N is a million without analyze information, and
in that case it opted to generate automatic indexes. In my case with ANALYZE
information present N will be around 3000, and then it opted for table
scans. The final query took over 24 minutes with all 3 sub-query columns
present when run to completion though, so obviously the wrong choice since
the loops ended up being nested three or more times.

(I understand that the query planner must take many different scenarios into
account and generate good plans for all of them, so this is most likely an
unfortunate edge case.)

Also noticed another case where Sqlite uses nested table scanning for JOIN:s
and this time it was not instead of automatic indexes, apparently it opted
for nested scans to avoid using a temp b-tree in the ORDER BY. (This is part
of co-routine for a window function using the AuthorID for partitions.)

Slow nested table scan (execution time measured in seconds):

|  |  |--SCAN TABLE Authors
|  |  |--SCAN TABLE DatesRead
|  |  |--SEARCH TABLE AuthorBooks USING PRIMARY KEY (BookID=? AND
AuthorID=?)
|  |  |--SEARCH SUBQUERY 1 ...
|  |  `--SEARCH SUBQUERY 2 ...

vs temp b-tree (execution time measured in milliseconds):

|  |  |--SCAN TABLE AuthorBooks
|  |  |--SEARCH TABLE Authors USING INTEGER PRIMARY KEY (rowid=?)
|  |  |--SEARCH TABLE DatesRead USING PRIMARY KEY (BookID=?)
|  |  |--SEARCH SUBQUERY 1 ...
|  |  |--SEARCH SUBQUERY 2 ...
|  |  `--USE TEMP B-TREE FOR ORDER BY

This is part of a larger query but I have not included all tables used in it
so just show parts that differ here, everything else in the two queries is
identical.
Dropping the ANALYZE information makes the query planner select the faster
alternative here as well.
I can provide more information about the query in case anyone is interested,
just included these parts now to illustrate the "problematic" nested scans.

(Not really that problematic, this just came up in a test that iterated over
all supported columns for all main queries, in actual use of the application
I would 
hardly run it, but still an interesting case I think.)




--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query planner: Scanning subqueries vs using automatic covering index

2019-03-17 Thread Keith Medcalf

On Sunday, 17 March, 2019 11:19, niklas  wrote:

>I agree that correlated subqueries in general seem more natural and
>are probably also less likely to have the performance pessimizations
>noticed with joins.

>But I might also want to use the column, or in case of a correlated
>subquery, the column alias, in the WHERE clause and previously that
>has not been supported as I recall. Maybe also not allowed by SQL
>specification itself?

>I modified the suggested query and used the column alias in WHERE now
>though, and it seemed to work!

Yes and no, and it "seemed to work" is an adequate description.  You have to 
realize that the list of things to get (the SELECT arguments) is executed only 
after the WHERE clause is executed.  That is to say, the semantics of the 
SELECT statement is:

SELECT some stuff
  FROM 
 WHERE 

so that except in the case where the alias in the select list is a simple alias 
for a column name, the computation will be made twice.  So for example if you 
did something like:

  select BookID,
 Title,
 (
   SELECT group_concat(ltrim("First Name" || ' ' || "Last Name"),',')
 FROM AuthorBooks JOIN Authors USING (AuthorID)
WHERE BookID == Books.BookID
 ) as "Author(s)",
 (
   SELECT group_concat("Date read",', ')
 FROM DatesRead
WHERE BookID == Books.BookID
 ) as "Date(s)",
 (
   SELECT group_concat(Genre,', ')
 FROM BookGenres JOIN Genres USING (GenreID)
WHERE BookID == Books.BookID
 ) AS "Genre(s)"
FROM Books
   WHERE "Author(s)" IN NOT NULL
 AND "Date(s)" IS NOT NULL
 AND "Genre(s)" IS NOT NULL
ORDER BY BookID;

then you are executing the correlated subqueries TWICE each, once for the WHERE 
clause and once for the SELECT clause.  If you want to ensure that those values 
are not null, and do not want to execute the correlates twice, you need to do 
something like this, which will execute the correlates only for the books that 
would not have null results in those three columns (without doing the duplicate 
group_concat).

But the inner join of the subqueries will still be more performant IF THE QUERY 
PLANNER USES AUTOMATIC INDEXES.

  select BookID,
 Title,
 (
   SELECT group_concat(ltrim("First Name" || ' ' || "Last Name"),',')
 FROM AuthorBooks JOIN Authors USING (AuthorID)
WHERE BookID == Books.BookID
 ) as "Author(s)",
 (
   SELECT group_concat("Date read",', ')
 FROM DatesRead
WHERE BookID == Books.BookID
 ) as "Date(s)",
 (
   SELECT group_concat(Genre,', ')
 FROM BookGenres JOIN Genres USING (GenreID)
WHERE BookID == Books.BookID
 ) AS "Genre(s)"
FROM Books
   WHERE BookID in (
   SELECT BookID FROM AuthorBooks JOIN Authors USING 
(AuthorID)
INTERSECT
   SELECT BookID FROM DatesRead
INTERSECT
   SELECT BookID FROM BookGenres JOIN Genres USING (GenreID)
   )
ORDER BY BookID;

>Is this a recent change in Sqlite or have I misunderstood something?
>The Sqlite documentation still does not seem to say that column aliases
>can be used in the WHERE clause at least.

Well, I think this was added somewhere along the way.  Remember they are 
ALIASES and the original text is substituted for them.

Of course, you could always just retrieve all the data and ignore the rows you 
do not want at the application level ...

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.




___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query planner: Scanning subqueries vs using automatic covering index

2019-03-17 Thread Simon Slavin
On 17 Mar 2019, at 5:19pm, niklas  wrote:

> Is this a recent change in Sqlite or have I misunderstood something? The 
> Sqlite documentation still does not seem to say that column aliases can be 
> used in the WHERE clause at least.

You are correct in two things.  Column aliases cannot be relied on inside the 
WHERE clause or any other clause.  You should imagine that they are used only 
when the SELECT statement is ready to return values.

And also that some implementations of SQL allow the 'AS' names to be used for 
other purposes, but this is not in the SQL specification.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query planner: Scanning subqueries vs using automatic covering index

2019-03-17 Thread niklas
I agree that correlated subqueries in general seem more natural and are
probably also less likely to have the performance pessimizations noticed
with joins.

But I might also want to use the column, or in case of a correlated
subquery, the column alias, in the WHERE clause and previously that has not
been supported as I recall. Maybe also not allowed by SQL specification
itself?

I modified the suggested query and used the column alias in WHERE now
though, and it seemed to work! 

Is this a recent change in Sqlite or have I misunderstood something? The
Sqlite documentation still does not seem to say that column aliases can be
used in the WHERE clause at least.



--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query planner: Scanning subqueries vs using automatic covering index

2019-03-16 Thread Simon Slavin
On 15 Mar 2019, at 7:02pm, niklas  wrote:

> The data used for sqlite_stat1 in create.txt is taken from the real data, 
> it's copied from the sql-dump generated just after running ANALYZE.

Okay.  I should have guessed that.  Sorry for doubting you.

You seem to have figured out a work-around for now.  Good luck with it.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query planner: Scanning subqueries vs using automatic covering index

2019-03-16 Thread niklas
The data used for sqlite_stat1 in create.txt is taken from the real data,
it's copied from the sql-dump generated just after running ANALYZE.

I only wanted to include the minimum amount of data the demonstrate the
issue so I omitted all other tables, views and data. As I understand it
sqlite only checks the statN table data when planning the queries and not
the actual data in the real tables.

I have dropped the sqlite_stat1 table for now and so disabled analyze
functionality. This way I avoid the slow queries. Running ANALYZE on the
database again will immediately make them return though.



--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query planner: Scanning subqueries vs using automatic covering index

2019-03-15 Thread Keith Medcalf

Of course, the correlated version returns all books, not just the ones that 
have at least one author, were read at least once, and have at least one genre. 
 

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.


>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of Keith Medcalf
>Sent: Friday, 15 March, 2019 14:44
>To: SQLite mailing list
>Subject: Re: [sqlite] Query planner: Scanning subqueries vs using
>automatic covering index
>
>
>The current tip of trunk produced the same results demonstrated by
>Niklas in his original post for his original testcase for me.
>
>I would have written the query as a simple query with correlated
>subqueries to get the concatenated data as it is a "more natural"
>declaration of what is wanted (in my opinion anyway), rather than re-
>writing into joins.
>
>  select BookID,
> Title,
> (
>   SELECT group_concat(ltrim("First Name" || ' ' || "Last
>Name"),',')
> FROM AuthorBooks JOIN Authors USING (AuthorID)
>WHERE BookID == Books.BookID
> ) as "Author(s)",
> (
>   SELECT group_concat("Date read",', ')
> FROM DatesRead
>WHERE BookID == Books.BookID
> ) as "Date(s)",
> (
>   SELECT group_concat(Genre,', ')
> FROM BookGenres JOIN Genres USING (GenreID)
>WHERE BookID == Books.BookID
> ) AS "Genre(s)"
>FROM Books
>ORDER BY BookID;
>
>---
>The fact that there's a Highway to Hell but only a Stairway to Heaven
>says a lot about anticipated traffic volume.
>
>>-Original Message-----
>>From: sqlite-users [mailto:sqlite-users-
>>boun...@mailinglists.sqlite.org] On Behalf Of niklas
>>Sent: Friday, 15 March, 2019 01:36
>>To: sqlite-users@mailinglists.sqlite.org
>>Subject: [sqlite] Query planner: Scanning subqueries vs using
>>automatic covering index
>>
>>I recently noticed some very slow queries for my sqlite book
>>database.
>>Turns out that it happened due to the query planner decided to scan
>>sub-queries instead of using an automatic covering index to search
>>them.
>>
>>The database contains about 3000 entries and with two subqueries it
>>took a noticable time to run, with three subqueries I had to give up
>>and
>>abort to save the CPU from overheating.
>>
>>Using indexes only take milliseconds in all cases.
>>
>>Dropping analyze (sqlite_stat1) information from the database made
>>the problem go away, i.e. made sqlite use index instead of scanning.
>>As did adding a few more columns to the ORDER BY part.
>>
>>These sql statements should be enough to demonstrate the problem.
>>I am using sqlite 3.25.1
>>
>>**
>>create.txt
>>**
>>
>>BEGIN;
>>
>>CREATE TABLE "Authors" (
>>"AuthorID" INTEGER PRIMARY KEY,
>>"Last Name" TEXT NOT NULL,
>>"First Name" TEXT NOT NULL);
>>
>>CREATE TABLE "Books" (
>>"BookID" INTEGER PRIMARY KEY,
>>"Title" TEXT NOT NULL);
>>
>>CREATE TABLE "Genres"(
>>"GenreID" INTEGER PRIMARY KEY,
>>"Genre" TEXT UNIQUE NOT NULL);
>>
>>CREATE TABLE "DatesRead"(
>>"BookID" INTEGER,
>>"Date Read" TEXT,
>>PRIMARY KEY("BookID", "Date Read"),
>>FOREIGN KEY(BookID) REFERENCES Books(BookID)) WITHOUT ROWID;
>>
>>CREATE TABLE "AuthorBooks"(
>>"BookID" INTEGER,
>>"AuthorID" INTEGER,
>>PRIMARY KEY("BookID", "AuthorID" ),
>>FOREIGN KEY(BookID) REFERENCES Books(BookID),
>>FOREIGN KEY(AuthorID) REFERENCES Authors(AuthorID)) WITHOUT
>>ROWID;
>>
>>CREATE TABLE "BookGenres"(
>>"BookID" INTEGER,
>>"GenreID" INTEGER,
>>PRIMARY KEY("BookID", "GenreID" ),
>>FOREIGN KEY(BookID) REFERENCES Books(BookID),
>>FOREIGN KEY(GenreID) REFERENCES Genres(GenreID)) WITHOUT ROWID;
>>
>>ANALYZE;
>>INSERT INTO sqlite_stat1 VALUES('BookGenres','BookGenres','3190 2
>>1');
>>INSERT INTO sqlite_stat1 VALUES('DatesRead','DatesRead','3047 2 1');
>>INSERT INTO sqlite_stat1 VALUES('AuthorBooks','AuthorBooks','3549 2
>>1');
>>INSERT INTO sqlite_stat1 VA

Re: [sqlite] Query planner: Scanning subqueries vs using automatic covering index

2019-03-15 Thread Keith Medcalf

The current tip of trunk produced the same results demonstrated by Niklas in 
his original post for his original testcase for me.

I would have written the query as a simple query with correlated subqueries to 
get the concatenated data as it is a "more natural" declaration of what is 
wanted (in my opinion anyway), rather than re-writing into joins.

  select BookID,
 Title,
 (
   SELECT group_concat(ltrim("First Name" || ' ' || "Last Name"),',')
 FROM AuthorBooks JOIN Authors USING (AuthorID)
WHERE BookID == Books.BookID
 ) as "Author(s)",
 (
   SELECT group_concat("Date read",', ')
 FROM DatesRead
WHERE BookID == Books.BookID
 ) as "Date(s)",
 (
   SELECT group_concat(Genre,', ')
 FROM BookGenres JOIN Genres USING (GenreID)
WHERE BookID == Books.BookID
 ) AS "Genre(s)"
FROM Books
ORDER BY BookID;

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of niklas
>Sent: Friday, 15 March, 2019 01:36
>To: sqlite-users@mailinglists.sqlite.org
>Subject: [sqlite] Query planner: Scanning subqueries vs using
>automatic covering index
>
>I recently noticed some very slow queries for my sqlite book
>database.
>Turns out that it happened due to the query planner decided to scan
>sub-queries instead of using an automatic covering index to search
>them.
>
>The database contains about 3000 entries and with two subqueries it
>took a noticable time to run, with three subqueries I had to give up
>and
>abort to save the CPU from overheating.
>
>Using indexes only take milliseconds in all cases.
>
>Dropping analyze (sqlite_stat1) information from the database made
>the problem go away, i.e. made sqlite use index instead of scanning.
>As did adding a few more columns to the ORDER BY part.
>
>These sql statements should be enough to demonstrate the problem.
>I am using sqlite 3.25.1
>
>**
>create.txt
>**
>
>BEGIN;
>
>CREATE TABLE "Authors" (
>"AuthorID" INTEGER PRIMARY KEY,
>"Last Name" TEXT NOT NULL,
>"First Name" TEXT NOT NULL);
>
>CREATE TABLE "Books" (
>"BookID" INTEGER PRIMARY KEY,
>"Title" TEXT NOT NULL);
>
>CREATE TABLE "Genres"(
>"GenreID" INTEGER PRIMARY KEY,
>"Genre" TEXT UNIQUE NOT NULL);
>
>CREATE TABLE "DatesRead"(
>"BookID" INTEGER,
>"Date Read" TEXT,
>PRIMARY KEY("BookID", "Date Read"),
>FOREIGN KEY(BookID) REFERENCES Books(BookID)) WITHOUT ROWID;
>
>CREATE TABLE "AuthorBooks"(
>"BookID" INTEGER,
>"AuthorID" INTEGER,
>PRIMARY KEY("BookID", "AuthorID" ),
>FOREIGN KEY(BookID) REFERENCES Books(BookID),
>FOREIGN KEY(AuthorID) REFERENCES Authors(AuthorID)) WITHOUT
>ROWID;
>
>CREATE TABLE "BookGenres"(
>"BookID" INTEGER,
>"GenreID" INTEGER,
>PRIMARY KEY("BookID", "GenreID" ),
>FOREIGN KEY(BookID) REFERENCES Books(BookID),
>FOREIGN KEY(GenreID) REFERENCES Genres(GenreID)) WITHOUT ROWID;
>
>ANALYZE;
>INSERT INTO sqlite_stat1 VALUES('BookGenres','BookGenres','3190 2
>1');
>INSERT INTO sqlite_stat1 VALUES('DatesRead','DatesRead','3047 2 1');
>INSERT INTO sqlite_stat1 VALUES('AuthorBooks','AuthorBooks','3549 2
>1');
>INSERT INTO sqlite_stat1 VALUES('Authors',NULL,'1329');
>INSERT INTO sqlite_stat1 VALUES('Books',NULL,'2978');
>INSERT INTO sqlite_stat1
>VALUES('Genres','sqlite_autoindex_Genres_1','112
>1');
>
>COMMIT;
>
>
>queries.txt
>
>
>select "";
>select "order by bookid only";
>select "";
>
>EXPLAIN QUERY PLAN
>SELECT BookID,Title,"Author(s)","Date(s)","Genre(s)"
>FROM Books
>JOIN (SELECT BookID, group_concat(ltrim("First Name"||' '||"Last
>Name"),',
>') AS 'Author(s)' FROM Books JOIN AuthorBooks USING(BookID) JOIN
>Authors
>USING(AuthorID) GROUP BY BookID) USING(BookID)
>JOIN (SELECT BookID, group_concat("Date read",', ') AS 'Date(s)' FROM
>Books
>JOIN DatesRead USING(BookID) GROUP BY BookID) USING(BookID)
>JOIN (SELECT BookID, group_concat(Genre,', ') AS 'Genre(s)' FROM
>Books JOIN
>BookGenres USING(BookID) JOIN

Re: [sqlite] Query planner: Scanning subqueries vs using automatic covering index

2019-03-15 Thread Simon Slavin
On 15 Mar 2019, at 7:35am, niklas  wrote:

> Dropping analyze (sqlite_stat1) information from the database made 
> the problem go away, i.e. made sqlite use index instead of scanning. 

Instead of dropping ANALYZE information, did you try updating it instead, by 
running ANALYZE again ?

Ah ...

> ANALYZE;
> INSERT INTO sqlite_stat1 VALUES('BookGenres','BookGenres','3190 2 1');

How do we know your values for sqlite_stat1 are right ?  Why aren't you just 
leaving ANALYZE to do its job ?

Also, you are missing out on part of what ANALYZE does.  As well as looking at 
your schema, ANALYZE also considers the 'chunkiness' of the data in each 
indexed column.  Run it with data in the table, with the data as convincing as 
possible in terms of number of rows and the values in each column.

In your create.txt I would insert data in your tables, then put ANALYZE just 
before COMMIT.
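
I.e. roughly this shape for create.txt (schema and row data elided; only the
ordering of ANALYZE relative to the data matters):

BEGIN;
-- CREATE TABLE statements ...
-- INSERT the real rows (or a representative sample) ...
ANALYZE;
COMMIT;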

Curious to know if this improves matters for you.  I'm not certain.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query planner: Scanning subqueries vs using automatic covering index

2019-03-15 Thread niklas
I recently noticed some very slow queries for my sqlite book database. 
Turns out that it happened due to the query planner decided to scan 
sub-queries instead of using an automatic covering index to search them. 

The database contains about 3000 entries and with two subqueries it 
took a noticeable time to run, with three subqueries I had to give up and 
abort to save the CPU from overheating. 

Using indexes only take milliseconds in all cases.

Dropping analyze (sqlite_stat1) information from the database made 
the problem go away, i.e. made sqlite use index instead of scanning. 
As did adding a few more columns to the ORDER BY part.

These sql statements should be enough to demonstrate the problem. 
I am using sqlite 3.25.1

**
create.txt
**

BEGIN;

CREATE TABLE "Authors" (
"AuthorID" INTEGER PRIMARY KEY,
"Last Name" TEXT NOT NULL,
"First Name" TEXT NOT NULL);

CREATE TABLE "Books" (
"BookID" INTEGER PRIMARY KEY,
"Title" TEXT NOT NULL);
 
CREATE TABLE "Genres"(
"GenreID" INTEGER PRIMARY KEY,
"Genre" TEXT UNIQUE NOT NULL);

CREATE TABLE "DatesRead"(
"BookID" INTEGER,
"Date Read" TEXT,
PRIMARY KEY("BookID", "Date Read"),
FOREIGN KEY(BookID) REFERENCES Books(BookID)) WITHOUT ROWID;

CREATE TABLE "AuthorBooks"(
"BookID" INTEGER,
"AuthorID" INTEGER,
PRIMARY KEY("BookID", "AuthorID" ),
FOREIGN KEY(BookID) REFERENCES Books(BookID),
FOREIGN KEY(AuthorID) REFERENCES Authors(AuthorID)) WITHOUT ROWID;

CREATE TABLE "BookGenres"(
"BookID" INTEGER,
"GenreID" INTEGER,  
PRIMARY KEY("BookID", "GenreID" ),
FOREIGN KEY(BookID) REFERENCES Books(BookID),
FOREIGN KEY(GenreID) REFERENCES Genres(GenreID)) WITHOUT ROWID;

ANALYZE;
INSERT INTO sqlite_stat1 VALUES('BookGenres','BookGenres','3190 2 1');
INSERT INTO sqlite_stat1 VALUES('DatesRead','DatesRead','3047 2 1');
INSERT INTO sqlite_stat1 VALUES('AuthorBooks','AuthorBooks','3549 2 1');
INSERT INTO sqlite_stat1 VALUES('Authors',NULL,'1329');
INSERT INTO sqlite_stat1 VALUES('Books',NULL,'2978');
INSERT INTO sqlite_stat1 VALUES('Genres','sqlite_autoindex_Genres_1','112
1');

COMMIT;


queries.txt


select "";
select "order by bookid only";
select "";

EXPLAIN QUERY PLAN
SELECT BookID,Title,"Author(s)","Date(s)","Genre(s)"
FROM Books
JOIN (SELECT BookID, group_concat(ltrim("First Name"||' '||"Last Name"),',
') AS 'Author(s)' FROM Books JOIN AuthorBooks USING(BookID) JOIN Authors
USING(AuthorID) GROUP BY BookID) USING(BookID)
JOIN (SELECT BookID, group_concat("Date read",', ') AS 'Date(s)' FROM Books
JOIN DatesRead USING(BookID) GROUP BY BookID) USING(BookID)
JOIN (SELECT BookID, group_concat(Genre,', ') AS 'Genre(s)' FROM Books JOIN
BookGenres USING(BookID) JOIN Genres USING(GenreID) GROUP BY BookID)
USING(BookID)
ORDER BY BookID;

select "---";
select "order by bookid and some other fields too";
select "--";

EXPLAIN QUERY PLAN
SELECT BookID,Title,"Author(s)","Date(s)","Genre(s)"
FROM Books
JOIN (SELECT BookID, group_concat(ltrim("First Name"||' '||"Last Name"),',
') AS 'Author(s)' FROM Books JOIN AuthorBooks USING(BookID) JOIN Authors
USING(AuthorID) GROUP BY BookID) USING(BookID)
JOIN (SELECT BookID, group_concat("Date read",', ') AS 'Date(s)' FROM Books
JOIN DatesRead USING(BookID) GROUP BY BookID) USING(BookID)
JOIN (SELECT BookID, group_concat(Genre,', ') AS 'Genre(s)' FROM Books JOIN
BookGenres USING(BookID) JOIN Genres USING(GenreID) GROUP BY BookID)
USING(BookID)
ORDER BY BookID, Title, "Author(s)", "Date(s)";



Test run:


del test.db
type create.txt | sqlite3 test.db
type queries.txt | sqlite3 test.db



Output:



order by bookid only

QUERY PLAN
|--MATERIALIZE 1
|  |--SCAN TABLE Books
|  |--SEARCH TABLE AuthorBooks USING PRIMARY KEY (BookID=?)
|  `--SEARCH TABLE Authors USING INTEGER PRIMARY KEY (rowid=?)
|--MATERIALIZE 2
|  |--SCAN TABLE Books
|  `--SEARCH TABLE DatesRead USING PRIMARY KEY (BookID=?)
|--MATERIALIZE 3
|  |--SCAN TABLE Books
|  |--SEARCH TABLE BookGenres USING PRIMARY KEY (BookID=?)
|  `--SEARCH TABLE Genres USING INTEGER PRIMARY KEY (rowid=?)
|--SCAN SUBQUERY 1
|--SCAN SUBQUERY 2
|--SCAN SUBQUERY 3
|--SEARCH TABLE Books USING INTEGER PRIMARY KEY (rowid=?)
`--USE TEMP B-TREE FOR ORDER BY
---
order by bookid and some other fields too
--
QUERY PLAN
|--MATERIALIZE 1
|  |--SCAN TABLE Books
|  |--SEARCH TABLE AuthorBooks USING PRIMARY KEY (BookID=?)
|  `--SEARCH TABLE Authors USING INTEGER PRIMARY KEY (rowid=?)
|--MATERIALIZE 2
|  |--SCAN TABLE Books
|  `--SEARCH TABLE DatesRead USING PRIMARY KEY (BookID=?)
|--MATERIALIZE 3
|  |--SCAN TABLE Books
|  |--SEARCH TABLE BookGenres USING PRIMARY KEY (BookID=?)
|  `--SEARCH TABLE Genres USING INTEGER 

Re: [sqlite] How to do I get an 'AND' condition to work in an SQLite query?

2019-01-31 Thread Simon Slavin
On 31 Jan 2019, at 6:21pm, Scott  wrote:

> Figured it out! I had set the column Deleted to "CHAR" but all the fields 
> without 'X' were null. If I replaced null with a valid character it worked.
> Thanks for your time.

Ah, JOINing on NULL.  Well done.

For future reference, SQLite doesn't have a CHAR type.  For clarity you might 
want to use TEXT instead.  SQLite does assume TEXT when you specify CHAR, but 
you might be depending on a specifically CHAR behaviour, like truncation to one 
character, and SQLite will ignore it.
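
(For the archives: a comparison against NULL, such as "n.Deleted <> 'X'",
evaluates to NULL rather than true, so those rows drop out of the result.  If
you ever want to keep the NULL rows without back-filling a placeholder value,
two sketches against the same hypothetical Notes join:

WHERE t.Topic = 'Manuscript Copies' AND IFNULL(n.Deleted, '') <> 'X'

WHERE t.Topic = 'Manuscript Copies' AND n.Deleted IS NOT 'X'

IFNULL() treats NULL as an empty string for the comparison, and IS NOT is
SQLite's NULL-safe inequality.)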

Well done for solving your problem.  The magic of posting.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to do I get an 'AND' condition to work in an SQLite query?

2019-01-31 Thread Scott
Figured it out! I had set the column Deleted to "CHAR" but all the fields 
without 'X' were null. If I replaced null with a valid character it worked.
Thanks for your time.
Scott Vallery
Ecclesiastes 4:9-10 

On Thursday, January 31, 2019, 12:46:34 PM EST, Scott 
 wrote:  
 
 I can return results successfully from the t.Topic and n.Deleted columns 
separately, but when I try to use AND I receive no results. I'm not sure what I 
may be doing wrong. This is my first exhaustive work with a database project, 
so I've had to learn some syntax along the way, but this has me stumped.
SELECT n.NoteID, s.SourceType, s.Title, c.Summary FROM Comment as c
LEFT JOIN Notes as n ON n.CommentID = c.CommentID

LEFT JOIN Source as s ON n.SourceID = s.SourceID

LEFT JOIN Topic as t ON n.TopicID = t.TopicID

WHERE (t.Topic = 'Manuscript Copies') AND (n.Deleted <> 'X')



I've tried and even with INNER JOIN:
(WHERE t.Topic = 'Manuscript Copies') AND n.Deleted <> 'X')
WHERE (t.Topic = 'Manuscript Copies') AND (n.Deleted <> 'X')
WHERE t.Topic = 'Manuscript Copies' AND n.Deleted <> 'X'
Thanks,
Scott Vallery
Ecclesiastes 4:9-10  
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to do I get an 'AND' condition to work in an SQLite query?

2019-01-31 Thread Simon Slavin
On 31 Jan 2019, at 5:46pm, Scott  wrote:

> I can return results successfully from the t.Topic and n.Deleted columns 
> separately, but when I try to use AND I receive no results.

There was an optimization bug that looked like your example in some previous 
version of SQLite.  Are you running an up-to-date version of SQLite ?

Are you doing this in your own code or in the SQLite CLI tool ?  If possible, 
please test in the CLI tool.

Please execute "PRAGMA schema.integrity_check" just in case your database is 
corrupt.

Try removing everything that doesn't cause the problem.  So remove the JOIN to 
the Source table.

Swap the order of the conditions around the "AND".

Swap the order of the JOIN clauses.

Some combination of the above should tell you more about the problem.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] How to do I get an 'AND' condition to work in an SQLite query?

2019-01-31 Thread Scott
I can return results successfully from the t.Topic and n.Deleted columns 
separately, but when I try to use AND I receive no results. I'm not sure what I 
may be doing wrong. This is my first exhaustive work with a database project, 
so I've had to learn some syntax along the way, but this has me stumped.
SELECT n.NoteID, s.SourceType, s.Title, c.Summary FROM Comment as c
LEFT JOIN Notes as n ON n.CommentID = c.CommentID

LEFT JOIN Source as s ON n.SourceID = s.SourceID

LEFT JOIN Topic as t ON n.TopicID = t.TopicID

WHERE (t.Topic = 'Manuscript Copies') AND (n.Deleted <> 'X')



I've tried and even with INNER JOIN:
(WHERE t.Topic = 'Manuscript Copies') AND n.Deleted <> 'X')
WHERE (t.Topic = 'Manuscript Copies') AND (n.Deleted <> 'X')
WHERE t.Topic = 'Manuscript Copies' AND n.Deleted <> 'X'
Thanks,
Scott Vallery
Ecclesiastes 4:9-10
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query Planning Knowledge

2019-01-23 Thread Clemens Ladisch
Andy Bennett wrote:
>> foreign key constraints
>
> my experience with other engines taught me that it makes experimenting at the 
> monitor harder.

Then don't use them. :)  But do you actually want 'wrong' data?

> Are there any efficiency benefits or is it just there to enforce data 
> integrity?

Constraints are just additional checks.
(FKs require certain indexes, but you would want to have those anyway.)

> It looks like they have to be enabled on a per connection basis. In this case 
> I (currently)
> control all the client code but is it possible for the foreign key 
> relationships to get out
> of sync if one of the connections omits to apply the pragma?

Yes.  You could run PRAGMA foreign_key_check afterwards.
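
A minimal sketch of both together (nothing here beyond the two pragmas):

PRAGMA foreign_keys = ON;     -- must be issued on each connection that writes
-- ... inserts / updates / deletes ...
PRAGMA foreign_key_check;     -- one row per violating child row; empty means consistent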


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query Planning Knowledge

2019-01-23 Thread Andy Bennett

Hi,


I could use the inner join for the "entrys" join and the "items" join
but not the "entry-items" join because each entry can have more than
one item.


  WITH a(id, name) AS (VALUES (1, 'A')),
   b(id, name) AS (VALUES (1, 'B1'), (1, 'B2'))
  SELECT * FROM a INNER JOIN b USING (id);

  1|A|B1
  1|A|B2

The only difference between inner and outer joins is how rows without
any match are handled.


Thanks Clemens! In my case entries can have zero items as well and I still 
want the entry itself to show up.




I started with an OUTER JOIN as I find it easier to show that it's
doing the correct thing because I can search the output for errant
NULLs. Trying to detect missing rows in an INNER JOIN is harder.


If the join columns have the same name, using USING is easier.

And it would be a good idea to enforce the relationships between the
tables with foreign key constraints: 


(However, constraints do not affect how you have to write your queries.)


Ah yes. It might be worth looking at this. I've always avoided it in the 
past because my experience with other engines taught me that it makes 
experimenting at the monitor harder. Are there any efficiency benefits or 
is it just there to enforce data integrity (very important, of course;-))?


It looks like they have to be enabled on a per connection basis. In this 
case I (currently) control all the client code but is it possible for the 
foreign key relationships to get out of sync if one of the connections 
omits to apply the pragma?



Thanks for the tips!




Regards,
@ndy

--
andy...@ashurst.eu.org
http://www.ashurst.eu.org/
0x7EBA75FF
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query Planning Knowledge

2019-01-22 Thread Clemens Ladisch
Andy Bennett wrote:
> I could use the inner join for the "entrys" join and the "items" join
> but not the "entry-items" join because each entry can have more than
> one item.

  WITH a(id, name) AS (VALUES (1, 'A')),
   b(id, name) AS (VALUES (1, 'B1'), (1, 'B2'))
  SELECT * FROM a INNER JOIN b USING (id);

  1|A|B1
  1|A|B2

The only difference between inner and outer joins is how rows without
any match are handled.

> I started with an OUTER JOIN as I find it easier to show that it's
> doing the correct thing because I can search the output for errant
> NULLs. Trying to detect missing rows in an INNER JOIN is harder.

If the join columns have the same name, using USING is easier.

And it would be a good idea to enforce the relationships between the
tables with foreign key constraints: 
(However, constraints do not affect how you have to write your queries.)


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query Planning Knowledge

2019-01-22 Thread Andy Bennett

Hi David,

Thanks for your thoughtful reply.



Can't go into as much detail as you. But a couple comments.

"primary key unique" is redundant, and will actually create a 
redundant unique index.


Are you refering to the CREATE TABLE clauses for the "items" and 
"registers" tables?



I appear to have indexes named "sqlite_autoindex_items_1", 
"sqlite_autoindex_items_2" (for the blob column) and 
"sqlite_autoindex_registers_1".


If I run

CREATE TABLE "items2" ("item-id" INTEGER PRIMARY KEY  NOT NULL   , "blob" 
BLOB NOT NULL  UNIQUE )


...then I just get "sqlite_autoindex_items2_1" for the blob column.


Thanks for that: it's a good find! I'll do the same for the "registers" 
table as well.




"Do later version of SQLite manage to spot the constant 
propagation opportunity?"
This requires way more algebra and proof than I think you 
realize. "entrys" is on the right side of a LEFT OUTER JOIN, and 
therefore may be null when it comes time to do the next OUTER 
JOIN to "entry-items", so a direct replacement of an "equals" 
constraint isn't possible. And after that it again starts 
becoming algebra as to whether that null can affect things etc.


Yes, you are right. I could use the inner join for the "entrys" join and 
the "items" join but not the "entry-items" join because each entry can have 
more than one item.


Changing the "entrys" join to an INNER JOIN and adding both ORDER BY 
clauses gets me a much better query plan:


-
1|0|0|SEARCH TABLE entrys USING COVERING INDEX 
entrys-log-id-region-key-entry-number (log-id=? AND region=?)
0|0|1|SEARCH TABLE entrys USING INDEX entrys-log-id-region-key-entry-number 
(log-id=? AND region=?)
0|1|0|SEARCH SUBQUERY 1 AS specific-entrys USING AUTOMATIC COVERING INDEX 
(key=?)
0|2|2|SEARCH TABLE entry-items USING COVERING INDEX 
entry-items-log-id-entry-number-item-id (log-id=? AND entry-number=?)

0|3|3|SEARCH TABLE items USING INTEGER PRIMARY KEY (rowid=?)
-

Changing the "items" join to an INNER JOIN doesn't get me any more but I'll 
do it anyway as the OUTER JOIN is not strictly needed.



Thanks for that!

I started with an OUTER JOIN as I find it easier to show that it's doing 
the correct thing because I can search the output for errant NULLs. Trying 
to detect missing rows in an INNER JOIN is harder. I then failed to realise 
that an INNER JOIN would suffice.


It now even does the constant propagation of the log-id at (** 5 **) but 
not the "entry-number" one!




For the ordering, I recommend seeing if you can replace one of 
those "entrys" indexes so that they start with "key" as the 
first field in the index. That would at least give it more 
opportunity to use that index for the ordering rather than 
needing to order things after they're all collected. That and 
explicitly stating the order by in _both_ the sub select and the 
final might make it notice "I can use this _ordered_ sub select 
as the outer table in the joins and get my overall ordering that 
way"... or it may not. Worth a try though.


Yeah. I've had a fiddle around and can't get it to do it. However, 
converting to an INNER JOIN as above seems to get it to work without 
changing the indexes tho'.


In the schema I have,

CREATE UNIQUE INDEX "entrys-log-id-region-key-entry-number" ON "entrys" ( 
"log-id" ASC, "region" ASC, "key" ASC, "entry-number" ASC)


...which seems to get used for the sub-query. It gives it its ordering and 
the "key" part also gets used when I add the extra WHERE constraint on 
"key" (** 2**) so it seems to be having the effect you mention.


If I add an ORDER BY clause to the sub-select (** 3 **) the query plan 
doesn't change so I guess it's happy to use that index to get the ordering.


If I then also add an ORDER BY to the main select it still uses a temporary 
b-tree to confirm the sort. This is the main source of my confusion because 
the query plan shows that the joins are all executed as inner loops so the 
ordering of the outer query should be preserved.




Thanks for your help David and thanks for solving the problem!




Regards,
@ndy

--
andy...@ashurst.eu.org
http://www.ashurst.eu.org/
0x7EBA75FF
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query Planning Knowledge

2019-01-22 Thread Keith Medcalf

Your description of the schema does not match the schema that you have shown.  

(As an aside, I wonder why you are using such confusing table and column names 
that require the puking of quotes all over the place?  You can avoid such 
ugliness by searching and replacing all "-" with nothingness (do not embed 
unlawful characters in object names) and replacing the puking quotes with 
nothingness as they will no longer be required since you will no longer using 
unlawful characters in object names.  Reformatting the queries and table 
definitions to readable format is trivial, but you might want to do it yourself 
so you can see what you are doing at a cursory glance.  There are special 
places where "obfuscation" is treasured, production code is generally not one 
of them -- especially if someone other than yourself will be exposed to it or 
perhaps have to maintain it sometime in the future.)


Getting on with the business at hand, however:

ONE

You claim the entry-numbers are monotonically increasing.  Are they 
monotonically increasing ONLY WITHIN something else, or is the entire set 
monotonically increasing?  Your schema indicates that the entry-numbers are 
monotonically increasing only in combination with the logid (that is there are 
duplicate entry-numbers within each log) and that therefore the entrynumber is 
not actually an entrynumber but is something else entirely like an overloaded 
timestamp or such.

Please specify whether or not entry-numbers is a candidate key for your entrys 
table -- your prose indicate that it is yet your schema says it is not.

Which one is the correct?


TWO

Mutatis mutandis the itemid in the items and itemdigests table wherein the 
declaration of itemid is inconsistent.  Does it identify an item or does it not?

Is it a candidate key within the itemdigests table or not?

Please explain.


THREE 

What is the love of LEFT OUTER JOIN and why are you using it?  I know you may 
have heard of an outer join somewhere along the way, but can you please explain 
why you think the use of it is necessary in this case?


---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of Andy Bennett
>Sent: Tuesday, 22 January, 2019 06:09
>To: sqlite-users@mailinglists.sqlite.org
>Subject: [sqlite] Query Planning Knowledge
>
>Hi,
>
>I'm having some problems understanding what the query planner is
>doing and
>how to convince it to work in the way I want.
>
>Sorry that this is such a long and involved eMail. I've tried to
>describe
>my problem and show the steps I have taken in debugging it. I have
>made
>some effort to present a minimal and basic illustration of the
>problem.
>Either I'm in error and glad to be assisted, or there's some
>opportunities
>that the query planner is not taking. Hopefully my detail helps
>someone who
>knows more than me get to the bottom of the issue!
>
>
>My GUI montior is reporting SQLite 3.8.11.1 and my app is using
>3.17.0. I
>get the same result with both.
>
>I've included more complete details of my schema at the bottom of
>this
>message but only introduced the bits I'm talking about as I go
>through.
>
>
>I'm aiming to write a query that's good for feeding an iterator and
>for
>pagination. i.e. I don't want a query that needs to compute the whole
>result up-front, before returning the first row. I want something
>where I
>can execute the query, consume a few rows for a while, close it and
>then
>later start a new query with a parameter based on how far I got
>through and
>then continue where I left off.
>
>The structure of my dataset is such that once rows are written to the
>database they are not changed, so there exist a set of parameters
>where I
>can do this.
>
>
>The data is an append only log. The log consists of entries (in the
>entrys
>table) and they have monotonically increasing "entry-number"s. The
>table
>can store more than one log so there's a "log-id" as well. These
>entries
>have some data of their own and each point to zero or more items
>which are
>stored in the "items" table. Thus there's a join table "entry-items"
>which
>consists of 3 columns, all parts of the primary key, "log-d",
>"entry-number" and "item-id".
>
>Each entry has a "key". As entries are added to the log, the "latest"
>entry
>for a particular key describes the current state of that key. The set
>of
>keys and their current state is called "records" and they only exist
>as a
>query on the tables. It is this query that I'm trying to plan well.
>
>

Re: [sqlite] Query Planning Knowledge

2019-01-22 Thread David Raymond
Can't go into as much detail as you. But a couple comments.

"primary key unique" is redundant, and will actually create a redundant unique 
index.

"Do later version of SQLite manage to spot the constant propagation 
opportunity?"
This requires way more algebra and proof than I think you realize. "entrys" is 
on the right side of a LEFT OUTER JOIN, and therefore may be null when it comes 
time to do the next OUTER JOIN to "entry-items", so a direct replacement of an 
"equals" constraint isn't possible. And after that it again starts becoming 
algebra as to whether that null can affect things etc.

For the ordering, I recommend seeing if you can replace one of those "entrys" 
indexes so that they start with "key" as the first field in the index. That 
would at least give it more opportunity to use that index for the ordering 
rather than needing to order things after they're all collected. That and 
explicitly stating the order by in _both_ the sub select and the final might 
make it notice "I can use this _ordered_ sub select as the outer table in the 
joins and get my overall ordering that way"... or it may not. Worth a try 
though.
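
As a concrete (untested) sketch of that suggestion, using the column names from
the original post, add something like

CREATE INDEX "entrys-key-first"
    ON "entrys" ("key", "log-id", "region", "entry-number");

and then put ORDER BY "key" on both the inner GROUP BY subquery and the outer
SELECT, so the planner has a chance to satisfy the ordering from the index
instead of a temp b-tree.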



-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Andy Bennett
Sent: Tuesday, January 22, 2019 8:09 AM
To: sqlite-users@mailinglists.sqlite.org
Subject: [sqlite] Query Planning Knowledge

Hi,

I'm having some problems understanding what the query planner is doing and 
how to convince it to work in the way I want.

Sorry that this is such a long and involved eMail. I've tried to describe 
my problem and show the steps I have taken in debugging it. I have made 
some effort to present a minimal and basic illustration of the problem. 
Either I'm in error and glad to be assisted, or there's some opportunities 
that the query planner is not taking. Hopefully my detail helps someone who 
knows more than me get to the bottom of the issue!


My GUI montior is reporting SQLite 3.8.11.1 and my app is using 3.17.0. I 
get the same result with both.

I've included more complete details of my schema at the bottom of this 
message but only introduced the bits I'm talking about as I go through.


I'm aiming to write a query that's good for feeding an iterator and for 
pagination. i.e. I don't want a query that needs to compute the whole 
result up-front, before returning the first row. I want something where I 
can execute the query, consume a few rows for a while, close it and then 
later start a new query with a parameter based on how far I got through and 
then continue where I left off.

The structure of my dataset is such that once rows are written to the 
database they are not changed, so there exist a set of parameters where I 
can do this.


The data is an append only log. The log consists of entries (in the entrys 
table) and they have monotonically increasing "entry-number"s. The table 
can store more than one log so there's a "log-id" as well. These entries 
have some data of their own and each point to zero or more items which are 
stored in the "items" table. Thus there's a join table "entry-items" which 
consists of 3 columns, all parts of the primary key, "log-d", 
"entry-number" and "item-id".

Each entry has a "key". As entries are added to the log, the "latest" entry 
for a particular key describes the current state of that key. The set of 
keys and their current state is called "records" and they only exist as a 
query on the tables. It is this query that I'm trying to plan well.

The idea is for the app user to be able to ask for a cursor on the records 
and then consume them a page at a time. I'm not particularly bothered which 
order they keys come out in but it has to be deterministic so that I can 
restart the query again. It's nice if they come out in "key" order but 
"entry-number" order will do too. The main property I want is to be able to 
pass this app-defined cursor over my API boundary without holding open a 
read transaction on SQLite.

I've started with a log that isn't huge but needs some attention to make it 
work efficiently. This log has 54,272 entries and 42,737 records at the 
latest version. I expect to have logs with a few tens of million active 
records but probably nothing larger than 100 million. Not exactly Big Data 
;-) but worthy of indexing.


So, here's my query, annotated with a few numbered points so that I can 
talk about things. The bits that are commented out get commented in (and 
out again!) as I discuss my problems with the query plan.

-
explain query plan
SELECT
"entrys"."log-id"   AS "log-id",
"entrys"."entry-number" AS "entry-number",
"entrys&quo

[sqlite] Query Planning Knowledge

2019-01-22 Thread Andy Bennett

Hi,

I'm having some problems understanding what the query planner is doing and 
how to convince it to work in the way I want.


Sorry that this is such a long and involved eMail. I've tried to describe 
my problem and show the steps I have taken in debugging it. I have made 
some effort to present a minimal and basic illustration of the problem. 
Either I'm in error and glad to be assisted, or there's some opportunities 
that the query planner is not taking. Hopefully my detail helps someone who 
knows more than me get to the bottom of the issue!



My GUI monitor is reporting SQLite 3.8.11.1 and my app is using 3.17.0. I 
get the same result with both.


I've included more complete details of my schema at the bottom of this 
message but only introduced the bits I'm talking about as I go through.



I'm aiming to write a query that's good for feeding an iterator and for 
pagination. i.e. I don't want a query that needs to compute the whole 
result up-front, before returning the first row. I want something where I 
can execute the query, consume a few rows for a while, close it and then 
later start a new query with a parameter based on how far I got through and 
then continue where I left off.


The structure of my dataset is such that once rows are written to the 
database they are not changed, so there exist a set of parameters where I 
can do this.



The data is an append only log. The log consists of entries (in the entrys 
table) and they have monotonically increasing "entry-number"s. The table 
can store more than one log so there's a "log-id" as well. These entries 
have some data of their own and each point to zero or more items which are 
stored in the "items" table. Thus there's a join table "entry-items" which 
consists of 3 columns, all parts of the primary key, "log-d", 
"entry-number" and "item-id".


Each entry has a "key". As entries are added to the log, the "latest" entry 
for a particular key describes the current state of that key. The set of 
keys and their current state is called "records" and they only exist as a 
query on the tables. It is this query that I'm trying to plan well.


The idea is for the app user to be able to ask for a cursor on the records 
and then consume them a page at a time. I'm not particularly bothered which 
order they keys come out in but it has to be deterministic so that I can 
restart the query again. It's nice if they come out in "key" order but 
"entry-number" order will do too. The main property I want is to be able to 
pass this app-defined cursor over my API boundary without holding open a 
read transaction on SQLite.


I've started with a log that isn't huge but needs some attention to make it 
work efficiently. This log has 54,272 entries and 42,737 records at the 
latest version. I expect to have logs with a few tens of million active 
records but probably nothing larger than 100 million. Not exactly Big Data 
;-) but worthy of indexing.



So, here's my query, annotated with a few numbered points so that I can 
talk about things. The bits that are commented out get commented in (and 
out again!) as I discuss my problems with the query plan.


-
explain query plan
SELECT
"entrys"."log-id"   AS "log-id",
"entrys"."entry-number" AS "entry-number",
"entrys"."region"   AS "region",
"entrys"."key"  AS "key",
"entrys"."timestamp"AS "timestamp",
"entry-items"."item-id" AS "item-id",
"items"."blob"  AS "blob"

FROM
(SELECT -- ** 1 **
  MAX("entry-number") AS "entry-number",
  "key"
  FROM "entrys"
  WHERE
  "log-id" = 51 AND
  "region" = "user" AND
  "entry-number" <= 54272
-- AND key > "19" -- ** 2 **
-- AND key > "145943"
  GROUP BY "key"
-- order by key -- ** 3 **
) AS "specific-entrys"

LEFT OUTER JOIN "entrys"
ON
"specific-entrys"."entry-number" = "entrys"."entry-number"
AND "specific-entrys"."key" = "entrys"."key" -- ** 4 **
AND "user" = "entrys"."region"

LEFT OUTER JOIN "entry-items"
ON
51 = "entry-items"."log-id" AND -- ** 5 **
"specific-entrys"."entry-number" = "entry-items"."entry-number"

LEFT OUTER JOIN "items"
ON
"items"."item-id" = "entry-items"."item-id"

WHERE
"entrys"."log-id" = 51
  -- order by key -- ** 6 **
  ;
-

Here's the query plan for the query above:

-
1|0|0|SEARCH TABLE entrys USING COVERING INDEX 
entrys-log-id-region-key-entry-number-desc (log-id=? AND region=?)

0|0|0|SCAN SUBQUERY 1 AS specific-entrys
0|1|1|SEARCH TABLE entrys USING INDEX 
entrys-log-id-region-key-entry-number-desc (log-id=? AND region=? AND key=? 
AND entry-number=?)
0|2|2|SEARCH TABLE entry-items 

[sqlite] Query regression with virtual tables

2018-11-12 Thread Hick Gunter
On upgrading from 3.7.14 to 3.24 (at last) I found a very detrimental change in 
the query implementation for a certain type of query:

SELECT <...> FROM  WHERE a IN () AND b IN () AND 
c =  AND timestamp between  AND  ORDER BY timestamp 
DESC LIMIT ,;

In 3.7.14 the xBestIndex function was called with 3 constraints 
{(c,'='),(timestamp, '<='),(timestamp,'>=)}. The function sets argvIndex values 
and OMIT flags for all three constraints, an index number and the 
orderByConsumed flag.

The created bytecode is

- create & populate ephemeral table a
- create & populate ephemeral table b
- loop over virtual table
-- check field a value IN ephemeral table a
-- check field b value IN ehpemeral table b
-next

In 3.24 the xBestIndex fuction is called with two additional constraints 
{(a,'='),(b.'=')}. The function sets argvIndex values and OMIT flags for all 
five constraints, an index number and the orderByConsumed flag.

The created bytecode is

- create & populate ephemeral table a
- create & populate ephemeral table b
- loop over ephemeral table a
-- loop over ephemeral table b
--- loop over virtual table
- sort result set
- return LIMIT rows


The virtual table is actually a partitioned table that will search only the 
partitions that cover the selected range of timestamps and implements merge 
logic. It will therefore accept and pass on addtional constraints to the 
subquery against the partition members. Unfortunately, the NGQP seems to be 
asking about a join with ephemeral tables, which precludes using CROSS JOIN to 
force a performant query plan that returns correctly ordered result rows.

Of course I could rewrite this as a sequence of statements approximately like:

BEGIN;
CREATE TEMP TABLE a_values AS VALUES()...
CREATE TEMP TABLE b_values AS VALUES()...
SELECT <...> FROM  CROSS JOIN a_values ON (a) CROSS JOIN 
b_values ON (b) WHERE c =  AND timestamp between  AND  
ORDER BY timestamp DESC LIMIT ,;
DROP TABLE a_values;
DROP TABLE b_values;
COMMIT;

But this results in two nested loops (even if an index is added on each temp 
table). How can I get the ephemeral table lookups back?


___
 Gunter Hick | Software Engineer | Scientific Games International GmbH | 
Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 
1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query performance slower in 3.25

2018-09-30 Thread Stephen F. Booth
On Sat, Sep 29, 2018 at 5:33 PM Richard Hipp  wrote:

> On 9/29/18, Stephen F. Booth  wrote:
> > A query that ran fine under SQLite 3.24 is substantially slower in 3.25:
>
> Thanks for the data sent off-list
>
> Your work-around is to add a plus sign "+" before the "a.id" in the
> GROUP BY clause.  (And, BTW, shouldn't that really be an ORDER BY
> clause instead of a GROUP BY?)
>
> select a.id from a join c on a.id = case when c.b_a_name is not null then
> c.b_a_id else c.a_id end where a.id in (select a_fts.rowid from a_fts
> where
> a_fts match 'r*' order by rank) group by +a.id;
>
> The change is the single "+" near the end.  This should be work for
> you on all releases.
>

This does fix the issue. Thanks!

I'm sure there is a smarter way to write this query- you're probably right
about ORDER BY instead of GROUP BY.

Stephen


> This is an interesting query planner problem.  Recall that a query
> planner is really a kind of AI that has to infer or guess the best
> query algorithm based on incomplete information.  In this particular
> case, the AI is making a bad choice.  It will take some time for us to
> figure out why and perhaps come up with an improvement.
>
> Even in prior releases (such as 3.24.0) the AI was very very close to
> making a bad choice.  A single minor tweak in one of the weights
> pushed the decision threshold over a limit and caused that bad choice
> to be taken.  So the problem has been lurking just under the surface
> for a long time, apparently.  The minor tweak in 3.25.0
> (https://sqlite.org/src/info/85b9be) merely brought the problem to the
> surface.
>
> --
> D. Richard Hipp
> d...@sqlite.org
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query performance slower in 3.25

2018-09-29 Thread Richard Hipp
On 9/29/18, Stephen F. Booth  wrote:
> A query that ran fine under SQLite 3.24 is substantially slower in 3.25:

Thanks for the data sent off-list

Your work-around is to add a plus sign "+" before the "a.id" in the
GROUP BY clause.  (And, BTW, shouldn't that really be an ORDER BY
clause instead of a GROUP BY?)

select a.id from a join c on a.id = case when c.b_a_name is not null then
c.b_a_id else c.a_id end where a.id in (select a_fts.rowid from a_fts where
a_fts match 'r*' order by rank) group by +a.id;

The change is the single "+" near the end.  This should work for
you on all releases.
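
If you do switch to ORDER BY as suggested, note that GROUP BY a.id here also
de-duplicates the ids, so a literal rewrite needs DISTINCT to return the same
rows. A rough, untested sketch against the same schema:

select distinct a.id from a join c on a.id = case when c.b_a_name is not null
then c.b_a_id else c.a_id end where a.id in (select a_fts.rowid from a_fts
where a_fts match 'r*' order by rank) order by a.id;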

This is an interesting query planner problem.  Recall that a query
planner is really a kind of AI that has to infer or guess the best
query algorithm based on incomplete information.  In this particular
case, the AI is making a bad choice.  It will take some time for us to
figure out why and perhaps come up with an improvement.

Even in prior releases (such as 3.24.0) the AI was very very close to
making a bad choice.  A single minor tweak in one of the weights
pushed the decision threshold over a limit and caused that bad choice
to be taken.  So the problem has been lurking just under the surface
for a long time, apparently.  The minor tweak in 3.25.0
(https://sqlite.org/src/info/85b9be) merely brought the problem to the
surface.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query performance slower in 3.25

2018-09-29 Thread Richard Hipp
Could you please provide us with the database schema, or perhaps even
a short script that demonstrates your problem, so that we can try to
debug it?

On 9/29/18, Stephen F. Booth  wrote:
> A query that ran fine under SQLite 3.24 is substantially slower in 3.25:
>
> SQLite version 3.24.0 2018-06-04 19:24:41
>> .timer on
>> select a.id from a join c on a.id = case when c.b_a_name is not null then
> c.b_a_id else c.a_id end where a.id in (select a_fts.rowid from a_fts where
> a_fts match 'r*' order by rank) group by a.id;
> Run Time: real 0.037 user 0.019868 sys 0.016376
>
> SQLite version 3.25.0 2018-09-15 04:01:47
>> .timer on
>> select a.id from a join c on a.id = case when c.b_a_name is not null then
> c.b_a_id else c.a_id end where a.id in (select a_fts.rowid from a_fts where
> a_fts match 'r*' order by rank) group by a.id;
> Run Time: real 4.525 user 2.055779 sys 2.466143
>
> Performance in 3.25.1 and 3.25.2 is similar to 3.25.0.
>
> The bottleneck seems to be in the CASE portion of the query. I am not sure
> why.
>
> To try and make a valid comparison I compiled the versions identically
> using the following flags:
>
> % gcc shell.c sqlite3.c -DSQLITE_THREADSAFE=0 -DSQLITE_DEFAULT_MEMSTATUS=0
> -DSQLITE_DEFAULT_WAL_SYNCHRONOUS=1 -DSQLITE_LIKE_DOESNT_MATCH_BLOBS
> -DSQLITE_MAX_EXPR_DEPTH=0 -DSQLITE_OMIT_DECLTYPE -DSQLITE_OMIT_DEPRECATED
> -DSQLITE_OMIT_PROGRESS_CALLBACK -DSQLITE_OMIT_SHARED_CACHE
> -DSQLITE_USE_ALLOCA=1 -DSQLITE_ENABLE_FTS5 -DSQLITE_ENABLE_RTREE
> -DSQLITE_ENABLE_STAT4 -DHAVE_READLINE -DHAVE_USLEEP -lreadline -Os
>
> Here is the compiler info:
> Apple LLVM version 10.0.0 (clang-1000.11.45.2)
> Target: x86_64-apple-darwin18.2.0
> Thread model: posix
>
> What could be causing this performance hit in 3.25?
>
> Thanks,
> Stephen
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query performance slower in 3.25

2018-09-29 Thread Stephen F. Booth
A query that ran fine under SQLite 3.24 is substantially slower in 3.25:

SQLite version 3.24.0 2018-06-04 19:24:41
> .timer on
> select a.id from a join c on a.id = case when c.b_a_name is not null then
c.b_a_id else c.a_id end where a.id in (select a_fts.rowid from a_fts where
a_fts match 'r*' order by rank) group by a.id;
Run Time: real 0.037 user 0.019868 sys 0.016376

SQLite version 3.25.0 2018-09-15 04:01:47
> .timer on
> select a.id from a join c on a.id = case when c.b_a_name is not null then
c.b_a_id else c.a_id end where a.id in (select a_fts.rowid from a_fts where
a_fts match 'r*' order by rank) group by a.id;
Run Time: real 4.525 user 2.055779 sys 2.466143

Performance in 3.25.1 and 3.25.2 is similar to 3.25.0.

The bottleneck seems to be in the CASE portion of the query. I am not sure
why.

To try and make a valid comparison I compiled the versions identically
using the following flags:

% gcc shell.c sqlite3.c -DSQLITE_THREADSAFE=0 -DSQLITE_DEFAULT_MEMSTATUS=0
-DSQLITE_DEFAULT_WAL_SYNCHRONOUS=1 -DSQLITE_LIKE_DOESNT_MATCH_BLOBS
-DSQLITE_MAX_EXPR_DEPTH=0 -DSQLITE_OMIT_DECLTYPE -DSQLITE_OMIT_DEPRECATED
-DSQLITE_OMIT_PROGRESS_CALLBACK -DSQLITE_OMIT_SHARED_CACHE
-DSQLITE_USE_ALLOCA=1 -DSQLITE_ENABLE_FTS5 -DSQLITE_ENABLE_RTREE
-DSQLITE_ENABLE_STAT4 -DHAVE_READLINE -DHAVE_USLEEP -lreadline -Os

Here is the compiler info:
Apple LLVM version 10.0.0 (clang-1000.11.45.2)
Target: x86_64-apple-darwin18.2.0
Thread model: posix

What could be causing this performance hit in 3.25?

Thanks,
Stephen
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query planner improvements in case of AUTOMATIC INDEX

2018-09-19 Thread Rob Golsteijn
Hi List,

When investigating performance of one of our queries I found an interesting 
situation that might be an opportunity for performance improvement.
Tested with Sqlite version 3.15.2 (November 2016).

Consider the following table and query

CREATE TABLE Node
(
    Id  INTEGER PRIMARY KEY AUTOINCREMENT,
    x   NUMBER(10),
    y   NUMBER(10), 
    HeightLevel NUMBER(2),
    GeomWGS84   BLOB,
    Perm_id TEXT
    /* some irrelevant fields removed */
);

/* find duplicates */
SELECT NOD1.PERM_ID,
   NOD1.X,
   NOD1.Y,
   NOD1.HeightLevel,
   NOD1.GeomWGS84
  FROM    Node NOD1
   INNER JOIN Node NOD2 ON NOD1.X   = NOD2.X
   AND NOD1.Y   = NOD2.Y 
   AND NOD1.HeightLevel = NOD2.HeightLevel
   AND NOD1.GeomWGS84   = NOD2.GeomWGS84
   AND NOD1.ID <> NOD2.ID
  ORDER BY NOD1.GeomWGS84;

The query plan of this query is
selectid|order|from|detail
0|0|0|SCAN TABLE Node AS NOD1
0|1|1|SEARCH TABLE Node AS NOD2 USING AUTOMATIC COVERING INDEX (GeomWGS84=? AND 
HeightLevel=? AND y=? AND x=?)
0|0|0|USE TEMP B-TREE FOR ORDER BY

It takes 636 seconds wall clock time to execute the query.

I would expect that this is the execution time of creating the index + the 
execution time of the query when the index is already present.
So if I create the "AUTOMATIC" index explicitly the expected total execution 
time would also be 636 seconds, but

CREATE INDEX idx_node on node (GeomWGS84, HeightLevel, y, x);
Takes 40 seconds and subsequent query execution takes 8 seconds. So 48 seconds 
in total compared to 636 seconds with the AUTOMATIC query.

The explanation can be found when looking at the query plan of the query (in 
the new schema with index present):

selectid|order|from|detail
0|0|0|SCAN TABLE DH_NOD AS NOD1 USING INDEX idx_node
0|1|1|SEARCH TABLE DH_NOD AS NOD2 USING COVERING INDEX idx_node (GeomWGS84=? 
AND HeightLevel=? AND y=? AND x=?)

So the explicit index is now also used for ORDER BY optimization.

I guess in general it could be used for other optimizations as well.


The optimization possibility is to re-evaluate the query plan, also taking the 
AUTOMATIC indexes into account, once SQLite has decided that AUTOMATIC indexes 
are useful.
To avoid extra planning time, maybe this should only be done when AUTOMATICALLY 
INDEXED table(s) are used multiple times in the query (otherwise they will not 
change the query plan anyway)?
Since query planning is typically fast compared to query execution, the extra 
iteration of the query planner may be acceptable for the cases the query plan 
cannot be improved. For our company it would be acceptable but in general I 
cannot judge.

Regards,
Rob Golsteijn
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query on TEMP view.

2018-08-28 Thread Keith Medcalf

There are a myriad of reasons for the behaviour you are seeing and they affect 
only performance and not correctness.  In other words, you think that your UDF 
is more "expensive" to compute than the PPID == 2 test, and therefore the least 
expensive test should be performed first so that the more expensive operation 
does not need to be performed where its result would merely be discarded by 
virtue of the lesser expensive ANDed condition.

The other thing is that the subquery is likely being flattened -- again this 
depends on the version of SQLite3 you are using.  However, assuming that the 
query is being flattened then:

SELECT COUNT(TEMP.ID) FROM (SELECT ID,NAME,PPID FROM AUDIO WHERE PPID=2) AS 
TEMP WHERE smart_search(NAME,ID)
which is exactly the same as (with the aliasing removed as it adds nothing of 
significance) 
SELECT COUNT(ID) FROM (SELECT ID, NAME, PPID FROM AUDIO WHERE PPID=2) WHERE 
smart_search(name,id)
which should be flattened to
SELECT COUNT(ID) FROM AUDIO WHERE PPID=2 and smart_search(name,id)

*See the query flattening rules at 
https://www.sqlite.org/optoverview.html#subquery_flattening
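
A quick way to check whether the flattening actually happened (a sketch,
assuming smart_search() has been registered in the session running it):

EXPLAIN QUERY PLAN
SELECT COUNT(ID) FROM (SELECT ID, NAME, PPID FROM AUDIO WHERE PPID=2)
WHERE smart_search(NAME, ID);

If the subquery was flattened, the plan shows a single SCAN or SEARCH line on
AUDIO and no separate subquery or co-routine entry.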

That said, however, I am unable to reproduce with the current tip of trunk.  

What version of sqlite3 are you using?


---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.


>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of Hegde, Deepakakumar
>(D.)
>Sent: Monday, 27 August, 2018 23:47
>To: sqlite-users@mailinglists.sqlite.org
>Subject: [sqlite] Query on TEMP view.
>
>Hi All,
>
>
>I am facing a problem wherein a defined function registered with sqlite
>is called multiple times without considering the filter.
>
>
>Ex:
>
>Table and entry:
>
>ID  NAME  PPID
>
>1a.mp3   2
>
>2b.mp3   3
>
>
>Query:
>
>SELECT COUNT(TEMP.ID) FROM (SELECT ID,NAME,PPID FROM AUDIO WHERE
>PPID=2) AS TEMP WHERE smart_search(NAME,ID)
>
>
>OutPut count = 1 (This is fine)
>
>
>Here smart_search() is defined and given to sqlite DB.
>
>We are expecting smart_search() to be called from sqlite only for the
>entry with PPID 2 and that will be for ID 1 a.mp3.
>
>But we see that this is called 2 times and for both the entry.
>
>
>As per my understanding, since the filter on PPID is added in the inner query,
>smart_search() should have been called for only one entry. Is this the
>expected behavior?
>
>
>If i change the query as below then the smart_search() is called for
>only one entry.
>
>SELECT COUNT(TEMP.ID) FROM (SELECT ID,NAME,PPID FROM AUDIO WHERE
>PPID=2) AS TEMP WHERE PPID=2 AND smart_search(NAME,ID)
>
>
>Please help to understand on this.
>
>
>Thanks and Regards
>
>Deepak
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query on TEMP view.

2018-08-27 Thread Hegde, Deepakakumar (D.)
Hi All,


I am facing a problem wherein a defined function registered with sqlite is called 
multiple times without considering the filter.


Ex:

Table and entry:

ID  NAME  PPID

1a.mp3   2

2b.mp3   3


Query:

SELECT COUNT(TEMP.ID) FROM (SELECT ID,NAME,PPID FROM AUDIO WHERE PPID=2) AS 
TEMP WHERE smart_search(NAME,ID)


OutPut count = 1 (This is fine)


Here smart_search() is defined and given to sqlite DB.

We are expecting smart_search() to be called from sqlite only for the entry 
with PPID 2 and that will be for ID 1 a.mp3.

But we see that this is called 2 times and for both the entry.


As per my understanding, since the filter on PPID is added in the inner query, 
smart_search() should have been called for only one entry. Is this the expected 
behavior?


If i change the query as below then the smart_search() is called for only one 
entry.

SELECT COUNT(TEMP.ID) FROM (SELECT ID,NAME,PPID FROM AUDIO WHERE PPID=2) AS 
TEMP WHERE PPID=2 AND smart_search(NAME,ID)


Please help to understand on this.


Thanks and Regards

Deepak
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query optimisation

2018-08-24 Thread David Wellman
Hi Richard and David,

Many thanks for your responses, very useful.

I'll work with that (being careful to look at the OpCode documentation for my 
release of sqlite!) and see where I get to.

Cheers,
Dave


Ward Analytics Ltd - information in motion
Tel: +44 (0) 118 9740191
Fax: +44 (0) 118 9740192
www: http://www.ward-analytics.com

Registered office address: The Oriel, Sydenham Road, Guildford, Surrey, United 
Kingdom, GU1 3SR
Registered company number: 3917021 Registered in England and Wales.

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Richard Hipp
Sent: 24 August 2018 15:43
To: SQLite mailing list
Subject: Re: [sqlite] Query optimisation

On 8/24/18, David Raymond  wrote:
> Running just "explain some query" will give you the virtual machine program
> that it plans on using. You can then scan through that to see what it's
> doing. Note that the descriptions on the below page for those op codes are
> sometimes really confusing and it can take a while to decypher what's going
> on.
>
> https://www.sqlite.org/opcode.html

To further confuse matters, the https://www.sqlite.org/opcode.html
page only describes the opcodes for the latest release (3.24.0)
whereas the OP is using an earlier release (3.20, I think).  Opcodes
and their meanings can change from one release to the next.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query optimisation

2018-08-24 Thread Richard Hipp
On 8/24/18, David Raymond  wrote:
> Running just "explain some query" will give you the virtual machine program
> that it plans on using. You can then scan through that to see what it's
> doing. Note that the descriptions on the below page for those op codes are
> sometimes really confusing and it can take a while to decypher what's going
> on.
>
> https://www.sqlite.org/opcode.html

To further confuse matters, the https://www.sqlite.org/opcode.html
page only describes the opcodes for the latest release (3.24.0)
whereas the OP is using an earlier release (3.20, I think).  Opcodes
and their meanings can change from one release to the next.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query optimisation

2018-08-24 Thread David Raymond
Running just "explain some query" will give you the virtual machine program 
that it plans on using. You can then scan through that to see what it's doing. 
Note that the descriptions on the below page for those op codes are sometimes 
really confusing and it can take a while to decypher what's going on.

https://www.sqlite.org/opcode.html


"(Assumption) Any row in the relation table that does NOT meet the WHERE clause 
is ignored."

Down below, lines 5 and 6 (and 24) are in the outer loop and basically say "if 
waStatsIDCount is not null then go to the next record", so correct.


"For each qualifying row in the relation table read from waSTATSINFO_VT using 
the PK index to try and find a match"

Line 14 shows that it's using waStatsInfo_VT (the main rowid/integer primary 
key table) for that part of the program. and Line 17 has it trying to find a 
matching rowid


"Where there is a match, update the relation table."

Even if there isn't a match. In that case it would get updated with null. All 
of the "whether or not to actually do the update" bits have already been looked 
at.


SQLite version 3.20.1 2017-08-24 16:21:36
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.

sqlite> create table relation (relationKey integer primary key, waStatsIDCount, 
queryID);

sqlite> create table waStatsInfo_VT (relationKey integer primary key, 
waStatsIDCount);

sqlite> explain query plan update relation set waStatsIDCount = (select 
src.waStatsIDCount from waStatsInfo_VT as src where src.relationKey = 
relation.relationKey) where waStatsIDCount is null and queryID = 2;
selectid|order|from|detail
0|0|0|SCAN TABLE relation
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 0
0|0|0|SEARCH TABLE waStatsInfo_VT AS src USING INTEGER PRIMARY KEY (rowid=?)

sqlite> explain update relation set waStatsIDCount = (select src.waStatsIDCount 
from waStatsInfo_VT as src where src.relationKey = relation.relationKey) where 
waStatsIDCount is null and queryID = 2;
addr  opcode p1p2p3p4 p5  comment
  -        -  --  -
0 Init   0 26000  Start at 26
1 Null   0 1 200  r[1..2]=NULL
2 OpenWrite  0 2 0 3  00  root=2 iDb=0; relation
3 Explain0 0 0 SCAN TABLE relation  00
4 Rewind 0 25000
5   Column 0 1 600  
r[6]=relation.waStatsIDCount
6   NotNull6 24000  if r[6]!=NULL goto 
24
7   Column 0 2 700  
r[7]=relation.queryID
8   Ne 8 247 (BINARY)   51  if r[7]!=r[8] goto 
24
9   Rowid  0 2 000  r[2]=rowid
10  IsNull 2 25000  if r[2]==NULL goto 
25
11  Null   0 3 000  r[3]=NULL
12  Null   0 9 900  r[9..9]=NULL; Init 
subquery result
13  Integer1 10000  r[10]=1; LIMIT 
counter
14  OpenRead   1 3 0 2  00  root=3 iDb=0; 
waStatsInfo_VT
15  Explain0 0 0 SEARCH TABLE waStatsInfo_VT AS src 
USING INTEGER PRIMARY KEY (rowid=?)  00
16  Rowid  0 11000  r[11]=rowid
17  SeekRowid  1 2011   00  intkey=r[11]; pk
18  Column 1 1 900  
r[9]=waStatsInfo_VT.waStatsIDCount
19  DecrJumpZero   1020000  if (--r[10])==0 
goto 20
20  SCopy  9 4 000  r[4]=r[9]
21  Column 0 2 500  
r[5]=relation.queryID
22  MakeRecord 3 3 7 D  00  r[7]=mkrec(r[3..5])
23  Insert 0 7 2 relation   07  intkey=r[2] 
data=r[7]
24Next   0 5 001
25Halt   0 0 000
26Transaction0 1 2 0  01  usesStmtJournal=0
27Integer2 8 000  r[8]=2
28Goto   0 1 000

sqlite>



-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of David Wellman
Sent: Friday, August 24, 2018 7:47 AM
To: SQLite Users
Subject: [sqlite] Query optimisation

HI all,

 

I would like to use the following example as a learning exercise for myself
to check my understanding of part of sqlite processing.

 

I have the following query which functionally works fine, and to be upfront
about it t

[sqlite] Query optimisation

2018-08-24 Thread David Wellman
HI all,

 

I would like to use the following example as a learning exercise for myself
to check my understanding of part of sqlite processing.

 

I have the following query which functionally works fine, and to be upfront
about it the volume of data is so small that performance is not an issue.

 

The query:

update relation

   set wastatsidcount = (select src.wastatsidcount

  from waSTATSINFO_VT as src

  where src.relationkey = relation.relationkey)

where wastatsidcount is null

  and queryid = 2;

 

The plan:

If I run 'explain query plan' on this command it gives the following:

SelectedID  Order  From  Detail
0           0      0     SCAN TABLE relation
0           0      0     EXECUTE CORRELATED SCALAR SUBQUERY 0
0           0      0     SEARCH TABLE waSTATSINFO_VT AS src USING INTEGER PRIMARY KEY (rowid=?)

 

My 'src' table is defined with a PRIMARY KEY on column RELATIONKEY. 

The same column is also the primary key on the 'relation' table.

 

My understanding of this plan is:

-  Read the relation table using a full table scan

-  (Assumption) Any row in the relation table that does NOT meet the
WHERE clause is ignored.

-  For each qualifying row in the relation table read from
waSTATSINFO_VT using the PK index to try and find a match

-  Where there is a match, update the relation table.

 

Questions:

-  Is my assumption above ("Any row in the relation table that does
NOT meet the WHERE clause is ignored") correct?

-  Is there any form of 'explain' or other diagnostic output which
would show me this?
I tried the 'explain query plan' with and without the full WHERE clause and
it didn't change the output (I didn't understand the output from the plain
'explain' command!)

 

In this particular example I need the WHERE clause as coded in order to give
me the correct answer, but as I said at the start I'm trying to deepen my
knowledge of SQLite performance and it's optimiser.

 

I'm currently using v3.20.1.

 

Cheers,

Dave

 

 

Ward Analytics Ltd - information in motion

Tel: +44 (0) 118 9740191

Fax: +44 (0) 118 9740192

www: http://www.ward-analytics.com

 

Registered office address: The Oriel, Sydenham Road, Guildford, Surrey,
United Kingdom, GU1 3SR

Registered company number: 3917021 Registered in England and Wales.

 

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite query planner doesn't seem to know when a partial index is COVERING

2018-06-15 Thread David Raymond
I've seen similar. I think the issue is that since a isn't an indexed column, 
the index isn't technically covering, despite a being constrained in the where clause.


sqlite> create index Foo_partial_with_a on Foo (a, b, c) where a = 1;

sqlite> analyze;

sqlite> explain query plan select b from Foo where a = 1 and b = 2 order by c;
selectid|order|from|detail
0|0|0|SEARCH TABLE Foo USING COVERING INDEX Foo_partial_with_a (a=? AND b=?)


-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Deon Brewis
Sent: Friday, June 15, 2018 1:09 PM
To: SQLite mailing list
Subject: [sqlite] SQLite query planner doesn't seem to know when a partial 
index is COVERING

Looks like a missed optimization opportunity here. Well, more than that - there 
doesn't appear to be a way to get SQLITE to automatically use a partial index 
if a similar non-partial index exists.

E.g.
create table Foo(a,b,c,d,e);
create index Foo_inx on Foo(a,b,c);
create index Foo_partial_inx on Foo(b,c) where a = 1;

insert into Foo(a,b,c,d,e) values(1,2,3,4,5);
insert into Foo(a,b,c,d,e) values(1,2,2,4,6);
insert into Foo(a,b,c,d,e) values(1,2,5,1,1);
insert into Foo(a,b,c,d,e) values(2,1,6,1,1);
insert into Foo(a,b,c,d,e) values(6,4,6,1,1);

analyze Foo;
explain query plan select b from Foo where a=1 and b=2 order by c;
> SEARCH TABLE Foo USING COVERING INDEX Foo_inx (a=? AND b=?)


In this case SQLite picks Foo_inx over Foo_partial_inx. I suspect it's because 
it doesn't know Foo_partial_inx is effectively a covering index because if I 
force the index by hand, it doesn't list it as a COVERING index:

explain query plan select b from Foo indexed by Foo_partial_inx where a=1 and 
b=2 order by c;
> SEARCH TABLE Foo USING INDEX Foo_partial_inx (b=?)

And I suspect that's why it picks Foo_inx over Foo_partial_inx.  But otherwise 
this behavior seems to be exactly what I want though (will need to step through 
an 'explain' to make sure it doesn't do main table lookups), but it requires an 
INDEXED BY to get there.


As a workaround, if I repeat the WHERE clause field ('a') in the partial index 
field list, THEN it starts using the partial index automatically:

create index Foo_partial_inx2 on Foo(a,b,c) where a = 1;
analyze Foo;
> SEARCH TABLE Foo USING COVERING INDEX Foo_partial_inx2 (a=? AND b=?)

But that has 2 problems:
a) this makes the index bigger due to the extra (very unnecessary) column
b) the executor doesn't seem to take into account that this is a partial 
index, so it searches for 'a' (minor issue)

Of course the partial index still has less rows than without partial so it's 
still a win, but still - it shouldn't need 'a' to be repeated on every row. 
Either way, though there are issues with the workaround, the bigger issue is 
that it doesn't automatically pick the original Foo_partial_inx in the first 
place.

- Deon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] SQLite query planner doesn't seem to know when a partial index is COVERING

2018-06-15 Thread Deon Brewis
Looks like a missed optimization opportunity here. Well, more than that - there 
doesn't appear to be a way to get SQLITE to automatically use a partial index 
if a similar non-partial index exists.

E.g.
create table Foo(a,b,c,d,e);
create index Foo_inx on Foo(a,b,c);
create index Foo_partial_inx on Foo(b,c) where a = 1;

insert into Foo(a,b,c,d,e) values(1,2,3,4,5);
insert into Foo(a,b,c,d,e) values(1,2,2,4,6);
insert into Foo(a,b,c,d,e) values(1,2,5,1,1);
insert into Foo(a,b,c,d,e) values(2,1,6,1,1);
insert into Foo(a,b,c,d,e) values(6,4,6,1,1);

analyze Foo;
explain query plan select b from Foo where a=1 and b=2 order by c;
> SEARCH TABLE Foo USING COVERING INDEX Foo_inx (a=? AND b=?)


In this case SQLite picks Foo_inx over Foo_partial_inx. I suspect it's because 
it doesn't know Foo_partial_inx is effectively a covering index because if I 
force the index by hand, it doesn't list it as a COVERING index:

explain query plan select b from Foo indexed by Foo_partial_inx where a=1 and 
b=2 order by c;
> SEARCH TABLE Foo USING INDEX Foo_partial_inx (b=?)

And I suspect that's why it picks Foo_inx over Foo_partial_inx.  But otherwise 
this behavior seems to be exactly what I want though (will need to step through 
an 'explain' to make sure it doesn't do main table lookups), but it requires an 
INDEXED BY to get there.


As a workaround, if I repeat the WHERE clause field ('a') in the partial index 
field list, THEN it starts using the partial index automatically:

create index Foo_partial_inx2 on Foo(a,b,c) where a = 1;
analyze Foo;
> SEARCH TABLE Foo USING COVERING INDEX Foo_partial_inx2 (a=? AND b=?)

But that has 2 problems:
a) this makes the index bigger due to the extra (very unnecessary) column
b) the executor doesn't seem to take into account that this is a partial 
index, so it searches for 'a' (minor issue)

Of course the partial index still has less rows than without partial so it's 
still a win, but still - it shouldn't need 'a' to be repeated on every row. 
Either way, though there are issues with the workaround, the bigger issue is 
that it doesn't automatically pick the original Foo_partial_inx in the first 
place.

- Deon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query on SQLite - VxWorks

2018-06-11 Thread Simon Slavin
On 9 Jun 2018, at 6:36pm, Guna Sekar  wrote:

> Does SQLite support an outfile query that dumps all data into a specified file 
> format? 

I don't understand your question, but you might want to use the command-line 
shell program to dump a database to a text file -- either as SQL commands or a 
CSV file.  Please see this page:



and download the command-line shell program from this page:



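For illustration, a rough sketch of both kinds of dump from the sqlite3 shell
(the database, table and output file names here are just placeholders):

sqlite3 mydb.sqlite ".dump" > backup.sql

sqlite3 mydb.sqlite
sqlite> .headers on
sqlite> .mode csv
sqlite> .once mytable.csv
sqlite> SELECT * FROM mytable;
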
Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query on SQLite - VxWorks

2018-06-11 Thread Guna Sekar
Hi Team,

 Does SQLite support an outfile query that dumps all data into a specified file 
format? 

I have tried to extract data from a table using an outfile query, but the query 
returned a failed status. If it is supported, can you please share the syntax or an 
example? That would really help me a lot.

Awaiting your valuable response.


Regards,
Gunasekar K

Sent from my iPhone
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] query planning - covering indices

2018-05-16 Thread Richard Hipp
On 5/15/18, Peter Johnson  wrote:
>
> My understanding is that covering indices are more efficient, as the table
> itself does not need to be scanned when all the required columns exist in
> the covering index?

That is often the case, but there are exceptions.

>
> Is it correct to say that example 1 is more efficient than the other two
> examples, particularly when there are many columns, all of which are
> covered by the covering index?

No.

>
> *example 1 - using covering index*
>
> CREATE TABLE example (a INTEGER, b INTEGER, c INTEGER);
> CREATE INDEX idx_covering ON example( a, b, c );
> EXPLAIN QUERY PLAN SELECT a, b, c FROM example WHERE a = 1 AND b = 1;
>
> 0|0|0|SEARCH TABLE example USING COVERING INDEX idx_covering (a=? AND b=?)
>
> *example2 - using primary key & covering index*
>
> CREATE TABLE example (a INTEGER PRIMARY KEY, b INTEGER, c INTEGER);
> CREATE INDEX idx_covering ON example( a, b, c );
> EXPLAIN QUERY PLAN SELECT a, b, c FROM example WHERE a = 1 AND b = 1;
>
> 0|0|0|SEARCH TABLE example USING INTEGER PRIMARY KEY (rowid=?)

An INTEGER PRIMARY KEY *is* a covering index, even though the EXPLAIN
QUERY PLAN output does not identify it as such.

>
> *example3 - using composite primary key & covering index*
>
> CREATE TABLE example (a INTEGER, b INTEGER, c INTEGER, PRIMARY KEY(a, b));
> CREATE INDEX idx_covering ON example( a, b, c );
> EXPLAIN QUERY PLAN SELECT a, b, c FROM example WHERE a = 1 AND b = 1;
>
> 0|0|0|SEARCH TABLE example USING INDEX sqlite_autoindex_example_1 (a=? AND
> b=?)

In this case, you have faked out the query planner.  The query planner
prefers to use the index created to implement the PRIMARY KEY since it
is a UNIQUE index that will only return a single row.  The covering
index, on the other hand, might return multiple rows as far as the
query planner knows.  The query planner fails to deduce that the first
two columns of the idx_covering index will be unique due to the
existence of the first index.

A better approach here would be to make the PRIMARY KEY a true PRIMARY
KEY, and not just an alias for a UNIQUE constraint, but adding WITHOUT
ROWID to the end of the CREATE TABLE statement.  The PRIMARY KEY on a
WITHOUT ROWID table is always a covering index (even though the
EXPLAIN QUERY PLAN output does not say so) so you will always end up
using a covering index.
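
A minimal sketch of that suggestion, reusing the example 3 schema from the
original post (untested):

CREATE TABLE example (a INTEGER, b INTEGER, c INTEGER,
                      PRIMARY KEY(a, b)) WITHOUT ROWID;
EXPLAIN QUERY PLAN SELECT a, b, c FROM example WHERE a = 1 AND b = 1;

With WITHOUT ROWID the PRIMARY KEY b-tree stores a, b and c together, so the
separate idx_covering index is no longer needed to cover this query.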

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] query planning - covering indices

2018-05-16 Thread Peter Johnson
I noticed that the query planner favours the primary index when a covering
index exists which can satisfy the same query.

My understanding is that covering indices are more efficient, as the table
itself does not need to be scanned when all the required columns exist in
the covering index?

Is it correct to say that example 1 is more efficient than the other two
examples, particularly when there are many columns, all of which are
covered by the covering index?

*example 1 - using covering index*

CREATE TABLE example (a INTEGER, b INTEGER, c INTEGER);
CREATE INDEX idx_covering ON example( a, b, c );
EXPLAIN QUERY PLAN SELECT a, b, c FROM example WHERE a = 1 AND b = 1;

0|0|0|SEARCH TABLE example USING COVERING INDEX idx_covering (a=? AND b=?)

*example2 - using primary key & covering index*

CREATE TABLE example (a INTEGER PRIMARY KEY, b INTEGER, c INTEGER);
CREATE INDEX idx_covering ON example( a, b, c );
EXPLAIN QUERY PLAN SELECT a, b, c FROM example WHERE a = 1 AND b = 1;

0|0|0|SEARCH TABLE example USING INTEGER PRIMARY KEY (rowid=?)

*example3 - using composite primary key & covering index*

CREATE TABLE example (a INTEGER, b INTEGER, c INTEGER, PRIMARY KEY(a, b));
CREATE INDEX idx_covering ON example( a, b, c );
EXPLAIN QUERY PLAN SELECT a, b, c FROM example WHERE a = 1 AND b = 1;

0|0|0|SEARCH TABLE example USING INDEX sqlite_autoindex_example_1 (a=? AND
b=?)

I also noticed that when using > or < instead of = that the covering index
is used instead of the primary index.

Could someone please help me to understand why it works like this?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Sqlite query to get the offset of an entry in the list

2018-04-28 Thread Balaji Ramanathan
SELECT COUNT(*) FROM TABLE WHERE NAME <= (SELECT NAME FROM TABLE WHERE ID =
3 ORDER BY NAME ASC)

Balaji Ramanathan
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Sqlite query to get the offset of an entry in the list.

2018-04-27 Thread x
How about



SELECT ID,NAME,

(SELECT COUNT(*) FROM TABLE WHERE NAME<=(SELECT NAME FROM TABLE WHERE ID=d.ID)) 
as Position

FROM TABLE d

ORDER BY ID;



sqlite> create table t(ID,name text);

sqlite> insert into t values (1,'AAA'),(2,'ZZZ'),(3,'BBB'),(4,'WWW'),(5,'CCC');

sqlite> select ID,name,(select count(*) from t where name<=(select name from t 
where ID=d.ID)) as Posn from t d

order by ID;

1|AAA|1

2|ZZZ|5

3|BBB|2

4|WWW|4

5|CCC|3

sqlite>




From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> on behalf of 
Hegde, Deepakakumar (D.) <deep...@allgosystems.com>
Sent: Friday, April 27, 2018 3:51:27 PM
To: sqlite-users@mailinglists.sqlite.org
Subject: [sqlite] Sqlite query to get the offset of an entry in the list.

Hi All,


We have a requirement where in offset of the primary key ID is needed as per 
the sorted list.


Table:


ID   NAME

1  AAA

2  ZZZ

3  BBB

4  WWW

5  CCC


Now need to get the offset of the ID 3 in the sorted list of the NAME.


SELECT * FROM TABLE ORDER BY NAME ASC


1   AAA

3   BBB

5   CCC

4   WWW

2   ZZZ


So position of ID 3 as per the sorted list of the NAME is 2.


currently we are getting the entry with the select statement and by comparing 
the ID externally we are getting the offset.


Is there any optimal way to get this information directly with one single query?


Thanks and Regards

Deepak

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Sqlite query to get the offset of an entry in the list.

2018-04-27 Thread Keith Medcalf

Again, requiring that both "id" and "name" are candidate keys.  In which case, 
since there has to be unique indexes to enforce that, one might use the more 
straightforward:

select count(*) from table where name <= (select name from table where id=?) 
order by name;


---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.


>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of x
>Sent: Friday, 27 April, 2018 09:43
>To: SQLite mailing list
>Subject: Re: [sqlite] Sqlite query to get the offset of an entry in
>the list.
>
>SELECT COUNT(*)+1 FROM TABLE WHERE NAME < (SELECT NAME FROM TABLE
>WHERE ID = 3)
>
>
>
>(I think)
>
>
>
>
>From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> on
>behalf of Hegde, Deepakakumar (D.) <deep...@allgosystems.com>
>Sent: Friday, April 27, 2018 3:51:27 PM
>To: sqlite-users@mailinglists.sqlite.org
>Subject: [sqlite] Sqlite query to get the offset of an entry in the
>list.
>
>Hi All,
>
>
>We have a requirement where in offset of the primary key ID is needed
>as per the sorted list.
>
>
>Table:
>
>
>ID   NAME
>
>1  AAA
>
>2  ZZZ
>
>3  BBB
>
>4  WWW
>
>5  CCC
>
>
>Now need to get the offset of the ID 3 in the sorted list of the
>NAME.
>
>
>SELECT * FROM TABLE ORDER BY NAME ASC
>
>
>1   AAA
>
>3   BBB
>
>5   CCC
>
>4   WWW
>
>2   ZZZ
>
>
>So position of ID 3 as per the sorted list of the NAME is 2.
>
>
>currently we are getting the entry with the select statement and by
>comparing the ID externally we are getting the offset.
>
>
>Is there any optimal way to get this information directly with one
>single query?
>
>
>Thanks and Regards
>
>Deepak
>
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Sqlite query to get the offset of an entry in the list.

2018-04-27 Thread Keith Medcalf

The constraint, obviously, being that "id" and "name" are each candidate keys 
...

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.


>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of Simon Davies
>Sent: Friday, 27 April, 2018 09:35
>To: SQLite mailing list
>Subject: Re: [sqlite] Sqlite query to get the offset of an entry in
>the list.
>
>On 27 April 2018 at 15:51, Hegde, Deepakakumar (D.)
><deep...@allgosystems.com> wrote:
>> Hi All,
>>
>> We have a requirement where in offset of the primary key ID is
>needed as per the sorted list.
>.
>.
>.
>> 1   AAA
>> 3   BBB
>> 5   CCC
>> 4   WWW
>> 2   ZZZ
>>
>> So position of ID 3 as per the sorted list of the NAME is 2.
>
>sqlite> create table t( id integer primary key, data text );
>sqlite> insert into t( data ) values('aaa'), ('zzz'), ('bbb'),
>('www'), ('ccc');
>sqlite>
>sqlite> select count(*)+1 from t where data<(select data from t where
>id=3);
>2
>
>> Thanks and Regards
>> Deepak
>
>Rgds,
>Simon
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Sqlite query to get the offset of an entry in the list.

2018-04-27 Thread Keith Medcalf

create table data (id integer primary key, name text);
insert into data (name) values ('AAA'), ('ZZZ'), ('BBB'), ('WWW'), ('CCC');

select * from data;
1|AAA
2|ZZZ
3|BBB
4|WWW
5|CCC

select * from data order by name;
1|AAA
3|BBB
5|CCC
4|WWW
2|ZZZ

create table temp.ranked as select * from data order by name;

select rowid, * from temp.ranked;
1|1|AAA
2|3|BBB
3|5|CCC
4|4|WWW
5|2|ZZZ

select rowid from temp.ranked where id = 3;
2

drop table temp.ranked;


There is likely a way to do this using a recursive CTE without a temp table, 
however, I cannot do that off the top of my mind immediately as there are too 
many possible constraints.  Someone else may have already thought about how to 
do that.  There are lots of solutions with various "constraints" and 
"assumptions" about the data though. This one happens to not require any such 
assumptions or constraints ... If you, for example, constrained the name column 
to be unique, then there exists a much simpler solution.

Whether or not the sequence:

begin immediate;
drop table if exists temp.ranked;
create table temp.ranked as select * from data order by name;
select rowid from temp.ranked where id = 3;
drop table if exists temp.ranked;
rollback;

constitutes a "single sql statement" depends on how you are interfacing with 
SQLite.  For me, it is a single statement returning a single row.  YMMV.


However, my question would be why you need to know the offset of the ID in the 
sorted set of results as that seems like a "navigational" problem rather than a 
"relational" problem?  Are the "requirements" perhaps geared to file or 
hierachical database rather than a relational database?  (It is quite common 
for "navigational" problems to be stated in requirements for "relational" 
implementations.  This is why over the years there have been many functions 
added to the "monster" relational engines -- because it is easier to add a 
"MakeCoffee()" function than it is to argue that a relational database should 
not have a "MakeCoffee" function.  The epitome of this is of course Oracle, 
which has a function for everything you could ever possibly want to do.)


---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of Hegde, Deepakakumar
>(D.)
>Sent: Friday, 27 April, 2018 08:51
>To: sqlite-users@mailinglists.sqlite.org
>Subject: [sqlite] Sqlite query to get the offset of an entry in the
>list.
>
>Hi All,
>
>
>We have a requirement where in offset of the primary key ID is needed
>as per the sorted list.
>
>
>Table:
>
>
>ID   NAME
>
>1  AAA
>
>2  ZZZ
>
>3  BBB
>
>4  WWW
>
>5  CCC
>
>
>Now need to get the offset of the ID 3 in the sorted list of the
>NAME.
>
>
>SELECT * FROM TABLE ORDER BY NAME ASC
>
>
>1   AAA
>
>3   BBB
>
>5   CCC
>
>4   WWW
>
>2   ZZZ
>
>
>So position of ID 3 as per the sorted list of the NAME is 2.
>
>
>currently we are getting the entry with the select statement and by
>comparing the ID externally we are getting the offset.
>
>
>Is there any optimal way to get this information directly with one
>single query?
>
>
>Thanks and Regards
>
>Deepak
>
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Sqlite query to get the offset of an entry in the list.

2018-04-27 Thread x
SELECT COUNT(*)+1 FROM TABLE WHERE NAME < (SELECT NAME FROM TABLE WHERE ID = 3)



(I think)




From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> on behalf of 
Hegde, Deepakakumar (D.) <deep...@allgosystems.com>
Sent: Friday, April 27, 2018 3:51:27 PM
To: sqlite-users@mailinglists.sqlite.org
Subject: [sqlite] Sqlite query to get the offset of an entry in the list.

Hi All,


We have a requirement where in offset of the primary key ID is needed as per 
the sorted list.


Table:


ID   NAME

1  AAA

2  ZZZ

3  BBB

4  WWW

5  CCC


Now need to get the offset of the ID 3 in the sorted list of the NAME.


SELECT * FROM TABLE ORDER BY NAME ASC


1   AAA

3   BBB

5   CCC

4   WWW

2   ZZZ


So position of ID 3 as per the sorted list of the NAME is 2.


currently we are getting the entry with the select statement and by comparing 
the ID externally we are getting the offset.


Is there any optimal way to get this information directly with one single query?


Thanks and Regards

Deepak

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Sqlite query to get the offset of an entry in the list.

2018-04-27 Thread Simon Davies
On 27 April 2018 at 15:51, Hegde, Deepakakumar (D.)
 wrote:
> Hi All,
>
> We have a requirement where in offset of the primary key ID is needed as per 
> the sorted list.
.
.
.
> 1   AAA
> 3   BBB
> 5   CCC
> 4   WWW
> 2   ZZZ
>
> So position of ID 3 as per the sorted list of the NAME is 2.

sqlite> create table t( id integer primary key, data text );
sqlite> insert into t( data ) values('aaa'), ('zzz'), ('bbb'), ('www'), ('ccc');
sqlite>
sqlite> select count(*)+1 from t where data<(select data from t where id=3);
2

> Thanks and Regards
> Deepak

Rgds,
Simon
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Sqlite query to get the offset of an entry in the list.

2018-04-27 Thread Hegde, Deepakakumar (D.)
Hi All,


We have a requirement where in offset of the primary key ID is needed as per 
the sorted list.


Table:


ID   NAME

1  AAA

2  ZZZ

3  BBB

4  WWW

5  CCC


Now need to get the offset of the ID 3 in the sorted list of the NAME.


SELECT * FROM TABLE ORDER BY NAME ASC


1   AAA

3   BBB

5   CCC

4   WWW

2   ZZZ


So position of ID 3 as per the sorted list of the NAME is 2.


currently we are getting the entry with the select statement and by comparing 
the ID externally we are getting the offset.


Is there any optimal way to get this information directly with one single query?


Thanks and Regards

Deepak

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query regarding CVE-2018-8740

2018-04-26 Thread Abroży Nieprzełoży
I think sqlite-autoconf-3230100 should be OK.
http://www.sqlite.org/2018/sqlite-autoconf-3230100.tar.gz

Always check download page for the newest version
http://www.sqlite.org/download.html


2018-04-27 1:11 GMT+02:00, salil GK:
> Hello
>
>We are using sqlite-autoconf-322 in our product. Recently there
> was a CVE released for sqlite - CVE-2018-8740 - for which patch is
> available in
> https://www.sqlite.org/cgi/src/vdiff?from=1774f1c3baf0bc3d=d75e67654aa9620b.
> But this patch is for sqlite code I suppose. The patch mentioned above
> is not applicable for sqlite-autoconf.
>Is there any new version of sqlite-autoconf available which is
> compliant to CVE-2018-8740
>
> Thanks in advance
> ~S
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query regarding CVE-2018-8740

2018-04-26 Thread salil GK
Hello

   We are using sqlite-autoconf-322 in our product. Recently there
was a CVE released for sqlite - CVE-2018-8740 - for which patch is
available in 
https://www.sqlite.org/cgi/src/vdiff?from=1774f1c3baf0bc3d=d75e67654aa9620b.
But this patch is for sqlite code I suppose. The patch mentioned above
is not applicable for sqlite-autoconf.
   Is there any new version of sqlite-autoconf available which is
compliant to CVE-2018-8740

Thanks in advance
~S
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query speed Regression: 3.15 much faster than 3.22 (Was: How to optimise a somewhat-recursive query? )

2018-03-18 Thread Richard Hipp
On 3/18/18, Jonathan Moules  wrote:
>
> I can provide a small replication database if desired.

Please do.  Send it as an attachment directly to me.  Also please send
the exact text of the query that is running slowly.


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query speed Regression: 3.15 much faster than 3.22 (Was: How to optimise a somewhat-recursive query? )

2018-03-18 Thread Jonathan Moules

Hi List,
So, I thought I'd solved my little problem, but upon testing it in my 
application it subjectively didn't seem any faster.


I upgraded the SQLite in my IDE to 3.22, and it is confirming my 
suspicions. It turns out that when I run the below in 3.22, it takes 
about 0.150s. But in 3.15 it was taking 0.004s!


The original query also takes 0.15s in 3.22 - so that has been mildly 
optimised (from ~0.2s). My general thinking-aloud notion is that my 
"fix" is getting optimised away in 3.22.


I can provide a small replication database if desired.

Thanks,
Jonathan

On 2018-03-19 00:24, Jonathan Moules wrote:

Thanks Simon and Quan.
I'm not sure it's the view itself per-se - It takes 0.000s (time too 
small to measure) for just the full View to be run on this dataset.


It turns out the problem is simpler than that and no data changes are 
needed. I did consider Quan Yong Zhai's option and gave it a try, but 
couldn't fathom out the necessary SQL to get what I wanted (it's 
getting late).


Instead prompted by the replies here, I've changed the query very 
slightly to the below which solves the problem:


SELECT
u.url, l.error_code
FROM
urls u
LEFT JOIN
lookups l
USING(url_id)
LEFT JOIN
(select * from v_most_recent_lookup_per_url where url_id in (
select url_id from urls where url = 'example.com'
)) recent
-- By definition url's can
-- This is the important bit
-- Here we use the most recent lookup url_id to link to the 
source_seed_id, so we only find its children

-- Or alternatively itself
ON u.source_seed_id = recent.url_id
OR u.url_id = recent.url_id
WHERE
-- For JSON-spider at least, Generic's are guaranteed to be 
generic pages.

l.is_generic_flag = 1
AND
-- Must be "or equal to" so we can get the lookup of the very 
base url.

l.retrieval_datetime >= recent.retrieval_datetime
AND
DATETIME(recent.retrieval_datetime) > DATETIME('now', '-5 days')
ORDER BY
u.url_id DESC
LIMIT 1;


To save readers having to compare manually, the difference is this: I 
turned the "recent" alias item from the View into a subquery (still 
using the view), and then moved the "where url = example.com" part 
into it.
The query is now literally two orders of magnitude faster, from 0.2s 
to 0.004s. No new indexes or anything, just that change.


Hopefully this will scale to full datasets; if it doesn't I may have 
to consider the other suggestions, but for now this is a 
minimum-effort solution.


I'm not actually sure what SQLite was doing in the previous query to 
make it take so long, so I imagine there was some hideous recursing 
going on or something.


Scope for optimisation?

Thanks again,
Jonathan

On 2018-03-18 23:37, Simon Slavin wrote:
On 18 Mar 2018, at 11:13pm, Jonathan Moules 
 wrote:


Given there's such a small amount of data at this point, I suspect 
the issue is more related to the recursion. I've tried creating 
these two indexes to facilitate that

Nice idea but I can see why it's not working.

You have an underlying problem: the format you're using to store your 
data makes it extremely difficult to extract the figures you want.  
Quan Yong Zhai has the best idea I've seen: get rid of almost ¾ of 
the work you're doing by storing the last retrieval date in your 
"urls" table.


As an alternative, start your retrieval process by JOINing 
the two tables together.  Consider the result of this query


SELECT urls.url_id, lookups.error_code
 FROM urls
 JOIN lookups ON lookups.url_id = urls.url_id
   AND lookups.retrieval_datetime = urls.retrieval_datetime
 WHERE DATETIME(urls.retrieval_datetime) > DATETIME('now', '-5 days')


and figure out what you would add to that to get your desired result.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users







___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query optimizer and recursive common table expressions

2018-01-03 Thread Cezary H. Noweta

Hello,

On 2018-01-04 01:53, R Smith wrote:
Not to mention that if you wait several years, depending on your 
processor/compiler, the integer 64 value might wrap around and x<=3 
might become true once more, producing rows again  :)
Unfortunately, it will be stuck when int becomes double (at 
9223372036854775808 -- still much time :-).


``We are programmers and responsible for programming. -O99 is 
responsible for thinking. Who in the hell implemented -O when there had 
not been -O?'' :-)


-- best regards

Cezary H. Noweta
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query optimizer and recursive common table expressions

2018-01-03 Thread R Smith

On 2018/01/04 12:36 AM, Richard Hipp wrote:

On 1/3/18, Shane Dev  wrote:

sqlite> with recursive cnt(x) as (select 1 union all select x+1 from cnt)
select * from cnt where x <= 3;
[no sqlite> prompt, CPU utilization 25%]

I assume sqlite is recursively adding rows to the queue without considering
that the subsequent SELECT only needs the first 3 of them.

No, it is continuing to search for rows for which x<=3.  The query
planner does not know enough algebra to figure out that that will
never happen again after the first three rows.


Not to mention that if you wait several years, depending on your 
processor/compiler, the integer 64 value might wrap around and x<=3 
might become true once more, producing rows again  :)



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query optimizer and recursive common table expressions

2018-01-03 Thread Richard Hipp
On 1/3/18, Shane Dev  wrote:
>
> sqlite> with recursive cnt(x) as (select 1 union all select x+1 from cnt)
> select * from cnt where x <= 3;
> [no sqlite> prompt, CPU utilization 25%]
>
> I assume sqlite is recursively adding rows to the queue without considering
> that the subsequent SELECT only needs the first 3 of them.

No, it is continuing to search for rows for which x<=3.  The query
planner does not know enough algebra to figure out that that will
never happen again after the first three rows.
-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query optimizer and recursive common table expressions

2018-01-03 Thread Shane Dev
I have just spotted a couple of typos in my email below. The first two
common table expressions should have been as follows -

with recursive cnt(x) as (select 1 union all select x+1 from cnt limit 3)
select * from cnt;
with recursive cnt(x) as (select 1 union all select x+1 from cnt) select *
from cnt limit 3;

On 3 January 2018 at 23:24, Shane Dev  wrote:

> Hi,
>
> This simple recursive common table expression returns all integers from 1
> to 3 as expected -
>
> sqlite> with recursive cnt(x) as (select 1 union all select x+1 from cnt
> limit 3) select * from cnt where x;
> x
> 1
> 2
> 3
> sqlite>
>
> If the LIMIT constraint is moved from the compound SELECT to the
> subsequent SELECT, it works the same -
>
> sqlite> with recursive cnt(x) as (select 1 union all select x+1 from cnt)
> select * from cnt where x limit 3;
> x
> 1
> 2
> 3
> sqlite>
>
> If the LIMIT constraint is replaced with a WHERE constraint in the
> compound SELECT, it still works the same -
>
> sqlite> with recursive cnt(x) as (select 1 union all select x+1 from cnt
> where x < 3) select * from cnt;
> x
> 1
> 2
> 3
> sqlite>
>
> However if the WHERE constraint is moved from the compound SELECT to the
> subsequent SELECT and adjusted slightly, it selects correct results but
> then hangs indefinitely -
>
> sqlite> with recursive cnt(x) as (select 1 union all select x+1 from cnt)
> select * from cnt where x <= 3;
> x
> 1
> 2
> 3
> [no sqlite> prompt, CPU utilization 25%]
>
> I assume sqlite is recursively adding rows to the queue without
> considering that the subsequent SELECT only needs the first 3 of them.
>
> Can we conclude the query planner is unable to optimize the compound
> SELECT (the part in brackets) based on the WHERE constraint of the
> subsequent SELECT statement?
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query optimizer and recursive common table expressions

2018-01-03 Thread Shane Dev
Hi,

This simple recursive common table expression returns all integers from 1
to 3 as expected -

sqlite> with recursive cnt(x) as (select 1 union all select x+1 from cnt
limit 3) select * from cnt where x;
x
1
2
3
sqlite>

If the LIMIT constraint is moved from the compound SELECT to the subsequent
SELECT, it works the same -

sqlite> with recursive cnt(x) as (select 1 union all select x+1 from cnt)
select * from cnt where x limit 3;
x
1
2
3
sqlite>

If the LIMIT constraint is replaced with a WHERE constraint in the compound
SELECT, it still works the same -

sqlite> with recursive cnt(x) as (select 1 union all select x+1 from cnt
where x < 3) select * from cnt;
x
1
2
3
sqlite>

However if the WHERE constraint is moved from the compound SELECT to the
subsequent SELECT and adjusted slightly, it selects correct results but
then hangs indefinitely -

sqlite> with recursive cnt(x) as (select 1 union all select x+1 from cnt)
select * from cnt where x <= 3;
x
1
2
3
[no sqlite> prompt, CPU utilization 25%]

I assume sqlite is recursively adding rows to the queue without considering
that the subsequent SELECT only needs the first 3 of them.

Can we conclude the query planner is unable to optimize the compound
SELECT (the part in brackets) based on the WHERE constraint of the
subsequent SELECT statement?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query on bug fix for ticket 7ffd1ca1d2ad4ec

2017-07-13 Thread mohan_gn
Hi Simon,

Thank you for your suggestion. Will check it out.

Regards,
Mohan



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Query-on-bug-fix-for-ticket-7ffd1ca1d2ad4ec-tp96175p96608.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query on bug fix for ticket 7ffd1ca1d2ad4ec

2017-07-10 Thread Simon Slavin


On 10 Jul 2017, at 9:17am, mohan_gn  wrote:

> Since our product is already in the field with an older
> version of SQLite, we want to know this bug's impact. We use simple queries
> like 'select * from abc where col1=123 and col2=456'. We don't use any
> JOINs, VIEWs, or nested queries. The tables we use do have primary keys and
> foreign keys.
> So I wanted to know whether the bug applies in our case and whether we
> may be affected.

You do not say which version of SQLite your product uses.  If it’s earlier than 
3.7.7 or later than 3.16.2 then it is not affected.

Temporary indexes are made when there is no index ideally suited to your ORDER 
BY and WHERE clauses.  If your queries execute quickly because you have made 
the right indexes then this bug will not affect you.  See section 12.0 of



You can execute each of your SELECT/INSERT/UPDATE/DELETE commands with EXPLAIN 
QUERY PLAN in front of it, and check the output to see if it mentions a 
temporary index:
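
For the query shape mentioned earlier in this thread, a suitable index is one
whose columns match the WHERE clause. A rough sketch (the index name is only
illustrative, and the exact plan wording varies between SQLite versions):

create index if not exists idx_abc_col1_col2 on abc(col1, col2);

explain query plan select * from abc where col1=123 and col2=456;
-- without a suitable index the plan reports a full scan, e.g.
--   SCAN TABLE abc
-- with the index above it reports something like
--   SEARCH TABLE abc USING INDEX idx_abc_col1_col2 (col1=? AND col2=?)
-- plan lines mentioning an AUTOMATIC COVERING INDEX or USE TEMP B-TREE are the
-- temporary structures referred to above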



Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query on bug fix for ticket 7ffd1ca1d2ad4ec

2017-07-10 Thread mohan_gn
Thank you for the reply. 
I know it has been fixed. Since our product is already in the field with an older
version of SQLite, we want to know this bug's impact. We use simple queries
like 'select * from abc where col1=123 and col2=456'. We don't use any
JOINs, VIEWs, or nested queries. The tables we use do have primary keys and
foreign keys.
So I wanted to know whether the bug applies in our case and whether we
may be affected.

Thank you,
Mohan




--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Query-on-bug-fix-for-ticket-7ffd1ca1d2ad4ec-tp96175p96577.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query on bug fix for ticket 7ffd1ca1d2ad4ec

2017-07-10 Thread Simon Slavin


On 10 Jul 2017, at 7:15am, mohan_gn  wrote:

> Could anyone please answer above query? 

Do you mean this bug ?



If so, the bug was fixed in January.  Please download the latest version of 
SQLite and see whether you can reproduce the bug.
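
A quick way to confirm which library version an application is actually running
against (either form works, depending on the interface in use):

select sqlite_version();                      -- from SQL
-- or, from C: printf("%s\n", sqlite3_libversion());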

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query on bug fix for ticket 7ffd1ca1d2ad4ec

2017-07-10 Thread mohan_gn
Hi All,

Could anyone please answer above query? 

Thank you very much,
Mohan



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Query-on-bug-fix-for-ticket-7ffd1ca1d2ad4ec-tp96175p96575.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Query on bug fix for ticket 7ffd1ca1d2ad4ec

2017-06-13 Thread Mohandas G Nayak
Hi All,

We are using SQLite 3.7.16.1 in our product, which has been released to market
some time back.
We have been analysing the releases of SQLite from 3.7.16 to 3.19.2 to see if
we need to upgrade SQLite for the software that is already deployed in the
field. For this purpose we use the release history of SQLite, go through the
bug fixes, and check whether we need to upgrade depending on each fix.
While doing this analysis we came across this fix: "Use the correct affinity
for columns of automatic indexes. Ticket 7ffd1ca1d2ad4ec
(https://www.sqlite.org/src/info/7ffd1ca1d2ad4ec).", which went into 3.17.0.
The description of the fix is not detailed, so we don't know whether we need
to upgrade our SQLite version.
We use simple queries like 'select * from abc where col1=123 and col2=456'. We
don't use any JOINs, VIEWs, or nested queries. The tables we use do have
primary keys and foreign keys. Please let us know in which cases the bug would
have an impact and whether we may be affected.

Thank you,
Mohan
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Query plan gone haywire lays waste to my library's performance

2017-04-28 Thread Keith Medcalf
On Friday, 28 April, 2017 15:55, Simon Slavin  wrote:

> > The only difference is the explicit JOIN statement. I was under the
> > impression that using this, vs. the way I wrote it, is a matter of taste
> > that doesn’t affect the execution of the query.
 
> SQLite computes two-table searches using nested loops.  Providing the JOIN
> you want tells SQLite which order you think the loops should be nested in.
> Expressing the searches with JOINs also helps me figure out what SQLite
> is actually doing.  Before I did that it wasn’t obvious to me how little
> the table "docs" mattered for this query.

NO IT DOES NOT.

FROM <table1> JOIN <table2> ON <condition>

IS EXACTLY THE SAME AS

FROM <table1>, <table2> WHERE <condition>

THERE IS NO DIFFERENCE WHATSOEVER.  EVER.  PERIOD.

It is also exactly the same as 

FROM <table1> JOIN <table2> WHERE <condition>
FROM <table2> JOIN <table1> WHERE <condition>
FROM <table2> JOIN <table1> ON <condition>
FROM <table2>, <table1> WHERE <condition>

The *ONLY TIME EVER* that "ON <condition>" is not IDENTICAL to leaving out the 
JOIN keyword and putting the <condition> in the WHERE clause is the case of an 
OUTER (LEFT) JOIN.  In all other cases they are just different spellings of 
exactly the same thing.  The other somewhat limited exception is the CROSS JOIN, 
which specifies the visitation order where the LHS table is the outer loop and 
the RHS table is the inner loop; however, in that case the ON clause is 
identical to a WHERE clause and is nothing more than an alternate spelling (just 
like we spell colour properly here, but the Yanks spell it color).
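
A quick way to see the equivalence for yourself is to compare the query plans;
for an inner join the different spellings should produce identical plans
(throwaway tables, names are illustrative only):

CREATE TABLE t1(a INTEGER PRIMARY KEY, b);
CREATE TABLE t2(a INTEGER PRIMARY KEY, c);

EXPLAIN QUERY PLAN SELECT * FROM t1 JOIN t2 ON t1.a = t2.a WHERE t1.b = 5;
EXPLAIN QUERY PLAN SELECT * FROM t1, t2 WHERE t1.a = t2.a AND t1.b = 5;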





___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

