Re: [sqlite] Query planner: Scanning subqueries vs using automatic covering index

2019-03-18 Thread niklas
Thanks to all who have replied, very informative! :)

This is just a database for my own personal use, so it's not a big deal in
any way; I'm mainly trying to get a better understanding of how SQLite
works here.

I'll note that the SQL queries are not static inside my application; they
are generated dynamically from command line arguments.
Basically any column can be added to the SELECT, WHERE and ORDER BY clauses
at runtime, and some columns are generated from sub-queries (via JOINs) as
shown in the first post.

As a correlated sub-query will be executed twice if it is also used in the
WHERE clause, it seems that using JOINs is preferable for my use cases.
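
For concreteness, the two shapes I am comparing look roughly like this
(a sketch using the DatesRead table from the schema in the first post;
the LIKE filter is just a made-up example):

-- correlated sub-query: if I also want to filter on it, the same
-- sub-select ends up both in the SELECT list and in the WHERE clause
SELECT b.BookID, b.Title,
       (SELECT group_concat("Date Read", ', ')
          FROM DatesRead d WHERE d.BookID = b.BookID) AS "Date(s)"
  FROM Books b
 WHERE (SELECT group_concat("Date Read", ', ')
          FROM DatesRead d WHERE d.BookID = b.BookID) LIKE '2010%';

-- JOIN form: the grouped sub-query appears once and its column can be
-- filtered directly
SELECT b.BookID, b.Title, "Date(s)"
  FROM Books b
  JOIN (SELECT BookID, group_concat("Date Read", ', ') AS "Date(s)"
          FROM DatesRead GROUP BY BookID) USING (BookID)
 WHERE "Date(s)" LIKE '2010%';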

Some further experimentation shows that using LEFT JOIN for the sub-queries
instead of INNER JOIN always makes the query planner do the "right" thing,
i.e. use automatic indexes instead of table scans, regardless of whether
ANALYZE information is present or not.

So that is maybe a better work-around than removing the ANALYZE tables.
LEFT JOIN and INNER JOIN will always return the same results in this case,
as all books will (or should) have dates, genres and authors, and if they
do not I probably want LEFT JOIN semantics anyway so that I notice it. I
currently use INNER JOIN only to give the query planner more freedom in
selecting query plans.
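
For example, the DatesRead sub-query from the first post becomes the
following (the other two sub-queries are changed the same way; as noted
above, the planner then builds automatic indexes instead of scanning):

SELECT BookID, Title, "Date(s)"
  FROM Books
  LEFT JOIN (SELECT BookID, group_concat("Date Read", ', ') AS "Date(s)"
               FROM Books JOIN DatesRead USING (BookID)
               GROUP BY BookID) USING (BookID)
 ORDER BY BookID;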

It would still be interesting to know why SQLite went for plain table scans
in the initial case. It seems that using automatic indexes should always be
faster (N*logN vs N*N; with N around 3000 that is roughly 35 thousand
operations versus 9 million per nesting level), so why not always use them
when possible? According to the documentation SQLite assumes N is a million
when no ANALYZE information is present, and in that case it opted to
generate automatic indexes. In my case, with ANALYZE information present,
N is around 3000, and then it opted for table scans. The final query took
over 24 minutes to run to completion with all 3 sub-query columns present,
though, so that was obviously the wrong choice, since the loops ended up
being nested three or more times.

(I understand that the query planner must take many different scenarios into
account and generate good plans for all of them, so this is most likely an
unfortunate edge case.)

I also noticed another case where SQLite uses nested table scans for JOINs.
This time it was not instead of automatic indexes; apparently it opted for
the nested scans to avoid using a temp b-tree for the ORDER BY. (This is
part of a co-routine for a window function that uses AuthorID for its
partitions.)
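
The real query is too large to include in full, but a rough, made-up sketch
of the window-function part (not the actual statement) looks something like:

SELECT "Last Name", "First Name",
       count(*) OVER (PARTITION BY AuthorID) AS "Books read"
  FROM Authors
  JOIN AuthorBooks USING (AuthorID)
  JOIN DatesRead   USING (BookID)
 ORDER BY "Last Name", "First Name";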

Slow nested table scan (execution time measured in seconds):

|  |  |--SCAN TABLE Authors
|  |  |--SCAN TABLE DatesRead
|  |  |--SEARCH TABLE AuthorBooks USING PRIMARY KEY (BookID=? AND
AuthorID=?)
|  |  |--SEARCH SUBQUERY 1 ...
|  |  `--SEARCH SUBQUERY 2 ...

vs temp b-tree (execution time measured in milliseconds):

|  |  |--SCAN TABLE AuthorBooks
|  |  |--SEARCH TABLE Authors USING INTEGER PRIMARY KEY (rowid=?)
|  |  |--SEARCH TABLE DatesRead USING PRIMARY KEY (BookID=?)
|  |  |--SEARCH SUBQUERY 1 ...
|  |  |--SEARCH SUBQUERY 2 ...
|  |  `--USE TEMP B-TREE FOR ORDER BY

This is part of a larger query and I have not included all the tables used
in it; I only show the parts of the two plans that differ, everything else
in the two queries is identical.
Dropping the ANALYZE information makes the query planner select the faster
alternative here as well.
I can provide more information about the query in case anyone is
interested; I have only included these parts to illustrate the
"problematic" nested scans.

(Not really that problematic; this just came up in a test that iterated
over all supported columns for all main queries. In actual use of the
application I would hardly ever run this combination, but it is still an
interesting case I think.)






Re: [sqlite] Query planner: Scanning subqueries vs using automatic covering index

2019-03-17 Thread niklas
I agree that correlated subqueries in general seem more natural and are
probably also less likely to hit the performance pessimizations I noticed
with the joins.

But I might also want to use the column, or in the case of a correlated
subquery the column alias, in the WHERE clause, and as I recall that has
not been supported previously. Maybe it is also not allowed by the SQL
specification itself?

I modified the suggested query to use the column alias in the WHERE clause,
though, and it seemed to work!

Is this a recent change in SQLite, or have I misunderstood something? At
least the SQLite documentation still does not seem to say that column
aliases can be used in the WHERE clause.
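
Concretely, what I tried looks roughly like this (a sketch along the lines
of the suggested query, using the schema from my other thread; 'Fantasy' is
just a made-up filter value):

SELECT b.BookID, b.Title,
       (SELECT group_concat(Genre, ', ')
          FROM BookGenres JOIN Genres USING (GenreID)
         WHERE BookGenres.BookID = b.BookID) AS "Genre(s)"
  FROM Books b
 WHERE "Genre(s)" LIKE '%Fantasy%';  -- referring to the alias here is the
                                     -- part that surprised me by working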





Re: [sqlite] Query planner: Scanning subqueries vs using automatic covering index

2019-03-16 Thread niklas
The data used for sqlite_stat1 in create.txt is taken from the real
database; it is copied from the SQL dump generated just after running
ANALYZE.

I only wanted to include the minimum amount of data needed to demonstrate
the issue, so I omitted all other tables, views and data. As I understand
it, SQLite only checks the sqlite_statN table data when planning the
queries, not the actual data in the real tables.

I have dropped the sqlite_stat1 table for now, which disables the
statistics-based planning. This way I avoid the slow queries. Running
ANALYZE on the database again immediately makes them return, though.
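
In other words, roughly:

SELECT * FROM sqlite_stat1;   -- inspect the collected statistics
DELETE FROM sqlite_stat1;     -- or drop the table entirely, as I did
-- ... later ...
ANALYZE;                      -- recreates the statistics (and the slow plans)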





[sqlite] Query planner: Scanning subqueries vs using automatic covering index

2019-03-15 Thread niklas
I recently noticed some very slow queries in my SQLite book database.
It turns out this happened because the query planner decided to scan
sub-queries instead of using an automatic covering index to search them.

The database contains about 3000 entries. With two sub-queries the query
took a noticeable time to run; with three sub-queries I had to give up and
abort to save the CPU from overheating.

Using indexes, it only takes milliseconds in all cases.

Dropping the ANALYZE (sqlite_stat1) information from the database made
the problem go away, i.e. made SQLite use indexes instead of scanning.
So did adding a few more columns to the ORDER BY clause.

The SQL statements below should be enough to demonstrate the problem.
I am using SQLite 3.25.1.

**
create.txt
**

BEGIN;

CREATE TABLE "Authors" (
"AuthorID" INTEGER PRIMARY KEY,
"Last Name" TEXT NOT NULL,
"First Name" TEXT NOT NULL);

CREATE TABLE "Books" (
"BookID" INTEGER PRIMARY KEY,
"Title" TEXT NOT NULL);
 
CREATE TABLE "Genres"(
"GenreID" INTEGER PRIMARY KEY,
"Genre" TEXT UNIQUE NOT NULL);

CREATE TABLE "DatesRead"(
"BookID" INTEGER,
"Date Read" TEXT,
PRIMARY KEY("BookID", "Date Read"),
FOREIGN KEY(BookID) REFERENCES Books(BookID)) WITHOUT ROWID;

CREATE TABLE "AuthorBooks"(
"BookID" INTEGER,
"AuthorID" INTEGER,
PRIMARY KEY("BookID", "AuthorID" ),
FOREIGN KEY(BookID) REFERENCES Books(BookID),
FOREIGN KEY(AuthorID) REFERENCES Authors(AuthorID)) WITHOUT ROWID;

CREATE TABLE "BookGenres"(
"BookID" INTEGER,
"GenreID" INTEGER,  
PRIMARY KEY("BookID", "GenreID" ),
FOREIGN KEY(BookID) REFERENCES Books(BookID),
FOREIGN KEY(GenreID) REFERENCES Genres(GenreID)) WITHOUT ROWID;

ANALYZE;
INSERT INTO sqlite_stat1 VALUES('BookGenres','BookGenres','3190 2 1');
INSERT INTO sqlite_stat1 VALUES('DatesRead','DatesRead','3047 2 1');
INSERT INTO sqlite_stat1 VALUES('AuthorBooks','AuthorBooks','3549 2 1');
INSERT INTO sqlite_stat1 VALUES('Authors',NULL,'1329');
INSERT INTO sqlite_stat1 VALUES('Books',NULL,'2978');
INSERT INTO sqlite_stat1 VALUES('Genres','sqlite_autoindex_Genres_1','112 1');

COMMIT;


queries.txt


select "";
select "order by bookid only";
select "";

EXPLAIN QUERY PLAN
SELECT BookID, Title, "Author(s)", "Date(s)", "Genre(s)"
FROM Books
JOIN (SELECT BookID, group_concat(ltrim("First Name"||' '||"Last Name"), ', ') AS 'Author(s)'
      FROM Books JOIN AuthorBooks USING(BookID) JOIN Authors USING(AuthorID)
      GROUP BY BookID) USING(BookID)
JOIN (SELECT BookID, group_concat("Date read", ', ') AS 'Date(s)'
      FROM Books JOIN DatesRead USING(BookID)
      GROUP BY BookID) USING(BookID)
JOIN (SELECT BookID, group_concat(Genre, ', ') AS 'Genre(s)'
      FROM Books JOIN BookGenres USING(BookID) JOIN Genres USING(GenreID)
      GROUP BY BookID) USING(BookID)
ORDER BY BookID;

select "---";
select "order by bookid and some other fields too";
select "--";

EXPLAIN QUERY PLAN
SELECT BookID, Title, "Author(s)", "Date(s)", "Genre(s)"
FROM Books
JOIN (SELECT BookID, group_concat(ltrim("First Name"||' '||"Last Name"), ', ') AS 'Author(s)'
      FROM Books JOIN AuthorBooks USING(BookID) JOIN Authors USING(AuthorID)
      GROUP BY BookID) USING(BookID)
JOIN (SELECT BookID, group_concat("Date read", ', ') AS 'Date(s)'
      FROM Books JOIN DatesRead USING(BookID)
      GROUP BY BookID) USING(BookID)
JOIN (SELECT BookID, group_concat(Genre, ', ') AS 'Genre(s)'
      FROM Books JOIN BookGenres USING(BookID) JOIN Genres USING(GenreID)
      GROUP BY BookID) USING(BookID)
ORDER BY BookID, Title, "Author(s)", "Date(s)";



Test run:


del test.db
type create.txt | sqlite3 test.db
type queries.txt | sqlite3 test.db



Output:



order by bookid only

QUERY PLAN
|--MATERIALIZE 1
|  |--SCAN TABLE Books
|  |--SEARCH TABLE AuthorBooks USING PRIMARY KEY (BookID=?)
|  `--SEARCH TABLE Authors USING INTEGER PRIMARY KEY (rowid=?)
|--MATERIALIZE 2
|  |--SCAN TABLE Books
|  `--SEARCH TABLE DatesRead USING PRIMARY KEY (BookID=?)
|--MATERIALIZE 3
|  |--SCAN TABLE Books
|  |--SEARCH TABLE BookGenres USING PRIMARY KEY (BookID=?)
|  `--SEARCH TABLE Genres USING INTEGER PRIMARY KEY (rowid=?)
|--SCAN SUBQUERY 1
|--SCAN SUBQUERY 2
|--SCAN SUBQUERY 3
|--SEARCH TABLE Books USING INTEGER PRIMARY KEY (rowid=?)
`--USE TEMP B-TREE FOR ORDER BY
---
order by bookid and some other fields too
--
QUERY PLAN
|--MATERIALIZE 1
|  |--SCAN TABLE Books
|  |--SEARCH TABLE AuthorBooks USING PRIMARY KEY (BookID=?)
|  `--SEARCH TABLE Authors USING INTEGER PRIMARY KEY (rowid=?)
|--MATERIALIZE 2
|  |--SCAN TABLE Books
|  `--SEARCH TABLE DatesRead USING PRIMARY KEY (BookID=?)
|--MATERIALIZE 3
|  |--SCAN TABLE Books
|  |--SEARCH TABLE BookGenres USING PRIMARY KEY (BookID=?)
|  `--SEARCH TABLE Genres USING INTEGER 

Re: [sqlite] Maximum result set size

2019-03-12 Thread niklas
A clarification about sqlite3_exec, please.

Surely that works the same way as sqlite3_step with respect to memory use,
since it uses sqlite3_step internally and just forwards each result row to
the callback function given to sqlite3_exec?

Or did I miss something?





Re: [sqlite] Troubles with sqlite sql

2017-01-29 Thread David Niklas
I'm going to reply to both messages at once.

On  Sat, 28 Jan 2017 15:44:07 -0500
Richard Hipp  wrote:
> On 1/28/17, David Niklas  wrote:
> >
> > # ALTER TABLE processors ADD CONSTRAINT bit NOT NULL
> > Error: near "CONSTRAINT": syntax error
> >
> > # ALTER TABLE processors DROP bit;
> > Error: near "DROP": syntax error  
> 
> 
> ALTER TABLE is one of the few areas where SQLite's SQL support is
> thin.  SQLite only supports ALTER TABLE ADD COLUMN and ALTER TABLE
> RENAME TABLE.

Ah.

> >
> > I expected that both of the above would add the constraint.
> >
> > # SELECT boards.*,processors.*,storage.emmc FROM processors INNER  
> >> JOIN storage ON processors.board = storage.board;  
> > Error: no such table: boards
> >  
> 
> All tables used by the query must be named in the FROM clause.  What
> database engine are you used to that does not require this?
I used PostgreSQL, but I learned SQL from wikibooks*.

Namely this syntax:

SELECT [ALL | DISTINCT] <expression> [[AS] <column alias>]
       [, [ALL | DISTINCT] <expression> [[AS] <column alias>]]*
  FROM <table> [[AS] <table alias>]
       | [[FULL | LEFT | RIGHT] OUTER | INNER] JOIN <table> ON <join condition>
       [, <table> [[AS] <table alias>]
          | [[FULL | LEFT | RIGHT] OUTER | INNER] JOIN <table> ON <join condition>]*


On Sat, 28 Jan 2017 20:44:30 +
Simon Slavin  wrote:
> On 28 Jan 2017, at 7:27pm, David Niklas  wrote:
> 
> > # ALTER TABLE processors ADD CONSTRAINT bit NOT NULL
> > Error: near "CONSTRAINT": syntax error
> > 
> > # ALTER TABLE processors DROP bit;
> > Error: near "DROP": syntax error
> > 
> > I expected that both of the above would add the constraint.  
> 
> Neither adding nor deleting a constraint are supported by ALTER in
> SQLite.  If you didn’t create the constraint when you originally made
> the table you can’t do it.
> 
> Generally, to get this effect in SQLite, rename the original table,
> create a new table with the constraint, and copy the data across using
> something like
> 
> INSERT INTO processors (SELECT * FROM processors_old)
Thanks!
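
Spelled out with a made-up column list (I have not pasted the real
processors schema here), I take it that pattern would be something like:

ALTER TABLE processors RENAME TO processors_old;

CREATE TABLE processors(
    board varchar(30) PRIMARY KEY NOT NULL,
    bit SMALLINT NOT NULL   -- the column that should now be NOT NULL,
                            -- plus the remaining columns in the same order
);

INSERT INTO processors SELECT * FROM processors_old;
DROP TABLE processors_old;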

> > # SELECT boards.*,processors.*,storage.emmc FROM processors INNER   
> >> JOIN storage ON processors.board = storage.board;  
> > Error: no such table: boards
> > 
> > # .schema
> > CREATE TABLE boards(
> > board varchar(30) PRIMARY KEY NOT NULL,
> > price SMALLINT NOT NULL,
> > vendor varchar(27) NOT NULL,
> > oses varchar(32));
> > ...
> > 
> > I expected that the table boards would exist.  
> 
> You didn’t mention the table 'boards' in the list of tables in the
> JOIN.  It’s neither the main column of the SELECT (processors) nor the
> table you’ve joined to it (storage).
> 
> > I also can't figure out how to join 3 tables, each on the board
> > column.  
> 
> My initial thought is that you would have JOIN clauses in the same
> SELECT.
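
So, assuming each of the three tables really does have a board column,
something like this, chaining the joins in one SELECT?

SELECT boards.*, processors.*, storage.emmc
  FROM boards
  JOIN processors ON processors.board = boards.board
  JOIN storage    ON storage.board    = boards.board;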
> 
> > # SELECT processors.*,storage.* FROM processors INNER JOIN storage
> > ON   
> >> processors.board = storage.board WHERE processors.board LIKE
> >> "%Rose%";  
> > Roseapple Pi Actions S500 4x A9 @ 1.6GHz PowerVR SGX544 256MB0
> > Roseapple Pi 4GB eMMC 0 0 0 0
> > 
> > I wanted the board section of both tables to be joined and the one of
> > them displayed (they have to be identical to join right?), not to
> > be repeated, one after the other.  
> 
> Not sure I’ve grasped what you want here, but you might want to JOIN
> with the UNION of two tables.  Or for your SELECT to to a UNION of two
> tables with JOINs.

I've not heard of UNION (it seems that the wikibooks article is
incomplete*). I'll have to look that up.
I expected this:
Roseapple Pi Actions S500 4x A9 @ 1.6GHz PowerVR SGX544 256MB0
4GB eMMC 0 0 0 0
Instead of:
Roseapple Pi Actions S500 4x A9 @ 1.6GHz PowerVR SGX544 256MB0
Roseapple Pi 4GB eMMC 0 0 0 0

* See:
https://en.wikibooks.org/wiki/Structured_Query_Language/Classic_Track_Print

Thank you,
David


[sqlite] Troubles with sqlite sql

2017-01-28 Thread David Niklas
Good evening gentlemen,
I read the post:
http://linuxgizmos.com/ringing-in-2017-with-90-hacker-friendly-single-board-computers/
which lists a number of SBCs.
I was interested, but the info in the table was lacking. The article also
left out much of what I considered "Interesting" information. It seems to
be more like a windowz PC sales ad.
So, I decided to follow all the links, get the specs on the boards more
thoroughly, and place them into an SQLite database, for searchability.
Now, I'm no SQL expert, but I have tried to learn and get this stuff
correct, so please bear with my inadequacy.

# ALTER TABLE processors ADD CONSTRAINT bit NOT NULL
Error: near "CONSTRAINT": syntax error

# ALTER TABLE processors DROP bit;
Error: near "DROP": syntax error

I expected that both of the above would add the constraint.

# SELECT boards.*,processors.*,storage.emmc FROM processors INNER 
> JOIN storage ON processors.board = storage.board;
Error: no such table: boards

# .schema
CREATE TABLE boards(
board varchar(30) PRIMARY KEY NOT NULL,
price SMALLINT NOT NULL,
vendor varchar(27) NOT NULL,
oses varchar(32));
...

I expected that the table boards would exist.
I also can't figure out how to join 3 tables, each on the board column.

# SELECT processors.*,storage.* FROM processors INNER JOIN storage ON 
> processors.board = storage.board WHERE processors.board LIKE "%Rose%";
Roseapple Pi Actions S500 4x A9 @ 1.6GHz PowerVR SGX544 256MB0
Roseapple Pi 4GB eMMC 0 0 0 0

I wanted the board column of both tables to be joined and only one of
them displayed (they have to be identical to join, right?), not to be
repeated one after the other.

Thank you in advance,
David


Re: [sqlite] SQLITE 3.7.3 bug report (shell) - output in column mode does not align UTF8-strings correctly

2010-11-25 Thread Niklas Bäckman
Igor Tandetnik  writes:

> 
> Niklas Bäckman  wrote:
> > Columns with special characters like ("å" "ä" "ö") get too short widths when
> > output.
> > 
> > I guess this is due to the shell not counting actual UTF8 *characters/code
> > points* when calculating the widths, but instead only
> > counting the plain bytes in the strings, so they will seem longer until they
> > are actually printed to the console.
> 
> Note that counting codepoints, while it happens to help with your
> particular data, won't help in general. Consider combining diacritics:
> U+00E4 (small A with diaeresis) looks the same as U+0061 U+0308 (small
> letter A + combining diaeresis) when printed on the console.

You are right of course. The shell should not count code points, but graphemes.

http://unicode.org/faq/char_combmark.html#7

I guess this probably falls outside the "lite" scope of SQLite though? Like
how it does not support case-insensitive comparison of non-ASCII characters.

Or would it be possible to write such a graphemelen(s) function in not too
many lines of C code, without needing any external Unicode libraries?




[sqlite] SQLITE 3.7.3 bug report (shell) - output in column mode does not align UTF8-strings correctly

2010-11-24 Thread Niklas Bäckman
There's a problem aligning columns containing UTF-8 (non-ASCII) characters
correctly when output is done in column mode.

The .width was set as ".width 5 30 15 15 10 30 10" and here's an example of
the incorrect output:

(Maybe paste into an editor with a fixed-width font to better see the error.)

===

BookI  Title   Last NameFirst Name
Date Read   Original Title  Story Titl
-  --  ---  ---
--  --  --
20 Bakom fasaden   Sandemo  Margit   0
20 Bakom fasaden   Sandemo  Margit
2010-07-02
13 Blodshämnd Sandemo  Margit   0
13 Blodshämnd Sandemo  Margit
2010-06-19
10 Bödelns dotter Sandemo  Margit   0
10 Bödelns dotter Sandemo  Margit
2010-06-11
24 Demonen och jungfrunSandemo  Margit   0
24 Demonen och jungfrunSandemo  Margit
2010-07-08
43 Demonernas fjäll   Sandemo  Margit   0
43 Demonernas fjäll   Sandemo  Margit
2010-07-25

===

Columns with special characters like ("å" "ä" "ö") get too short widths when
output.

I guess this is due to the shell not counting actual UTF8 *characters/code
points* when calculating the widths, but instead only
counting the plain bytes in the strings, so they will seem longer until they
are actually printed to the console.

The output was done to a Windows console with code page set to 65001.

-- 
Niklas