Can't go into as much detail as you. But a couple comments.

"primary key unique" is redundant, and will actually create a redundant unique 
index.

"Do later version of SQLite manage to spot the constant propagation 
opportunity?"
This requires way more algebra and proof than I think you realize. "entrys" is 
on the right side of a LEFT OUTER JOIN, and therefore may be null when it comes 
time to do the next OUTER JOIN to "entry-items", so a direct replacement of an 
"equals" constraint isn't possible. And after that it again starts becoming 
algebra as to whether that null can affect things etc.

For the ordering, I recommend seeing if you can replace one of those "entrys" 
indexes so that they start with "key" as the first field in the index. That 
would at least give it more opportunity to use that index for the ordering 
rather than needing to order things after they're all collected. That and 
explicitly stating the order by in _both_ the sub select and the final might 
make it notice "I can use this _ordered_ sub select as the outer table in the 
joins and get my overall ordering that way"... or it may not. Worth a try 
though.



-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Andy Bennett
Sent: Tuesday, January 22, 2019 8:09 AM
To: sqlite-users@mailinglists.sqlite.org
Subject: [sqlite] Query Planning Knowledge

Hi,

I'm having some problems understanding what the query planner is doing and 
how to convince it to work in the way I want.

Sorry that this is such a long and involved eMail. I've tried to describe 
my problem and show the steps I have taken in debugging it. I have made 
some effort to present a minimal and basic illustration of the problem. 
Either I'm in error and glad to be assisted, or there's some opportunities 
that the query planner is not taking. Hopefully my detail helps someone who 
knows more than me get to the bottom of the issue!


My GUI montior is reporting SQLite 3.8.11.1 and my app is using 3.17.0. I 
get the same result with both.

I've included more complete details of my schema at the bottom of this 
message but only introduced the bits I'm talking about as I go through.


I'm aiming to write a query that's good for feeding an iterator and for 
pagination. i.e. I don't want a query that needs to compute the whole 
result up-front, before returning the first row. I want something where I 
can execute the query, consume a few rows for a while, close it and then 
later start a new query with a parameter based on how far I got through and 
then continue where I left off.

The structure of my dataset is such that once rows are written to the 
database they are not changed, so there exist a set of parameters where I 
can do this.


The data is an append only log. The log consists of entries (in the entrys 
table) and they have monotonically increasing "entry-number"s. The table 
can store more than one log so there's a "log-id" as well. These entries 
have some data of their own and each point to zero or more items which are 
stored in the "items" table. Thus there's a join table "entry-items" which 
consists of 3 columns, all parts of the primary key, "log-d", 
"entry-number" and "item-id".

Each entry has a "key". As entries are added to the log, the "latest" entry 
for a particular key describes the current state of that key. The set of 
keys and their current state is called "records" and they only exist as a 
query on the tables. It is this query that I'm trying to plan well.

The idea is for the app user to be able to ask for a cursor on the records 
and then consume them a page at a time. I'm not particularly bothered which 
order they keys come out in but it has to be deterministic so that I can 
restart the query again. It's nice if they come out in "key" order but 
"entry-number" order will do too. The main property I want is to be able to 
pass this app-defined cursor over my API boundary without holding open a 
read transaction on SQLite.

I've started with a log that isn't huge but needs some attention to make it 
work efficiently. This log has 54,272 entries and 42,737 records at the 
latest version. I expect to have logs with a few tens of million active 
records but probably nothing larger than 100 million. Not exactly Big Data 
;-) but worthy of indexing.


So, here's my query, annotated with a few numbered points so that I can 
talk about things. The bits that are commented out get commented in (and 
out again!) as I discuss my problems with the query plan.

-----
explain query plan
            SELECT
            "entrys"."log-id"       AS "log-id",
            "entrys"."entry-number" AS "entry-number",
            "entrys"."region"       AS "region",
            "entrys"."key"          AS "key",
            "entrys"."timestamp"    AS "timestamp",
            "entry-items"."item-id" AS "item-id",
            "items"."blob"          AS "blob"

            FROM
            (SELECT -- ** 1 **
              MAX("entry-number") AS "entry-number",
              "key"
              FROM "entrys"
              WHERE
              "log-id" = 51 AND
              "region" = "user" AND
              "entry-number" <= 54272
-- AND key > "100009" -- ** 2 **
-- AND key > "145943"
              GROUP BY "key"
                 -- order by key -- ** 3 **
                 ) AS "specific-entrys"

            LEFT OUTER JOIN "entrys"
            ON
            "specific-entrys"."entry-number" = "entrys"."entry-number"
AND "specific-entrys"."key" = "entrys"."key" -- ** 4 **
AND "user" = "entrys"."region"

            LEFT OUTER JOIN "entry-items"
            ON
            51 = "entry-items"."log-id" AND -- ** 5 **
            "specific-entrys"."entry-number" = "entry-items"."entry-number"

            LEFT OUTER JOIN "items"
            ON
            "items"."item-id" = "entry-items"."item-id"

            WHERE
            "entrys"."log-id" = 51
               -- order by key -- ** 6 **
               ;
-----

Here's the query plan for the query above:

-----
1|0|0|SEARCH TABLE entrys USING COVERING INDEX 
entrys-log-id-region-key-entry-number-desc (log-id=? AND region=?)
0|0|0|SCAN SUBQUERY 1 AS specific-entrys
0|1|1|SEARCH TABLE entrys USING INDEX 
entrys-log-id-region-key-entry-number-desc (log-id=? AND region=? AND key=? 
AND entry-number=?)
0|2|2|SEARCH TABLE entry-items USING COVERING INDEX 
entry-items-log-id-entry-number-item-id (log-id=? AND entry-number=?)
0|3|3|SEARCH TABLE items USING INTEGER PRIMARY KEY (rowid=?)
-----


The query uses a subselect (** 1 **) to get the latest entry number for 
each and every key (i.e. to get the list of records). On my test workload 
this produces 42,737 rows and takes 49ms. The efficiency of this query is 
not a problem yet: If it has to precompute all the keys up-front then I'm 
fine with that for now.

The query then joins the result back onto the "entrys" table to get the 
rest of the per-entry metadata, onto the "entry-items" join table and then 
onto the "items" table to get one row per entry per item. In my app I 
coallesce these rows back together again.

The (non-explain version) query takes 679ms to run and produces 42,737 rows 
because none of the entries in this particular log have more than one item 
each.

Originally the two lines at (** 5 **) said

-----
"entrys"."log-id" = "entry-items"."log-id" AND
"entrys"."entry-number" = "entry-items"."entry-number"
-----

instead of

-----
51 = "entry-items"."log-id" AND -- ** 5 **
"specific-entrys"."entry-number" = "entry-items"."entry-number"
-----

As far as I can tell these are semantically equivalent but the original 
version takes ~170 seconds rather than 679ms and the query plan looks like 
this:

-----
1|0|0|SEARCH TABLE entrys USING COVERING INDEX 
entrys-log-id-region-key-entry-number-desc (log-id=? AND region=?)
0|0|0|SCAN SUBQUERY 1 AS specific-entrys
0|1|1|SEARCH TABLE entrys USING INDEX 
entrys-log-id-region-key-entry-number-desc (log-id=? AND region=? AND key=? 
AND entry-number=?)
0|2|2|SCAN TABLE entry-items
0|3|3|SEARCH TABLE items USING INTEGER PRIMARY KEY (rowid=?)
-----

i.e. it does a SCAN TABLE in the loop rather than using the COVERING INDEX.

Do later version of SQLite manage to spot the constant propagation 
opportunity?


Adding the lines at (** 4 **) allow the planner to use another index.



The query seems to produce rows ordered by "key". If I want to be able to 
consume a few rows from the result set and then restart later (perhaps by 
using something like the clauses at (** 2 **)) then I need to gurantee this 
property.
Adding the WHERE clause at (** 6 **) does this but it changes the query 
plan thus:

-----
1|0|0|SEARCH TABLE entrys USING COVERING INDEX 
entrys-log-id-region-key-entry-number-desc (log-id=? AND region=?)
0|0|0|SCAN SUBQUERY 1 AS specific-entrys
0|1|1|SEARCH TABLE entrys USING INDEX 
entrys-log-id-region-key-entry-number-desc (log-id=? AND region=? AND key=? 
AND entry-number=?)
0|2|2|SEARCH TABLE entry-items USING COVERING INDEX 
entry-items-log-id-entry-number-item-id (log-id=? AND entry-number=?)
0|3|3|SEARCH TABLE items USING INTEGER PRIMARY KEY (rowid=?)
0|0|0|USE TEMP B-TREE FOR ORDER BY
-----

i.e. now we have a "USE TEMP B-TREE FOR ORDER BY" which (I imagine!) causes 
the whole result set to be generated up front before the first row is 
returned.

Now, without the order by clause in (** 2 **), we're getting the rows out 
in the correct order as a side effect of the group by clause in (** 1 **) 
so let's uncomment the order by clause at (** 3 **) instead.


This gets us back to the original, iterative query plan. It's got a 
subquery that we scan in (sorted) order and then some inner loops which 
should preserve the ordering of the result set, even if the inner loops 
produce more rows.

However, we still want to gurantee this property so I uncommented the order 
by clause at (** 6 **) again. Once again, the "USE TEMP B-TREE FOR ORDER 
BY" line appears in the query plan.

I think I need this order by clause so that I get the correct behaviour in 
my app, even if later version of the engine change the query plan. Perhaps 
I lose my performance properties but at least I still get the correct 
answer!


How do I get SQLite to recognise that it's already done enough work in with 
this plan to get the results out in the order I specified and therefore 
elide the "USE TEMP B-TREE FOR ORDER BY" parts of its plan?


Also, is there a way to find out which order SQLite thinks it is returning 
the rows in (if any) so that, if one exists, I can construct the order by 
clause that will cause no change to the query plan?




Thanks for any insight and help you can offer me.
I've been a long time user of SQLite so thanks for all the great work and 
documentation over the years!





Here's my schema:

-----
sqlite> .schema
CREATE TABLE IF NOT EXISTS "entry-items" ("log-id"  NOT NULL , 
"entry-number"  NOT NULL , "item-id"  NOT NULL );
CREATE TABLE IF NOT EXISTS "entrys" ("log-id" INTEGER NOT NULL , 
"entry-number" INTEGER NOT NULL , "region" TEXT NOT NULL , "key" TEXT NOT 
NULL , "timestamp" INTEGER NOT NULL , PRIMARY KEY ("log-id", 
"entry-number"));
CREATE TABLE IF NOT EXISTS "item-digests" ("item-id" INTEGER NOT NULL , 
"algorithm" TEXT NOT NULL , "digest" BLOB NOT NULL , PRIMARY KEY 
("item-id", "algorithm"));
CREATE TABLE IF NOT EXISTS "items" ("item-id" INTEGER PRIMARY KEY  NOT NULL 
 UNIQUE , "blob" BLOB NOT NULL  UNIQUE );
CREATE TABLE IF NOT EXISTS "registers" ("log-id" INTEGER PRIMARY KEY  NOT 
NULL  UNIQUE , "index-of" INTEGER NOT NULL , "name" TEXT NOT NULL );
CREATE INDEX "entry-items-log-id-entry-number-item-id" ON "entry-items" 
("log-id" ASC, "entry-number" ASC, "item-id" ASC);
CREATE UNIQUE INDEX "entrys-region-key-entry-number-log-id" ON "entrys" ( 
"region" ASC, "key" ASC, "entry-number" ASC, "log-id" ASC);
CREATE UNIQUE INDEX "item-digests-algorithm-digest" ON "item-digests" 
("algorithm" ASC, "digest" ASC);
CREATE UNIQUE INDEX "registers-index-of-name" ON "registers" ("index-of" 
ASC, "name" ASC);
CREATE UNIQUE INDEX "entrys-log-id-region-key-entry-number-desc" ON 
"entrys" ( "log-id" ASC, "region" ASC, "key" ASC, "entry-number" DESC)
-----



Regards,
@ndy

-- 
andy...@ashurst.eu.org
http://www.ashurst.eu.org/
0x7EBA75FF
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to