Hello SQLiters,
I've recently run into a couple of things I found odd in how SQLite decided
to execute a particular query (and queries with similar characteristics).
These oddities appear both in how the query planner works and in the
generated opcodes. I profess the opposite of sophistication when it comes to
databases (and many other things as well), so I've come here to offer my
findings and ask, with all humility, some questions about them.

This is the pared-down query my questions center around, though my questions
aren't necessarily specific to this query:
SELECT a.v_id
FROM a, l, v, m
WHERE l.l_cid = 300
AND l.l_flag_id = 5
AND v.id = l.v_id
AND a.v_id = v.id
LIMIT 10;
From ANALYZE:
m is 100 rows
l is 65,000,000 rows
v is 220,000 rows
a is 850,000 rows
I'll start with what should require the simpler answer. I ran the query with
SQLite versions 3.8.5 and 3.7.13 and noticed a massive time difference
between the two: on 3.7.13 the time was on the order of seconds, while on
3.8.5 I let it run for 50 minutes before giving up hope and deciding to give
my HD a break. I ran EXPLAIN QUERY PLAN and got the output below. Note that
I had no indices at this point and had not yet run ANALYZE (i.e.
sqlite_stat1 was empty).
From v3.7.13:

order  from  detail
-----  ----  ------------------------------------------------------------
0      1     SCAN TABLE l (~10000 rows)
1      2     SEARCH TABLE v USING INTEGER PRIMARY KEY (rowid=?) (~1 rows)
2      0     SEARCH TABLE a USING AUTOMATIC COVERING INDEX (v_id=?) (~7 rows)
3      3     SCAN TABLE m (~1000000 rows)
From v3.8.5:

order  from  detail
-----  ----  ------------------------------------------------------------
0      1     SEARCH TABLE l USING AUTOMATIC COVERING INDEX (l_cid=? AND l_flag_id=?)
1      2     SEARCH TABLE v USING INTEGER PRIMARY KEY (rowid=?)
2      3     SCAN TABLE m
3      0     SEARCH TABLE a USING AUTOMATIC COVERING INDEX (v_id=?)
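(If anyone wants to poke at this without my data, the plan is easy to pull
from e.g. Python's sqlite3 against an empty copy of the schema; the column
names/types in this sketch are my guesses, and the plan text will of course
depend on your SQLite version:)

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE a(v_id INTEGER);
    CREATE TABLE l(l_cid INTEGER, l_flag_id INTEGER, v_id INTEGER);
    CREATE TABLE v(id INTEGER PRIMARY KEY);
    CREATE TABLE m(m_id INTEGER);
""")

plan = con.execute("""
    EXPLAIN QUERY PLAN
    SELECT a.v_id
    FROM a, l, v, m
    WHERE l.l_cid = 300
      AND l.l_flag_id = 5
      AND v.id = l.v_id
      AND a.v_id = v.id
    LIMIT 10
""").fetchall()

for row in plan:
    print(row[-1])  # one line of plan detail per loop in the join
```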
We no longer see the estimates of table lengths in the newer version (how
*are* those estimated, by the way?), and the query planner (QP) has
determined that it makes sense to generate an automatic covering index.
After running ANALYZE and generating the stats table, both versions agree on
the query plan: 3.8.5 remains essentially the same, and 3.7.13 mirrors it
exactly. This also means that 3.7.13 no longer completes the query (in the
~50 minutes I gave it to run); the opcodes generated by both versions are
very similar, and what's going on under the hood doesn't change (this at
least is good; it means things are consistent!).
I'm not clear on why an automatic covering index is determined to be the
most efficient thing to do here; the creation of such an index takes longer
than the query itself (in fact, testing this by explicitly creating a
covering index [with e.g. CREATE INDEX ... ON ...] takes roughly 5 minutes).
The query as run on 3.7.13 takes about 5 seconds with no indices. A big-O
back-of-the-napkin calculation confirms this: creating the index
(O(N log N)) and then using it (some ~O(log N)) ends up being asymptotically
more expensive than simply scanning the table (O(N)) (and, practically, WAY
more expensive). Check me on this, because I'm not 100% sure I got these
O's right.
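(For concreteness, here's a rough sketch of that napkin math for the 65M-row
l table; the constant factors obviously matter enormously in practice, so
this is only the asymptotic comparison:)

```python
import math

N = 65_000_000  # rows in l, per the ANALYZE numbers above

scan = N                   # O(N): one full scan of l
build = N * math.log2(N)   # O(N log N): building an index over l
probe = math.log2(N)       # O(log N): one subsequent index lookup

# Building the index costs roughly log2(N), i.e. ~26 full scans'
# worth of comparisons, before the index answers a single probe.
print(f"build/scan ratio: {build / scan:.1f}")
```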
So that's my first question: why is an index determined to be the right
thing to do here? Does this end up being advantageous in most cases? Is
there some information the QP doesn't have that it should?
My second question and problem is a bit more severe, I think. Why does
generating the automatic covering index take over 50 minutes when the
CREATE INDEX query takes 5 minutes to run? I looked into this a bit,
learning a lot about SQLite opcodes, reading the source, and running a lot
of DTrace, but I wanted to make sure that what I think I found is correct,
and to understand the reasoning behind it (or help out if there is none!).
The opcodes from EXPLAIN [query] are rather long, so I've included the full
readout here[1]. The relevant parts are below.
7 OpenAutoindex 4 10 0 k(10,nil,n.. 00
8 Rewind 1 22 0 00
9 Column 1 4 3 00
10 Column 1 5 4 00
11 Rowid 1 12 0 00
12 MakeRecord 3 4 2 00
13 IdxInsert 4 2 0 10
14 Next 1 9 0 03
The below is a bit of the EXPLAIN [query] output when creating the index by
hand with CREATE INDEX on the same two columns (full readout here[2]):
12 SorterOpen 3 0 0 k(3,nil,ni.. 00
13 OpenRead 1 214248 0 9 00
14 Rewind 1 21 0 00
15 Column 1 4 10 00
16 Column 1 5 11 00
17 Rowid 1 12 0 00
18 MakeRecord 10 3 9 00
19 SorterInsert 3 9 0 00
20 Next 1 15 0 00
21 OpenWrite 2 1 0 k(3,nil,ni.. 03
22 SorterSort 3 26 0 00
23 SorterData 3 9 0 00
24 IdxInsert 2 9 1 10
25 SorterNext 3 23 0 00
The big difference here is that the first uses an AutoIndex
(OpenAutoindex), whereas the CREATE INDEX version uses a Sorter
(SorterOpen). At first blush, this makes sense; AutoIndex is for an
automatic index, after all.
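(If it helps anyone reproduce this without wading through the gists, the
opcode listing is easy to pull programmatically; a quick sketch, on whatever
SQLite version ships with your Python, that checks which opener
CREATE INDEX compiles to:)

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE l(l_cid INTEGER, l_flag_id INTEGER, v_id INTEGER)")

# EXPLAIN returns one row per VDBE opcode; column 1 is the opcode name
ops = [row[1] for row in
       con.execute("EXPLAIN CREATE INDEX l_idx ON l(l_cid, l_flag_id)")]
print("SorterOpen" in ops)  # explicit index creation goes through the sorter
```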
But when you look at the code and at the traces of sys and lib calls spit
out by DTrace (which is such a cool piece of software, by the way), things
get real bad, real fast. Here are some samples from the automatic index
creation[3], and here are some from the explicit index creation[4].
It turns out that AutoIndex is just directly creating a table on disk (a
table which, for all intents and purposes, is destroyed at the end of the
query). And then it's inserting the rows of the index into that table. One
by one. And since a table is a B tree, it needs to be balanced after every
insert. And that rebalancing happens on disk. Because it's inserting
unsorted data, it must rebalance and write to disk a lot.
The trace for this shows a lot of seeks, reads, and writes after every
B tree insert and B tree sort. That's a ton of bad mojo happening on my
SSD. And it's absurdly slow.
Conversely, opening a Sorter just adds these rows to an in-memory (spilling
to disk) buffer, unsorted. Once the rows are in, an external merge sort
takes place. Finally, the sorted records are appended to a B tree; since
they are sorted, the tree does not need to be rebalanced. This is the
reasonable way to handle the situation, and consequently it takes over an
order of magnitude less time (or even less; I have not yet witnessed the
end of the autoindex).
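(For anyone unfamiliar with the pattern, here's a toy, entirely
hypothetical, in-memory sketch of that sort-runs-then-merge approach; the
real Sorter spills its runs to temp files and is written in C, of course:)

```python
import heapq
import itertools
import random

def external_sort(rows, run_size=1000):
    """Sort fixed-size runs in memory, then k-way merge the runs.

    A real external sorter would spill each sorted run to disk; here
    the runs just stay in memory to keep the sketch short.
    """
    runs = []
    it = iter(rows)
    while True:
        run = sorted(itertools.islice(it, run_size))
        if not run:
            break
        runs.append(run)
    return heapq.merge(*runs)  # streaming k-way merge of the sorted runs

data = [random.randrange(10**6) for _ in range(5000)]
assert list(external_sort(data)) == sorted(data)
```

The payoff is the same as with the Sorter: each run is sorted cheaply in
memory, and the merge emits records in order, so appends to the final
structure never have to go back and rebalance anything.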
Now, to be clear, once there's an index on this table, both 3.7.13 and
3.8.5 run the same, very snappy queries. However, there shouldn't need to
be an index (the query runs in 3 seconds when the QP doesn't make an
autoindex). I'm also aware that I can disable the autoindexer. Needless to
say, I've created an index and things are snappy in actual use for me. The
query above also clearly isn't a useful one; I'll just reiterate that it
was pared down from a much larger query to exhibit the same behavior while
resulting in fewer than 100 opcodes to step through & debug.
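(For reference, disabling the autoindexer is just a pragma; e.g., from
Python's sqlite3:)

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA automatic_index = OFF")  # per-connection setting

# Read the setting back: 1 = on (the default), 0 = off
enabled = con.execute("PRAGMA automatic_index").fetchone()[0]
print(enabled)
```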
To summarize, I ran into two main problems whose reasoning I don't
understand:

1. An autoindex is suggested by the new QP, or by the old QP once the stats
table has been created (with ANALYZE).
2. The way an autoindex is created appears to be massively inefficient.

A potential solution to (1) would be to look at the big-O time of the index
creation and follow the results of that more closely. I'm not really sure
what's going on here, and had to stop working on this after a few too many
days!
One potential solution to (2), if necessary, would be to use the same code
that generates an index when one is explicitly created (that is, using a
Sorter with an external sort before inserting into the index) instead of
the inefficient AutoIndex.
I have to say, I love SQLite. It was a lot of fun reading through the truly
excellent documentation, and the source was easy enough to read given that
I don't really write any C.

I'm really curious what you much more experienced people think about the
above.
Thanks for your time,
Isaac
[1]: https://gist.github.com/ihodes/924cb35f933530bd089f
[2]: https://gist.github.com/ihodes/371eedab3f2a255a39bc
[3]: https://gist.github.com/ihodes/4e334d8f29fe611b1da1
[4]: https://gist.github.com/ihodes/ad427f9927919337222f
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users