On 2019/02/05 4:46 PM, Simon Slavin wrote:
On 5 Feb 2019, at 8:59am, Rowan Worth <row...@dug.com> wrote:
What is stopping sqlite's query planner from taking advantage of the index,
which it has chosen to use for the query, to also satisfy the ORDER BY?
I suspect that, given the data in the table, the index supplied is not optimal
for selecting the correct rows from the table. SQLite may have decided that it
needs to select on the contents of ts first, then source1.
And to add to this:
An Index is nothing magical and not a save-the-World-from-every-monster
type of device (as newer DB programmers often think). It's an expensive
add-on that provides an ordered binary lookup which, given enough bulk,
will eventually win the efficiency race over the extra computation it
adds. (The more bulk, the more win).
(Some DB programmers, when they see the words "table scan" in any Query
plan, immediately feel as if they have somehow failed to correctly
optimize the query. This is silly - a table scan is often the most
optimal solution).
Add to that the fact that an SQLite TABLE is, in and of itself, nothing
less than a covering Index with row_id as a key (or a custom key for
WITHOUT ROWID tables), and as such it is a rather good Index and a
mostly preferred Index by the query planner (because "using" any other
index adds cycles plus an extra row_id lookup). Due to this, scanning
the table is often more efficient than threading a lookup via another
index into the query plan. Sometimes crafting a new temp BTree Index for
(a) specific field(s) on a materialized set of data might also be judged
faster than re-establishing links between said data and its original Index.
The method by which the query planner decides which other Index (if any)
should be used involves a bit of game theory, typically looking at some
ANALYZE result data along with with some tried and tested weights in the
decision tree (which I'm not going into since A - It's not important,
and B - I don't know enough of how SQLite does it). If the end score
finds that there is no remarkable advantage to using a separate index,
then it WILL opt to use the more-efficient table scan.
It might be that the adding of the "ORDER BY" simply pushes one such
decision weight over the edge in this use case, and, once the table data
evolved to be more complex or hefty, it may again turn to the Index.
To add to another poster's comment: Do not second-guess the
Query-planner, leave it to its devices. You may even be able to
construct a scenario where the specific use case causes the QP to choose
an execution path that is slightly slower than an alternate one, but if
it is looked at in the general case, then other similar query scenarios
might again be faster with that chosen path. Further to this, if you
construct a weird query now to force a path of execution with some gain,
you possibly prohibit it from capitalizing on an even better improvement
that might be inherent to the next SQLite update (possibly thanks to
your very own report here).
If you can demonstrate a true degradation (one that slows down a
significant time slice that trespasses on human-perceptible time) for a
general query, an optimization will surely be considered, but this case,
unless I've misunderstood the severity, does not seem to warrant that.
[PS: this is not a discouragement, it's great to hear of every possible
quirk and make other users aware of a possible query scenario that might
not be optimal - thanks for that, and I'm certain the devs would notice
this, perhaps even get on fixing it right away, or maybe only keep it in
the back of their minds for when the next round of query-planner
refinement happens. I'm simply saying that there is possibly no
satisfying answer to your question right now - best we can do is:
"Sometimes the QP correctly evaluates the best path to be one that is
not obviously best to us, or maybe even worse for a specific case, but
typically better in the general case".]
Cheers,
Ryan
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users