Re: [sqlite] Min/Max and skip-scan optimizations

R Smith Tue, 05 Feb 2019 12:08:48 -0800


On 2019/02/05 4:46 PM, Simon Slavin wrote:

On 5 Feb 2019, at 8:59am, Rowan Worth <row...@dug.com> wrote:


What is stopping sqlite's query planner from taking advantage of the index, 
which it has chosen to use for the query, to also satisfy the ORDER BY?

I suspect that, given the data in the table, the index supplied is not optimal 
for selecting the correct rows from the table.  SQLite may have decided that it 
needs to select on the contents of ts first, then source1.


And to add to this:

An Index is nothing magical and not a save-the-World-from-every-monster
type of device (as newer DB programmers often think). It's an expensive
add-on that provides an ordered binary lookup which, given enough bulk,
will eventually win the efficiency race over the extra computation it
adds. (The more bulk, the more win).

(Some DB programmers, when they see the words "table scan" in any Query
plan, immediately feel as if they have somehow failed to correctly
optimize the query. This is silly - a table scan is often the most
optimal solution).

Add to that the fact that an SQLite TABLE is, in and of itself, nothing
less than a covering Index with rowid as a key (or a custom key for
WITHOUT ROWID tables), and as such it is a rather good Index and a
mostly preferred Index by the query planner (because "using" any other
index adds cycles plus an extra rowid lookup). Due to this, scanning
the table is often more efficient than threading a lookup via another
index into the query plan. Sometimes crafting a new temp BTree Index for
(a) specific field(s) on a materialized set of data might also be judged
faster than re-establishing links between said data and its original
Index.
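To make the "extra lookup" point concrete, here is a minimal sketch using
Python's sqlite3 module. The table t, its columns and the index idx_a are
made up for illustration and have nothing to do with the schema in this
thread. It asks EXPLAIN QUERY PLAN how three lookups would run: one by
rowid straight into the table, one through an index that then has to go
back to the table for column b, and one the index can answer on its own.

```python
import sqlite3

def plan(conn, sql, params=()):
    """Return the EXPLAIN QUERY PLAN detail strings for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # fourth column holds the readable detail

conn = sqlite3.connect(":memory:")
# Hypothetical schema: id is an alias for the rowid, a is indexed, b is not.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b TEXT)")
conn.execute("CREATE INDEX idx_a ON t(a)")
conn.executemany(
    "INSERT INTO t (a, b) VALUES (?, ?)",
    [(i % 10, "row %d" % i) for i in range(1000)],
)
conn.commit()

# Lookup by rowid: the table b-tree itself is the "index".
by_rowid = plan(conn, "SELECT b FROM t WHERE id = ?", (1,))
# Lookup via idx_a: b is not in the index, so each hit needs a table lookup.
via_index = plan(conn, "SELECT b FROM t WHERE a = ?", (1,))
# Only indexed data requested: the index alone can answer (covering).
covered = plan(conn, "SELECT a FROM t WHERE a = ?", (1,))

for label, details in (("rowid", by_rowid), ("index", via_index), ("covering", covered)):
    print(label, details)
```

The exact wording of the detail strings differs between SQLite versions,
so treat the printed text as indicative only.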

The method by which the query planner decides which other Index (if any)
should be used involves a bit of game theory, typically looking at some
ANALYZE result data along with some tried and tested weights in the
decision tree (which I'm not going into since A - It's not important,
and B - I don't know enough of how SQLite does it). If the end score
finds that there is no remarkable advantage to using a separate index,
then it WILL opt to use the more-efficient table scan.
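For anyone curious what the "ANALYZE result data" looks like, the sketch
below (again with a made-up table t and index idx_a, not the poster's
schema) runs ANALYZE and reads back the sqlite_stat1 table it fills. In
each stat string the first number is the row count of the index and the
following numbers are the average rows per distinct value of each
successively longer index prefix. How the planner weighs those numbers is
not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data: 1000 rows, column a cycles through 10 values.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b TEXT)")
conn.execute("CREATE INDEX idx_a ON t(a)")
conn.executemany(
    "INSERT INTO t (a, b) VALUES (?, ?)",
    [(i % 10, "row %d" % i) for i in range(1000)],
)
conn.commit()

# Gather statistics; the results are stored in the sqlite_stat1 table.
conn.execute("ANALYZE")
stats = conn.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()

for tbl, idx, stat in stats:
    print(tbl, idx, stat)
```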

It might be that the adding of the "ORDER BY" simply pushes one such
decision weight over the edge in this use case, and, once the table data
evolved to be more complex or hefty, it may again turn to the Index.
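One way to see whether an ORDER BY is being satisfied by the chosen index
or by a separate sort step is to compare query plans with and without it.
The following is only a sketch on a hypothetical table t with an index on
column a alone. It does not reproduce the query from this thread; it just
shows what the sort step looks like in the plan output.

```python
import sqlite3

def plan(conn, sql, params=()):
    """Return the EXPLAIN QUERY PLAN detail strings for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only column a is indexed.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b TEXT)")
conn.execute("CREATE INDEX idx_a ON t(a)")
conn.executemany(
    "INSERT INTO t (a, b) VALUES (?, ?)",
    [(i % 10, "row %d" % i) for i in range(1000)],
)
conn.commit()

# Same filter, with and without an ORDER BY on the unindexed column b.
unsorted_plan = plan(conn, "SELECT a, b FROM t WHERE a = ?", (1,))
sorted_plan = plan(conn, "SELECT a, b FROM t WHERE a = ? ORDER BY b", (1,))

print("no ORDER BY:", unsorted_plan)
print("ORDER BY b: ", sorted_plan)
```

When no index can deliver the requested order, the second plan gains a
"USE TEMP B-TREE FOR ORDER BY" step, which is the extra cost the planner
has to weigh against the alternatives.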

To add to another poster's comment: Do not second-guess the
Query-planner, leave it to its devices. You may even be able to
construct a scenario where the specific use case causes the QP to choose
an execution path that is slightly slower than an alternate one, but if
it is looked at in the general case, then other similar query scenarios
might again be faster with that chosen path. Further to this, if you
construct a weird query now to force a path of execution with some gain,
you possibly prohibit it from capitalizing on an even better improvement
that might be inherent to the next SQLite update (possibly thanks to
your very own report here).
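For reference, this is roughly what "forcing a path" looks like in
practice. The sketch uses SQLite's INDEXED BY and NOT INDEXED clauses on a
made-up table t with index idx_a (assumptions for illustration, not the
poster's schema) and prints the resulting plans next to the planner's own
choice. It is shown to make the trade-off visible, not as a
recommendation.

```python
import sqlite3

def plan(conn, sql, params=()):
    """Return the EXPLAIN QUERY PLAN detail strings for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b TEXT)")
conn.execute("CREATE INDEX idx_a ON t(a)")
conn.executemany(
    "INSERT INTO t (a, b) VALUES (?, ?)",
    [(i % 10, "row %d" % i) for i in range(1000)],
)
conn.commit()

# What the planner picks when left alone.
default_plan = plan(conn, "SELECT b FROM t WHERE a = ?", (1,))
# NOT INDEXED forbids any index, leaving only the table itself.
forced_scan = plan(conn, "SELECT b FROM t NOT INDEXED WHERE a = ?", (1,))
# INDEXED BY demands idx_a; the statement fails to prepare if it cannot be used.
forced_index = plan(conn, "SELECT b FROM t INDEXED BY idx_a WHERE a = ?", (1,))

print("default:     ", default_plan)
print("NOT INDEXED: ", forced_scan)
print("INDEXED BY:  ", forced_index)
```

A query pinned this way keeps its plan even if a later SQLite release or
fresher ANALYZE data would have found a better one, which is the risk
described above.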

If you can demonstrate a true degradation (one that slows down a
significant time slice that trespasses on human-perceptible time) for a
general query, an optimization will surely be considered, but this case,
unless I've misunderstood the severity, does not seem to warrant that.

[PS: this is not a discouragement, it's great to hear of every possible
quirk and make other users aware of a possible query scenario that might
not be optimal - thanks for that, and I'm certain the devs would notice
this, perhaps even get on fixing it right away, or maybe only keep it in
the back of their minds for when the next round of query-planner
refinement happens. I'm simply saying that there is possibly no
satisfying answer to your question right now - best we can do is:
"Sometimes the QP correctly evaluates the best path to be one that is
not obviously best to us, or maybe even worse for a specific case, but
typically better in the general case".]



Cheers,
Ryan


_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
