Re: [sqlite] Missed index opportunities with sorting?

2019-12-09 Thread Graham Holden
Monday, December 09, 2019, 1:32:40 PM, Digital Dog  
wrote:

> On Sat, Dec 7, 2019 at 3:50 AM Simon Slavin  wrote:

>> On 7 Dec 2019, at 2:26am, Shawn Wagner  wrote:
>>
>> > The first one uses the index for all sorting, but the second one only
>> uses it for sorting a, not b. I feel like the descending sort could make
>> use of the index too, just reading the b sections backwards to get the
>> right order. Is there something I'm overlooking that would make this sort
>> of optimization impractical or otherwise a bad idea?
>>
>> Hmm.  Try running ANALYZE and then doing the EXPLAIN QUERY PLAN lines
>> again.
>>
>> But I think that without 'chunkiness' information (how many values columns
>> a and b have) it would not be worth doing the complicated programming
>> required for reverse-mini-scanning of that index.  The programming is quite
>> complicated and unless your index "b" is chunky it won't save you much time
>> over the plan shown.
>>

> So it's better to allocate memory, block the execution until all rows are
> read and use cpu time to do unnecessary sorting that could easily be
> avoided by just reading index backwards?

It should only need to collect-and-sort groups of rows with equal "a"
values, rather than ALL rows (and the EXPLAIN QUERY PLAN seems to
support this). In theory, it should be possible to emit rows in these
groups, but I don't know enough about reading VDBE (output of EXPLAIN
SELECT...) to know if it does this or accumulates ALL rows before
emitting the first.

>   Is it really so hard to program
> it? I do not think so.

I've no idea. If reverse-mini-scanning an index ISN'T "so hard", then
from past experience there's a moderate chance one of the devs is
looking to see if it CAN be done. However, the fact that it HASN'T
been done, when reverse-mini-scanning seems an "obvious" optimisation,
suggests to me it is not as easy as one might think, and that the
potential saving isn't that great. Especially when you could always
create a second index on "a, b DESC".

> However the heuristic to decide when to do backward index scan needs to be
> smart enough to select this only as last resort optimization, just before
> falling back to explicit sort.

This "decision problem" is -- I believe -- a key factor in deciding
whether to add any specific optimisation. For every (potential)
optimisation that could be added, you need to add code that decides
whether it's worth using that optimisation. The savings from USING the
optimisation only benefit SOME queries, but the code that decides
whether or not to USE the optimisation has to be executed for many,
many more queries. The risk is that you slow lots and lots of queries
down a little bit, for only an occasional gain for the few queries
where the optimisation does make sense.

I don't know the actual code, but I'm guessing the optimiser never
really "knows" there is an index where doing a "reverse-mini-scan"
could help. I suspect it (a) realises there isn't a "prefect" index;
(b) finds an index that sorts as much as possible (in this case, the
only index), and (c) fulfils the rest of the ORDER BY requirements
using a temp b-tree. (With, probably, some exceptions for
optimisations that HAVE been implemented). 

To implement a reverse-mini-scan, not only would you have to add the
code to do the reverse scan itself, but the code that does (b) above
(find AN index that sorts as much as possible) would need to be much
more complex and consider ALL indices that start by ordering "a ASC"
to see if any of THOSE would allow a reverse-scan. It might even need
to see if a "seemingly worse" index (+reverse-scan) might be better
than an "obviously better" index. For instance, with more fields, an
"ORDER BY a, b DESC, c" might be better served by an index on
"( a, b, c )" with reverse-scan than a seemingly better index on
"( a, b DESC, d ).

Graham


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Missed index opportunities with sorting?

2019-12-09 Thread Digital Dog
On Sat, Dec 7, 2019 at 3:50 AM Simon Slavin  wrote:

> On 7 Dec 2019, at 2:26am, Shawn Wagner  wrote:
>
> > The first one uses the index for all sorting, but the second one only
> uses it for sorting a, not b. I feel like the descending sort could make
> use of the index too, just reading the b sections backwards to get the
> right order. Is there something I'm overlooking that would make this sort
> of optimization impractical or otherwise a bad idea?
>
> Hmm.  Try running ANALYZE and then doing the EXPLAIN QUERY PLAN lines
> again.
>
> But I think that without 'chunkiness' information (how many values columns
> a and b have) it would not be worth doing the complicated programming
> required for reverse-mini-scanning of that index.  The programming is quite
> complicated and unless your index "b" is chunky it won't save you much time
> over the plan shown.
>


So it's better to allocate memory, block the execution until all rows are
read and use cpu time to do unnecessary sorting that could easily be
avoided by just reading index backwards? Is it really so hard to program
it? I do not think so.

However the heuristic to decide when to do backward index scan needs to be
smart enough to select this only as last resort optimization, just before
falling back to explicit sort.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Missed index opportunities with sorting?

2019-12-06 Thread Simon Slavin
On 7 Dec 2019, at 2:26am, Shawn Wagner  wrote:

> The first one uses the index for all sorting, but the second one only uses it 
> for sorting a, not b. I feel like the descending sort could make use of the 
> index too, just reading the b sections backwards to get the right order. Is 
> there something I'm overlooking that would make this sort of optimization 
> impractical or otherwise a bad idea?

Hmm.  Try running ANALYZE and then doing the EXPLAIN QUERY PLAN lines again.

But I think that without 'chunkiness' information (how many values columns a 
and b have) it would not be worth doing the complicated programming required 
for reverse-mini-scanning of that index.  The programming is quite complicated 
and unless your index "b" is chunky it won't save you much time over the plan 
shown.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Missed index opportunities with sorting?

2019-12-06 Thread Shawn Wagner
Consider:

sqlite> CREATE TABLE test1(a, b);
sqlite> CREATE INDEX test1_idx ON test1(a, b ASC);
sqlite> EXPLAIN QUERY PLAN SELECT * FROM test1 ORDER BY a, b;
QUERY PLAN
`--SCAN TABLE test1 USING COVERING INDEX test1_idx
sqlite> EXPLAIN QUERY PLAN SELECT * FROM test1 ORDER BY a, b DESC;
QUERY PLAN
|--SCAN TABLE test1 USING COVERING INDEX test1_idx
`--USE TEMP B-TREE FOR RIGHT PART OF ORDER BY

The first one uses the index for all sorting, but the second one only uses
it for sorting a, not b. I feel like the descending sort could make use of
the index too, just reading the b sections backwards to get the right
order. Is there something I'm overlooking that would make this sort of
optimization impractical or otherwise a bad idea?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users