Re: [sqlite] Indexing WHERE with GROUP BY and ORDER BY

Hamish Allan Mon, 03 Apr 2017 07:15:58 -0700

This second one also has the advantage that I understand what's being
expressed! 🙂


This works perfectly, thank you! (and thank you to all others that replied).

I'm not sure I understand the indexing, though. The query plan without
indexes is:

0|0|0|SCAN TABLE x AS x1
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 1
1|0|0|SEARCH TABLE x AS x2

If I subsequently CREATE INDEX b_c_index ON x (b, c):

0|0|0|SEARCH TABLE x AS x1 USING INDEX b_c_index (b=?)
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 1
1|0|0|SEARCH TABLE x AS x2 USING INDEX b_c_index (b=? AND c=?)

So far so good. But if I additionally CREATE INDEX b_a_index ON x (b, a):

0|0|0|SEARCH TABLE x AS x1 USING INDEX b_a_index (b=?)
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 1
1|0|0|SEARCH TABLE x AS x2 USING INDEX b_a_index (b=?)

I've not dropped the b_c_index. Why does the subquery no longer use it?

Thanks,
Hamish



On 3 April 2017 at 10:20, R Smith <rsm...@rsweb.co.za> wrote:
> Another option, if a is simple and a suitable index exists, is to simply get
> the max in a correlated subquery, so this should work faster even:
>
> SELECT c,d FROM x AS x1 WHERE b=1 AND a=(SELECT MAX(x2.a) FROM x AS x2 WHERE
> x2.b=x1.b AND x2.c=x1.c)
>
> This needs no grouping because the sub-query ensures it, unless a can have
> duplicate values for any one c value.
>
>
>
>
> On 2017/04/03 11:09 AM, R Smith wrote:
>>
>>
>>
>> On 2017/04/03 10:51 AM, Hamish Allan wrote:
>>>
>>> Ah. My purpose is to determine "d for the most recent c with b=1",
>>> with "most recent" being "largest a".
>>>
>>> My query had been working for this, but I'd only been adding
>>> monotonically increasing values for "a" in my tests. I just tried
>>> doing otherwise and found that I had been relying on an implementation
>>> detail.
>>>
>>> Is there any way I can perform this collation at query time, or do I
>>> need to do it programmatically later on?
>>
>>
>> WITH CLT(d,c,a) AS (
>>   SELECT d,c,a FROM x WHERE b = 1 GROUP BY c,a
>> ), CMX(c,mxa) AS (
>>   SELECT c, MAX(a) FROM CLT GROUP BY c
>> )
>> SELECT d
>>   FROM CLT
>>   JOIN CMX ON CMX.c = CLT.c AND CMX.mxa = CLT.a
>> ;
>>
>> Zero Order-by's
>>
>>
>>
>>>
>>> Thanks,
>>> Hamish
>>>
>>>
>>>
>>>
>>>
>>> On 3 April 2017 at 00:41, Keith Medcalf <kmedc...@dessus.com> wrote:
>>>>
>>>> What is the purpose of the ORDER BY?
>>>>
>>>> The value of a used for the order by is from some random row in the
>>>> grouping of c.  Are there relationships between a, b, c, d that you have 
>>>> not
>>>> documented nor told us about?
>>>>
>>>> In any case, your most efficient index is on (b, c).
>>>>
>>>> The order by is useless.
>>>>
>>>>> -----Original Message-----
>>>>> From: sqlite-users
>>>>> [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
>>>>> On Behalf Of Hamish Allan
>>>>> Sent: Sunday, 2 April, 2017 17:28
>>>>> To: sqlite-users@mailinglists.sqlite.org
>>>>> Subject: [sqlite] Indexing WHERE with GROUP BY and ORDER BY
>>>>>
>>>>> Given a table:
>>>>>
>>>>> CREATE TABLE x (a INT, b INT, c TEXT, d TEXT);
>>>>>
>>>>> the query:
>>>>>
>>>>> SELECT d FROM x WHERE b = 1 GROUP BY c ORDER BY a;
>>>>>
>>>>> shows the following plan, without indexes:
>>>>>
>>>>> 0|0|0|SCAN TABLE x
>>>>> 0|0|0|USE TEMP B-TREE FOR GROUP BY
>>>>> 0|0|0|USE TEMP B-TREE FOR ORDER BY
>>>>>
>>>>> I can create an index to cover the WHERE clause:
>>>>>
>>>>> CREATE INDEX b_index ON x (b);
>>>>>
>>>>> which gives the plan:
>>>>>
>>>>> 0|0|0|SEARCH TABLE x USING INDEX b_index (b=?)
>>>>> 0|0|0|USE TEMP B-TREE FOR GROUP BY
>>>>> 0|0|0|USE TEMP B-TREE FOR ORDER BY
>>>>>
>>>>> or I can create an index to cover the GROUP BY clause:
>>>>>
>>>>> DROP INDEX b_index;
>>>>> CREATE INDEX c_index ON x (c);
>>>>>
>>>>> which gives the plan:
>>>>>
>>>>> 0|0|0|SCAN TABLE x USING INDEX c_index
>>>>> 0|0|0|USE TEMP B-TREE FOR ORDER BY
>>>>>
>>>>> but I don't seem to be able to create a combined index to cover both
>>>>> the WHERE and GROUP BY clauses (let alone the ORDER BY clause).
>>>>>
>>>>> Am I missing something, or is this just not possible?
>>>>>
>>>>> If it's not possible, which is the more efficient of those indexes  --
>>>>> or is there a third way, using an index for the ORDER BY, which would
>>>>> be more efficient still?
>>>>>
>>>>> Thanks,
>>>>> Hamish
>>>>> _______________________________________________
>>>>> sqlite-users mailing list
>>>>> sqlite-users@mailinglists.sqlite.org
>>>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> sqlite-users mailing list
>>>> sqlite-users@mailinglists.sqlite.org
>>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>
>>> _______________________________________________
>>> sqlite-users mailing list
>>> sqlite-users@mailinglists.sqlite.org
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>>
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Indexing WHERE with GROUP BY and ORDER BY

Reply via email to