Like this?
SELECT rolling.source1,
rolling.source2,
ts,
value
FROM (
select distinct source1,
source2
from rolling
where source1 = 'aaa'
) as x
JOIN rolling
ON rolling.source1 = x.source1
AND rolling.source2 = x.source2
WHERE ts > 1
AND ts < 10
ORDER BY 1,2,3;
---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a
lot about anticipated traffic volume.
>-----Original Message-----
>From: sqlite-users [mailto:sqlite-users-
>[email protected]] On Behalf Of Gerlando Falauto
>Sent: Saturday, 2 February, 2019 15:20
>To: SQLite mailing list
>Subject: Re: [sqlite] Min/Max and skip-scan optimizations
>
>Hi,
>it's me again, struggling with indices once again.
>What I'm now trying to do is filter on source1 and by range of
>timestamp.
>Results should be naturally ordered by source1, source2,ts.
>
>1)
>
>SELECT source1, source2, ts, value
>FROM rolling
>WHERE source1 = 'aaa'
>AND ts > 1 AND ts < 10;
>
>QUERY PLAN
>`--SEARCH TABLE rolling USING INDEX sources (ANY(source1) AND
>ANY(source2)
>AND ts>? AND ts<?)
>
>This looks pretty OK on the sample dataset.
>For some reason though this doesn't really work like that with the
>real
>dataset (which has an extra index on ts only),
>and that seems to kick in instead of the real index.
>So the query plan comes out completely different, and rows are not
>naturally sorted by source1, source2, as I'd like.
>[I know I can use "+ts" and/or drop the "ts" index altogether, but
>I'm
>trying to make a point here...]
>So I add an extra ORDER BY clause:
>
>2)
>
>SELECT source1, source2, ts, value
>FROM rolling
>WHERE source1 = 'aaa'
> AND ts > 1 AND ts < 100000000
>ORDER BY source1, source2, ts;
>
>QUERY PLAN
>|--SEARCH TABLE rolling USING INDEX sources (ANY(source1) AND
>ANY(source2)
>AND ts>? AND ts<?)
>`--USE TEMP B-TREE FOR RIGHT PART OF ORDER BY
>
>Here I don't really understand why a TEMP B-TREE is required at all.
>Apparently, this adds extra processing at the end so I don't start
>pulling
>any rows until much later.
>If the index is used, data would be *naturally* sorted by its key, so
>I
>don't really understand.
>So I recur to a subquery:
>
>3)
>
>SELECT source1, source2, ts, value
>FROM rolling
>WHERE ((source1, source2) in
> (SELECT DISTINCT source1, source2 from rolling where
>source1='aaa')
> AND ts > 1 AND ts < 10)
>
>QUERY PLAN
>|--SEARCH TABLE rolling USING INDEX sources (source1=? AND source2=?
>AND
>ts>? AND ts<?)
>`--LIST SUBQUERY
> `--SEARCH TABLE rolling USING COVERING INDEX sources (source1=?)
>
>This also seems to yield the right results, in the right order.
>But the order looks pretty much like a coincidence to me.
>So I'd rather explicitly state the sorting I'd like:
>
>4)
>
>SELECT source1, source2, ts, value
>FROM rolling
>WHERE ((source1, source2) in
> (SELECT DISTINCT source1, source2 from rolling where
>source1='aaa')
> AND ts > 1 AND ts < 10)
>ORDER BY source1, source2, ts
>
>QUERY PLAN
>|--SEARCH TABLE rolling USING INDEX sources (source1=? AND source2=?
>AND
>ts>? AND ts<?)
>|--LIST SUBQUERY
>| `--SEARCH TABLE rolling USING COVERING INDEX sources (source1=?)
>`--USE TEMP B-TREE FOR ORDER BY
>
>Here the b-tree is *NOT* for "RIGHT PART OF" ORDER BY, it seems like
>some
>global sorting
>which looks even worse than case 2.
>
>What am I doing wrong? Is this expected?
>I just don't seem to be able to get what would have been a pretty
>trivial
>task with
>last-century technologies (thinking of Paradox, dBase, CA-Clipper)...
>
>Thanks in advance!
>Gerlando
>
>
>On Mon, Jan 28, 2019 at 9:11 AM Gerlando Falauto
><[email protected]>
>wrote:
>
>> YES! Thank you!
>> Many thanks for the ".eqp full" tip also, that really explains a
>lot
>> (though I don't really understand any of it yet).
>>
>> Have a great day!
>> Gerlando
>>
>>
>> On Mon, Jan 28, 2019 at 6:50 AM Keith Medcalf <[email protected]>
>wrote:
>>
>>>
>>> Do you perhaps want this:
>>>
>>> select source1,
>>> source2,
>>> (
>>> select min(ts)
>>> from rolling
>>> where source1 = x.source1
>>> and source2 = x.source2
>>> )
>>> from (
>>> select distinct source1,
>>> source2
>>> from rolling
>>> ) as x;
>>>
>>> SQLite version 3.27.0 2019-01-28 00:42:06
>>> Enter ".help" for usage hints.
>>> Connected to a transient in-memory database.
>>> Use ".open FILENAME" to reopen on a persistent database.
>>> sqlite> .timer on
>>> sqlite> CREATE TABLE `rolling` (
>>> ...> `source1` TEXT NOT NULL,
>>> ...> `source2` TEXT NOT NULL,
>>> ...> `ts` INTEGER NOT NULL,
>>> ...> `value` TEXT
>>> ...> );
>>> Run Time: real 0.002 user 0.000000 sys 0.000000
>>> sqlite>
>>> sqlite> CREATE INDEX `sources` ON `rolling` (
>>> ...> `source1`,
>>> ...> `source2`,
>>> ...> `ts`
>>> ...> );
>>> Run Time: real 0.001 user 0.000000 sys 0.000000
>>> sqlite>
>>> sqlite> INSERT INTO rolling
>>> ...> WITH RECURSIVE
>>> ...> src1( source1 ) AS ( VALUES("aaa") UNION ALL
>VALUES("bbb")
>>> ),
>>> ...> src2( source2 ) AS ( VALUES("X1") UNION ALL
>VALUES("X2")
>>> UNION ALL
>>> ...> VALUES("X3") UNION ALL VALUES("X4") ),
>>> ...> cnt( ts, value) AS (
>>> ...> VALUES( 0, "ZZZZ")
>>> ...> UNION ALL
>>> ...> SELECT ts+1, value FROM cnt LIMIT 1000000)
>>> ...>
>>> ...> select src1.source1, src2.source2, cnt.* from src1,
>src2, cnt;
>>> Run Time: real 8.920 user 8.843750 sys 0.078125
>>> sqlite>
>>> sqlite> analyze;
>>> Run Time: real 1.285 user 1.281250 sys 0.000000
>>> sqlite> .eqp full
>>> sqlite>
>>> sqlite> select source1,
>>> ...> source2,
>>> ...> (
>>> ...> select min(ts)
>>> ...> from rolling
>>> ...> where source1 = x.source1
>>> ...> and source2 = x.source2
>>> ...> )
>>> ...> from (
>>> ...> select distinct source1,
>>> ...> source2
>>> ...> from rolling
>>> ...> ) as x;
>>> QUERY PLAN
>>> |--CO-ROUTINE 2
>>> | `--SCAN TABLE rolling USING COVERING INDEX sources (~7864320
>rows)
>>> |--SCAN SUBQUERY 2 AS x (~7864320 rows)
>>> `--CORRELATED SCALAR SUBQUERY 1
>>> `--SEARCH TABLE rolling USING COVERING INDEX sources (source1=?
>AND
>>> source2=?) (~983040 rows)
>>> addr opcode p1 p2 p3 p4 p5 comment
>>> ---- ------------- ---- ---- ---- ------------- -- --------
>-----
>>> 0 Init 0 64 0 00 Start at
>64
>>> 1 InitCoroutine 1 23 2 00 x
>>> 2 Null 1 4 0 08
>r[4]=NULL
>>> 3 OpenRead 4 3 0 k(4,,,,) 00 root=3
>iDb=0;
>>> sources
>>> 4 ColumnsUsed 4 0 0 3 00
>>> 5 Explain 5 0 0 SCAN TABLE rolling USING
>COVERING
>>> INDEX sources (~7864320 rows) 00
>>> 6 Noop 0 0 0 00 Begin
>>> WHERE-loop0: rolling
>>> 7 Rewind 4 21 2 0 00
>>> 8 Noop 0 0 0 00
>Begin
>>> WHERE-core
>>> 9 Column 4 0 2 00
>>> r[2]=rolling.source1
>>> 10 Column 4 1 3 00
>>> r[3]=rolling.source2
>>> 11 Ne 2 13 4 (BINARY) 80 if
>>> r[4]!=r[2] goto 13
>>> 12 Eq 3 20 5 (BINARY) 80 if
>>> r[5]==r[3] goto 20
>>> 13 Copy 2 4 1 00
>>> r[4..5]=r[2..3]
>>> 14 Yield 1 0 0 00
>>> 15 Noop 0 0 0 00 End
>>> WHERE-core
>>> 16 Column 4 0 6 00
>r[6]=
>>> 17 Column 4 1 7 00
>r[7]=
>>> 18 SeekGT 4 21 6 2 00
>key=r[6..7]
>>> 19 Goto 1 8 0 00
>>> 20 Next 4 8 0 01
>>> 21 Noop 0 0 0 00 End
>>> WHERE-loop0: rolling
>>> 22 EndCoroutine 1 0 0 00
>>> 23 Explain 23 0 0 SCAN SUBQUERY 2 AS x
>(~7864320
>>> rows) 00
>>> 24 Noop 0 0 0 00 Begin
>>> WHERE-loop0: x
>>> 25 InitCoroutine 1 0 2 00
>>> 26 Yield 1 62 0 00 next
>row of x
>>> 27 Noop 0 0 0 00 Begin
>>> WHERE-core
>>> 28 Copy 2 9 0 00
>r[9]=r[2];
>>> x.source1
>>> 29 Copy 3 10 0 00
>r[10]=r[3];
>>> x.source2
>>> 30 Null 0 12 12 00
>>> r[12..12]=NULL; Init subquery result
>>> 31 Integer 1 13 0 00
>r[13]=1;
>>> LIMIT counter
>>> 32 Null 0 14 15 00
>r[14..15]=NULL
>>> 33 OpenRead 5 3 0 k(4,,,,) 00 root=3
>iDb=0;
>>> sources
>>> 34 ColumnsUsed 5 0 0 7 00
>>> 35 Explain 35 0 0 SEARCH TABLE rolling
>USING
>>> COVERING INDEX sources (source1=? AND source2=?) (~983040 rows)
>00
>>> 36 Noop 0 0 0 00 Begin
>>> WHERE-loop0: rolling
>>> 37 Copy 2 16 0 00
>r[16]=r[2];
>>> x.source1
>>> 38 Copy 3 17 0 00
>r[17]=r[3];
>>> x.source2
>>> 39 CursorHint 5 0 0
>AND(EQ(c0,r[16]),EQ(c1,r[17]))
>>> 00
>>> 40 Copy 2 18 0 00
>r[18]=r[2];
>>> x.source1
>>> 41 IsNull 18 54 0 00 if
>>> r[18]==NULL goto 54
>>> 42 Copy 3 19 0 00
>r[19]=r[3];
>>> x.source2
>>> 43 IsNull 19 54 0 00 if
>>> r[19]==NULL goto 54
>>> 44 Null 0 20 0 00
>r[20]=NULL
>>> 45 SeekGT 5 54 18 3 00
>key=r[18..20]
>>> 46 IdxGT 5 54 18 2 00
>>> key=r[18..19]
>>> 47 Noop 0 0 0 00
>Begin
>>> WHERE-core
>>> 48 Column 5 2 21 00
>>> r[21]=rolling.ts
>>> 49 CollSeq 0 0 0 (BINARY) 00
>>> 50 AggStep 0 21 14 min(1) 01
>accum=r[14]
>>> step(r[21])
>>> 51 Goto 0 55 0 00
>min() by
>>> index
>>> 52 Noop 0 0 0 00 End
>>> WHERE-core
>>> 53 Next 5 46 0 00
>>> 54 Noop 0 0 0 00 End
>>> WHERE-loop0: rolling
>>> 55 AggFinal 14 1 0 min(1) 00
>accum=r[14]
>>> N=1
>>> 56 Copy 14 12 0 00
>r[12]=r[14]
>>> 57 DecrJumpZero 13 58 0 00 if
>>> (--r[13])==0 goto 58
>>> 58 Copy 12 11 0 00
>r[11]=r[12]
>>> 59 ResultRow 9 3 0 00
>>> output=r[9..11]
>>> 60 Noop 0 0 0 00 End
>WHERE-core
>>> 61 Goto 0 26 0 00
>>> 62 Noop 0 0 0 00 End
>>> WHERE-loop0: x
>>> 63 Halt 0 0 0 00
>>> 64 Transaction 0 0 3 0 01
>>> usesStmtJournal=0
>>> 65 Goto 0 1 0 00
>>> aaa|X1|0
>>> aaa|X2|0
>>> aaa|X3|0
>>> aaa|X4|0
>>> bbb|X1|0
>>> bbb|X2|0
>>> bbb|X3|0
>>> bbb|X4|0
>>> Run Time: real 0.134 user 0.000000 sys 0.000000
>>> sqlite>
>>>
>>> ---
>>> The fact that there's a Highway to Hell but only a Stairway to
>Heaven
>>> says a lot about anticipated traffic volume.
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> sqlite-users mailing list
>>> [email protected]
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-
>users
>>>
>>
>_______________________________________________
>sqlite-users mailing list
>[email protected]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users