Thanks Keith, I'll give it a go and let you know! I still don't get how that differs from 2) or 4) below, though.
Thanks again! Gerlando Il dom 3 feb 2019, 00:27 Keith Medcalf <[email protected]> ha scritto: > > Like this? > > SELECT rolling.source1, > rolling.source2, > ts, > value > FROM ( > select distinct source1, > source2 > from rolling > where source1 = 'aaa' > ) as x > JOIN rolling > ON rolling.source1 = x.source1 > AND rolling.source2 = x.source2 > WHERE ts > 1 > AND ts < 10 > ORDER BY 1,2,3; > > > --- > The fact that there's a Highway to Hell but only a Stairway to Heaven says > a lot about anticipated traffic volume. > > >-----Original Message----- > >From: sqlite-users [mailto:sqlite-users- > >[email protected]] On Behalf Of Gerlando Falauto > >Sent: Saturday, 2 February, 2019 15:20 > >To: SQLite mailing list > >Subject: Re: [sqlite] Min/Max and skip-scan optimizations > > > >Hi, > >it's me again, struggling with indices once again. > >What I'm now trying to do is filter on source1 and by range of > >timestamp. > >Results should be naturally ordered by source1, source2,ts. > > > >1) > > > >SELECT source1, source2, ts, value > >FROM rolling > >WHERE source1 = 'aaa' > >AND ts > 1 AND ts < 10; > > > >QUERY PLAN > >`--SEARCH TABLE rolling USING INDEX sources (ANY(source1) AND > >ANY(source2) > >AND ts>? AND ts<?) > > > >This looks pretty OK on the sample dataset. > >For some reason though this doesn't really work like that with the > >real > >dataset (which has an extra index on ts only), > >and that seems to kick in instead of the real index. > >So the query plan comes out completely different, and rows are not > >naturally sorted by source1, source2, as I'd like. > >[I know I can use "+ts" and/or drop the "ts" index altogether, but > >I'm > >trying to make a point here...] > >So I add an extra ORDER BY clause: > > > >2) > > > >SELECT source1, source2, ts, value > >FROM rolling > >WHERE source1 = 'aaa' > > AND ts > 1 AND ts < 100000000 > >ORDER BY source1, source2, ts; > > > >QUERY PLAN > >|--SEARCH TABLE rolling USING INDEX sources (ANY(source1) AND > >ANY(source2) > >AND ts>? AND ts<?) > >`--USE TEMP B-TREE FOR RIGHT PART OF ORDER BY > > > >Here I don't really understand why a TEMP B-TREE is required at all. > >Apparently, this adds extra processing at the end so I don't start > >pulling > >any rows until much later. > >If the index is used, data would be *naturally* sorted by its key, so > >I > >don't really understand. > >So I recur to a subquery: > > > >3) > > > >SELECT source1, source2, ts, value > >FROM rolling > >WHERE ((source1, source2) in > > (SELECT DISTINCT source1, source2 from rolling where > >source1='aaa') > > AND ts > 1 AND ts < 10) > > > >QUERY PLAN > >|--SEARCH TABLE rolling USING INDEX sources (source1=? AND source2=? > >AND > >ts>? AND ts<?) > >`--LIST SUBQUERY > > `--SEARCH TABLE rolling USING COVERING INDEX sources (source1=?) > > > >This also seems to yield the right results, in the right order. > >But the order looks pretty much like a coincidence to me. > >So I'd rather explicitly state the sorting I'd like: > > > >4) > > > >SELECT source1, source2, ts, value > >FROM rolling > >WHERE ((source1, source2) in > > (SELECT DISTINCT source1, source2 from rolling where > >source1='aaa') > > AND ts > 1 AND ts < 10) > >ORDER BY source1, source2, ts > > > >QUERY PLAN > >|--SEARCH TABLE rolling USING INDEX sources (source1=? AND source2=? > >AND > >ts>? AND ts<?) > >|--LIST SUBQUERY > >| `--SEARCH TABLE rolling USING COVERING INDEX sources (source1=?) > >`--USE TEMP B-TREE FOR ORDER BY > > > >Here the b-tree is *NOT* for "RIGHT PART OF" ORDER BY, it seems like > >some > >global sorting > >which looks even worse than case 2. > > > >What am I doing wrong? Is this expected? > >I just don't seem to be able to get what would have been a pretty > >trivial > >task with > >last-century technologies (thinking of Paradox, dBase, CA-Clipper)... > > > >Thanks in advance! > >Gerlando > > > > > >On Mon, Jan 28, 2019 at 9:11 AM Gerlando Falauto > ><[email protected]> > >wrote: > > > >> YES! Thank you! > >> Many thanks for the ".eqp full" tip also, that really explains a > >lot > >> (though I don't really understand any of it yet). > >> > >> Have a great day! > >> Gerlando > >> > >> > >> On Mon, Jan 28, 2019 at 6:50 AM Keith Medcalf <[email protected]> > >wrote: > >> > >>> > >>> Do you perhaps want this: > >>> > >>> select source1, > >>> source2, > >>> ( > >>> select min(ts) > >>> from rolling > >>> where source1 = x.source1 > >>> and source2 = x.source2 > >>> ) > >>> from ( > >>> select distinct source1, > >>> source2 > >>> from rolling > >>> ) as x; > >>> > >>> SQLite version 3.27.0 2019-01-28 00:42:06 > >>> Enter ".help" for usage hints. > >>> Connected to a transient in-memory database. > >>> Use ".open FILENAME" to reopen on a persistent database. > >>> sqlite> .timer on > >>> sqlite> CREATE TABLE `rolling` ( > >>> ...> `source1` TEXT NOT NULL, > >>> ...> `source2` TEXT NOT NULL, > >>> ...> `ts` INTEGER NOT NULL, > >>> ...> `value` TEXT > >>> ...> ); > >>> Run Time: real 0.002 user 0.000000 sys 0.000000 > >>> sqlite> > >>> sqlite> CREATE INDEX `sources` ON `rolling` ( > >>> ...> `source1`, > >>> ...> `source2`, > >>> ...> `ts` > >>> ...> ); > >>> Run Time: real 0.001 user 0.000000 sys 0.000000 > >>> sqlite> > >>> sqlite> INSERT INTO rolling > >>> ...> WITH RECURSIVE > >>> ...> src1( source1 ) AS ( VALUES("aaa") UNION ALL > >VALUES("bbb") > >>> ), > >>> ...> src2( source2 ) AS ( VALUES("X1") UNION ALL > >VALUES("X2") > >>> UNION ALL > >>> ...> VALUES("X3") UNION ALL VALUES("X4") ), > >>> ...> cnt( ts, value) AS ( > >>> ...> VALUES( 0, "ZZZZ") > >>> ...> UNION ALL > >>> ...> SELECT ts+1, value FROM cnt LIMIT 1000000) > >>> ...> > >>> ...> select src1.source1, src2.source2, cnt.* from src1, > >src2, cnt; > >>> Run Time: real 8.920 user 8.843750 sys 0.078125 > >>> sqlite> > >>> sqlite> analyze; > >>> Run Time: real 1.285 user 1.281250 sys 0.000000 > >>> sqlite> .eqp full > >>> sqlite> > >>> sqlite> select source1, > >>> ...> source2, > >>> ...> ( > >>> ...> select min(ts) > >>> ...> from rolling > >>> ...> where source1 = x.source1 > >>> ...> and source2 = x.source2 > >>> ...> ) > >>> ...> from ( > >>> ...> select distinct source1, > >>> ...> source2 > >>> ...> from rolling > >>> ...> ) as x; > >>> QUERY PLAN > >>> |--CO-ROUTINE 2 > >>> | `--SCAN TABLE rolling USING COVERING INDEX sources (~7864320 > >rows) > >>> |--SCAN SUBQUERY 2 AS x (~7864320 rows) > >>> `--CORRELATED SCALAR SUBQUERY 1 > >>> `--SEARCH TABLE rolling USING COVERING INDEX sources (source1=? > >AND > >>> source2=?) (~983040 rows) > >>> addr opcode p1 p2 p3 p4 p5 comment > >>> ---- ------------- ---- ---- ---- ------------- -- -------- > >----- > >>> 0 Init 0 64 0 00 Start at > >64 > >>> 1 InitCoroutine 1 23 2 00 x > >>> 2 Null 1 4 0 08 > >r[4]=NULL > >>> 3 OpenRead 4 3 0 k(4,,,,) 00 root=3 > >iDb=0; > >>> sources > >>> 4 ColumnsUsed 4 0 0 3 00 > >>> 5 Explain 5 0 0 SCAN TABLE rolling USING > >COVERING > >>> INDEX sources (~7864320 rows) 00 > >>> 6 Noop 0 0 0 00 Begin > >>> WHERE-loop0: rolling > >>> 7 Rewind 4 21 2 0 00 > >>> 8 Noop 0 0 0 00 > >Begin > >>> WHERE-core > >>> 9 Column 4 0 2 00 > >>> r[2]=rolling.source1 > >>> 10 Column 4 1 3 00 > >>> r[3]=rolling.source2 > >>> 11 Ne 2 13 4 (BINARY) 80 if > >>> r[4]!=r[2] goto 13 > >>> 12 Eq 3 20 5 (BINARY) 80 if > >>> r[5]==r[3] goto 20 > >>> 13 Copy 2 4 1 00 > >>> r[4..5]=r[2..3] > >>> 14 Yield 1 0 0 00 > >>> 15 Noop 0 0 0 00 End > >>> WHERE-core > >>> 16 Column 4 0 6 00 > >r[6]= > >>> 17 Column 4 1 7 00 > >r[7]= > >>> 18 SeekGT 4 21 6 2 00 > >key=r[6..7] > >>> 19 Goto 1 8 0 00 > >>> 20 Next 4 8 0 01 > >>> 21 Noop 0 0 0 00 End > >>> WHERE-loop0: rolling > >>> 22 EndCoroutine 1 0 0 00 > >>> 23 Explain 23 0 0 SCAN SUBQUERY 2 AS x > >(~7864320 > >>> rows) 00 > >>> 24 Noop 0 0 0 00 Begin > >>> WHERE-loop0: x > >>> 25 InitCoroutine 1 0 2 00 > >>> 26 Yield 1 62 0 00 next > >row of x > >>> 27 Noop 0 0 0 00 Begin > >>> WHERE-core > >>> 28 Copy 2 9 0 00 > >r[9]=r[2]; > >>> x.source1 > >>> 29 Copy 3 10 0 00 > >r[10]=r[3]; > >>> x.source2 > >>> 30 Null 0 12 12 00 > >>> r[12..12]=NULL; Init subquery result > >>> 31 Integer 1 13 0 00 > >r[13]=1; > >>> LIMIT counter > >>> 32 Null 0 14 15 00 > >r[14..15]=NULL > >>> 33 OpenRead 5 3 0 k(4,,,,) 00 root=3 > >iDb=0; > >>> sources > >>> 34 ColumnsUsed 5 0 0 7 00 > >>> 35 Explain 35 0 0 SEARCH TABLE rolling > >USING > >>> COVERING INDEX sources (source1=? AND source2=?) (~983040 rows) > >00 > >>> 36 Noop 0 0 0 00 Begin > >>> WHERE-loop0: rolling > >>> 37 Copy 2 16 0 00 > >r[16]=r[2]; > >>> x.source1 > >>> 38 Copy 3 17 0 00 > >r[17]=r[3]; > >>> x.source2 > >>> 39 CursorHint 5 0 0 > >AND(EQ(c0,r[16]),EQ(c1,r[17])) > >>> 00 > >>> 40 Copy 2 18 0 00 > >r[18]=r[2]; > >>> x.source1 > >>> 41 IsNull 18 54 0 00 if > >>> r[18]==NULL goto 54 > >>> 42 Copy 3 19 0 00 > >r[19]=r[3]; > >>> x.source2 > >>> 43 IsNull 19 54 0 00 if > >>> r[19]==NULL goto 54 > >>> 44 Null 0 20 0 00 > >r[20]=NULL > >>> 45 SeekGT 5 54 18 3 00 > >key=r[18..20] > >>> 46 IdxGT 5 54 18 2 00 > >>> key=r[18..19] > >>> 47 Noop 0 0 0 00 > >Begin > >>> WHERE-core > >>> 48 Column 5 2 21 00 > >>> r[21]=rolling.ts > >>> 49 CollSeq 0 0 0 (BINARY) 00 > >>> 50 AggStep 0 21 14 min(1) 01 > >accum=r[14] > >>> step(r[21]) > >>> 51 Goto 0 55 0 00 > >min() by > >>> index > >>> 52 Noop 0 0 0 00 End > >>> WHERE-core > >>> 53 Next 5 46 0 00 > >>> 54 Noop 0 0 0 00 End > >>> WHERE-loop0: rolling > >>> 55 AggFinal 14 1 0 min(1) 00 > >accum=r[14] > >>> N=1 > >>> 56 Copy 14 12 0 00 > >r[12]=r[14] > >>> 57 DecrJumpZero 13 58 0 00 if > >>> (--r[13])==0 goto 58 > >>> 58 Copy 12 11 0 00 > >r[11]=r[12] > >>> 59 ResultRow 9 3 0 00 > >>> output=r[9..11] > >>> 60 Noop 0 0 0 00 End > >WHERE-core > >>> 61 Goto 0 26 0 00 > >>> 62 Noop 0 0 0 00 End > >>> WHERE-loop0: x > >>> 63 Halt 0 0 0 00 > >>> 64 Transaction 0 0 3 0 01 > >>> usesStmtJournal=0 > >>> 65 Goto 0 1 0 00 > >>> aaa|X1|0 > >>> aaa|X2|0 > >>> aaa|X3|0 > >>> aaa|X4|0 > >>> bbb|X1|0 > >>> bbb|X2|0 > >>> bbb|X3|0 > >>> bbb|X4|0 > >>> Run Time: real 0.134 user 0.000000 sys 0.000000 > >>> sqlite> > >>> > >>> --- > >>> The fact that there's a Highway to Hell but only a Stairway to > >Heaven > >>> says a lot about anticipated traffic volume. > >>> > >>> > >>> > >>> > >>> _______________________________________________ > >>> sqlite-users mailing list > >>> [email protected] > >>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite- > >users > >>> > >> > >_______________________________________________ > >sqlite-users mailing list > >[email protected] > >http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > > > > _______________________________________________ > sqlite-users mailing list > [email protected] > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list [email protected] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

