Hi, it's me again, struggling with indices once again. What I'm now trying to do is filter on source1 and by range of timestamp. Results should be naturally ordered by source1, source2,ts.
1) SELECT source1, source2, ts, value FROM rolling WHERE source1 = 'aaa' AND ts > 1 AND ts < 10; QUERY PLAN `--SEARCH TABLE rolling USING INDEX sources (ANY(source1) AND ANY(source2) AND ts>? AND ts<?) This looks pretty OK on the sample dataset. For some reason though this doesn't really work like that with the real dataset (which has an extra index on ts only), and that seems to kick in instead of the real index. So the query plan comes out completely different, and rows are not naturally sorted by source1, source2, as I'd like. [I know I can use "+ts" and/or drop the "ts" index altogether, but I'm trying to make a point here...] So I add an extra ORDER BY clause: 2) SELECT source1, source2, ts, value FROM rolling WHERE source1 = 'aaa' AND ts > 1 AND ts < 100000000 ORDER BY source1, source2, ts; QUERY PLAN |--SEARCH TABLE rolling USING INDEX sources (ANY(source1) AND ANY(source2) AND ts>? AND ts<?) `--USE TEMP B-TREE FOR RIGHT PART OF ORDER BY Here I don't really understand why a TEMP B-TREE is required at all. Apparently, this adds extra processing at the end so I don't start pulling any rows until much later. If the index is used, data would be *naturally* sorted by its key, so I don't really understand. So I recur to a subquery: 3) SELECT source1, source2, ts, value FROM rolling WHERE ((source1, source2) in (SELECT DISTINCT source1, source2 from rolling where source1='aaa') AND ts > 1 AND ts < 10) QUERY PLAN |--SEARCH TABLE rolling USING INDEX sources (source1=? AND source2=? AND ts>? AND ts<?) `--LIST SUBQUERY `--SEARCH TABLE rolling USING COVERING INDEX sources (source1=?) This also seems to yield the right results, in the right order. But the order looks pretty much like a coincidence to me. So I'd rather explicitly state the sorting I'd like: 4) SELECT source1, source2, ts, value FROM rolling WHERE ((source1, source2) in (SELECT DISTINCT source1, source2 from rolling where source1='aaa') AND ts > 1 AND ts < 10) ORDER BY source1, source2, ts QUERY PLAN |--SEARCH TABLE rolling USING INDEX sources (source1=? AND source2=? AND ts>? AND ts<?) |--LIST SUBQUERY | `--SEARCH TABLE rolling USING COVERING INDEX sources (source1=?) `--USE TEMP B-TREE FOR ORDER BY Here the b-tree is *NOT* for "RIGHT PART OF" ORDER BY, it seems like some global sorting which looks even worse than case 2. What am I doing wrong? Is this expected? I just don't seem to be able to get what would have been a pretty trivial task with last-century technologies (thinking of Paradox, dBase, CA-Clipper)... Thanks in advance! Gerlando On Mon, Jan 28, 2019 at 9:11 AM Gerlando Falauto <gerlando.fala...@gmail.com> wrote: > YES! Thank you! > Many thanks for the ".eqp full" tip also, that really explains a lot > (though I don't really understand any of it yet). > > Have a great day! > Gerlando > > > On Mon, Jan 28, 2019 at 6:50 AM Keith Medcalf <kmedc...@dessus.com> wrote: > >> >> Do you perhaps want this: >> >> select source1, >> source2, >> ( >> select min(ts) >> from rolling >> where source1 = x.source1 >> and source2 = x.source2 >> ) >> from ( >> select distinct source1, >> source2 >> from rolling >> ) as x; >> >> SQLite version 3.27.0 2019-01-28 00:42:06 >> Enter ".help" for usage hints. >> Connected to a transient in-memory database. >> Use ".open FILENAME" to reopen on a persistent database. >> sqlite> .timer on >> sqlite> CREATE TABLE `rolling` ( >> ...> `source1` TEXT NOT NULL, >> ...> `source2` TEXT NOT NULL, >> ...> `ts` INTEGER NOT NULL, >> ...> `value` TEXT >> ...> ); >> Run Time: real 0.002 user 0.000000 sys 0.000000 >> sqlite> >> sqlite> CREATE INDEX `sources` ON `rolling` ( >> ...> `source1`, >> ...> `source2`, >> ...> `ts` >> ...> ); >> Run Time: real 0.001 user 0.000000 sys 0.000000 >> sqlite> >> sqlite> INSERT INTO rolling >> ...> WITH RECURSIVE >> ...> src1( source1 ) AS ( VALUES("aaa") UNION ALL VALUES("bbb") >> ), >> ...> src2( source2 ) AS ( VALUES("X1") UNION ALL VALUES("X2") >> UNION ALL >> ...> VALUES("X3") UNION ALL VALUES("X4") ), >> ...> cnt( ts, value) AS ( >> ...> VALUES( 0, "ZZZZ") >> ...> UNION ALL >> ...> SELECT ts+1, value FROM cnt LIMIT 1000000) >> ...> >> ...> select src1.source1, src2.source2, cnt.* from src1, src2, cnt; >> Run Time: real 8.920 user 8.843750 sys 0.078125 >> sqlite> >> sqlite> analyze; >> Run Time: real 1.285 user 1.281250 sys 0.000000 >> sqlite> .eqp full >> sqlite> >> sqlite> select source1, >> ...> source2, >> ...> ( >> ...> select min(ts) >> ...> from rolling >> ...> where source1 = x.source1 >> ...> and source2 = x.source2 >> ...> ) >> ...> from ( >> ...> select distinct source1, >> ...> source2 >> ...> from rolling >> ...> ) as x; >> QUERY PLAN >> |--CO-ROUTINE 2 >> | `--SCAN TABLE rolling USING COVERING INDEX sources (~7864320 rows) >> |--SCAN SUBQUERY 2 AS x (~7864320 rows) >> `--CORRELATED SCALAR SUBQUERY 1 >> `--SEARCH TABLE rolling USING COVERING INDEX sources (source1=? AND >> source2=?) (~983040 rows) >> addr opcode p1 p2 p3 p4 p5 comment >> ---- ------------- ---- ---- ---- ------------- -- ------------- >> 0 Init 0 64 0 00 Start at 64 >> 1 InitCoroutine 1 23 2 00 x >> 2 Null 1 4 0 08 r[4]=NULL >> 3 OpenRead 4 3 0 k(4,,,,) 00 root=3 iDb=0; >> sources >> 4 ColumnsUsed 4 0 0 3 00 >> 5 Explain 5 0 0 SCAN TABLE rolling USING COVERING >> INDEX sources (~7864320 rows) 00 >> 6 Noop 0 0 0 00 Begin >> WHERE-loop0: rolling >> 7 Rewind 4 21 2 0 00 >> 8 Noop 0 0 0 00 Begin >> WHERE-core >> 9 Column 4 0 2 00 >> r[2]=rolling.source1 >> 10 Column 4 1 3 00 >> r[3]=rolling.source2 >> 11 Ne 2 13 4 (BINARY) 80 if >> r[4]!=r[2] goto 13 >> 12 Eq 3 20 5 (BINARY) 80 if >> r[5]==r[3] goto 20 >> 13 Copy 2 4 1 00 >> r[4..5]=r[2..3] >> 14 Yield 1 0 0 00 >> 15 Noop 0 0 0 00 End >> WHERE-core >> 16 Column 4 0 6 00 r[6]= >> 17 Column 4 1 7 00 r[7]= >> 18 SeekGT 4 21 6 2 00 key=r[6..7] >> 19 Goto 1 8 0 00 >> 20 Next 4 8 0 01 >> 21 Noop 0 0 0 00 End >> WHERE-loop0: rolling >> 22 EndCoroutine 1 0 0 00 >> 23 Explain 23 0 0 SCAN SUBQUERY 2 AS x (~7864320 >> rows) 00 >> 24 Noop 0 0 0 00 Begin >> WHERE-loop0: x >> 25 InitCoroutine 1 0 2 00 >> 26 Yield 1 62 0 00 next row of x >> 27 Noop 0 0 0 00 Begin >> WHERE-core >> 28 Copy 2 9 0 00 r[9]=r[2]; >> x.source1 >> 29 Copy 3 10 0 00 r[10]=r[3]; >> x.source2 >> 30 Null 0 12 12 00 >> r[12..12]=NULL; Init subquery result >> 31 Integer 1 13 0 00 r[13]=1; >> LIMIT counter >> 32 Null 0 14 15 00 r[14..15]=NULL >> 33 OpenRead 5 3 0 k(4,,,,) 00 root=3 iDb=0; >> sources >> 34 ColumnsUsed 5 0 0 7 00 >> 35 Explain 35 0 0 SEARCH TABLE rolling USING >> COVERING INDEX sources (source1=? AND source2=?) (~983040 rows) 00 >> 36 Noop 0 0 0 00 Begin >> WHERE-loop0: rolling >> 37 Copy 2 16 0 00 r[16]=r[2]; >> x.source1 >> 38 Copy 3 17 0 00 r[17]=r[3]; >> x.source2 >> 39 CursorHint 5 0 0 AND(EQ(c0,r[16]),EQ(c1,r[17])) >> 00 >> 40 Copy 2 18 0 00 r[18]=r[2]; >> x.source1 >> 41 IsNull 18 54 0 00 if >> r[18]==NULL goto 54 >> 42 Copy 3 19 0 00 r[19]=r[3]; >> x.source2 >> 43 IsNull 19 54 0 00 if >> r[19]==NULL goto 54 >> 44 Null 0 20 0 00 r[20]=NULL >> 45 SeekGT 5 54 18 3 00 key=r[18..20] >> 46 IdxGT 5 54 18 2 00 >> key=r[18..19] >> 47 Noop 0 0 0 00 Begin >> WHERE-core >> 48 Column 5 2 21 00 >> r[21]=rolling.ts >> 49 CollSeq 0 0 0 (BINARY) 00 >> 50 AggStep 0 21 14 min(1) 01 accum=r[14] >> step(r[21]) >> 51 Goto 0 55 0 00 min() by >> index >> 52 Noop 0 0 0 00 End >> WHERE-core >> 53 Next 5 46 0 00 >> 54 Noop 0 0 0 00 End >> WHERE-loop0: rolling >> 55 AggFinal 14 1 0 min(1) 00 accum=r[14] >> N=1 >> 56 Copy 14 12 0 00 r[12]=r[14] >> 57 DecrJumpZero 13 58 0 00 if >> (--r[13])==0 goto 58 >> 58 Copy 12 11 0 00 r[11]=r[12] >> 59 ResultRow 9 3 0 00 >> output=r[9..11] >> 60 Noop 0 0 0 00 End WHERE-core >> 61 Goto 0 26 0 00 >> 62 Noop 0 0 0 00 End >> WHERE-loop0: x >> 63 Halt 0 0 0 00 >> 64 Transaction 0 0 3 0 01 >> usesStmtJournal=0 >> 65 Goto 0 1 0 00 >> aaa|X1|0 >> aaa|X2|0 >> aaa|X3|0 >> aaa|X4|0 >> bbb|X1|0 >> bbb|X2|0 >> bbb|X3|0 >> bbb|X4|0 >> Run Time: real 0.134 user 0.000000 sys 0.000000 >> sqlite> >> >> --- >> The fact that there's a Highway to Hell but only a Stairway to Heaven >> says a lot about anticipated traffic volume. >> >> >> >> >> _______________________________________________ >> sqlite-users mailing list >> sqlite-users@mailinglists.sqlite.org >> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >> > _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users