Aha ok, great! Now, forgive me, but there is still a difference in the byte code, and I'm having a hard time decyphering it. The difference from the non-distinct and the distinct is:
* The table t1 is opened * There is an extra Seek in the scanning loop, it looks like it moves the read pointer of table t1 to the indexed rowid. There are no reads occurring, but won't the seek cause I/O? See log below. sqlite> create index tb on t1(b); sqlite> select distinct b from t1; --EQP-- 0,0,0,SCAN TABLE t1 USING COVERING INDEX tb addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 11 0 00 Start at 11 1 Null 1 2 0 08 r[2]=NULL 2 OpenRead 2 4 0 k(2,,) 00 root=4 iDb=0; tb 3 Explain 0 0 0 SCAN TABLE t1 USING COVERING INDEX tb 00 4 Rewind 2 10 1 0 00 5 Column 2 0 1 00 r[1]=t1.b 6 Eq 1 9 2 (BINARY) 80 if r[2]==r[1] goto 9 7 Copy 1 2 0 00 r[2]=r[1] 8 ResultRow 1 1 0 00 output=r[1] 9 Next 2 5 0 01 10 Halt 0 0 0 00 11 Transaction 0 0 3 0 01 usesStmtJournal=0 12 Goto 0 1 0 00 Error: index tb already exists sqlite> select distinct length(b) from t1; --EQP-- 0,0,0,SCAN TABLE t1 USING INDEX t addr opcode p1 p2 p3 p4 p5 comment ---- ------------- ---- ---- ---- ------------- -- ------------- 0 Init 0 13 0 00 Start at 13 1 Null 1 2 0 08 r[2]=NULL 2 OpenRead 0 2 0 2 00 root=2 iDb=0; t1 3 OpenRead 2 3 0 k(2,,) 00 root=3 iDb=0; t 4 Explain 0 0 0 SCAN TABLE t1 USING INDEX t 00 5 Rewind 2 12 1 0 00 6 Seek 2 0 0 00 Move 0 to 2.rowid 7 Column 2 0 1 00 r[1]= 8 Eq 1 11 2 80 if r[2]==r[1] goto 11 9 Copy 1 2 0 00 r[2]=r[1] 10 ResultRow 1 1 0 00 output=r[1] 11 Next 2 6 0 01 12 Halt 0 0 0 00 13 Transaction 0 0 3 0 01 usesStmtJournal=0 14 Goto 0 1 0 00 On Fri, Aug 11, 2017 at 1:31 PM Richard Hipp <d...@sqlite.org> wrote: > On 8/11/17, Wout Mertens <wout.mert...@gmail.com> wrote: > > So, am I correct in thinking that an index on expressions already has all > > the required data to answer e.g. a SELECT DISTINCT? > > > > If so, that could be an optimization? > > > > Can I request this optimization to be made? :) > > > > That optimization is already made. Look at the byte-code: > > CREATE TABLE t1(a,b,c); > CREATE INDEX t1x1 ON t1(length(b)); > .eqp full > SELECT DISTINCT length(b) FROM t1; > > The problem is that when the "Explain" instruction (which gives the > "explain query plan" output) is generated, the query planner does not > yet realize that it can get by with only using the index. It never > actually uses the original table - it only uses the index - but at the > time that "Explain" is generated, that fact is unknown. And so the > EXPLAIN QUERY PLAN output is not 100% accurate. > > I think fixing that should be low-priority. > > -- > D. Richard Hipp > d...@sqlite.org > _______________________________________________ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users