Black, Michael (IS) <michael.bla...@ngc.com> wrote:
> What does distinct do?
> sqlite> explain query plan select distinct (col1) from x where col1 is not 
> null;
> sele  order          from  deta
> ----  -------------  ----  ----
> 0     0              0     SCAN TABLE x USING COVERING INDEX col1index 
> (~500000 rows)
> OK...we're still using an index here...

In a sense, yes - but note that you have SCAN TABLE USING INDEX; compare and 
contrast with SEARCH TABLE USING INDEX.

DISTINCT effectively implies ORDER BY. The way DISTINCT is implemeted, SQLite 
enumerates the rows in order (and the index is helpful here) and discards any 
where the value is the same as in the previous row. You've just replaced a full 
table scan with a full index scan (and changed the meaning of the query in the 
process).

> so using this subselect we do this:
> sqlite> explain query plan select * from x where col1 in (select distinct 
> (col1) from x where col1 is not null);
> sele  order          from  deta
> ----  -------------  ----  ----
> 0     0              0     SEARCH TABLE x USING INDEX col1index (col1=?) 
> (~250 rows)
> 0     0              0     EXECUTE LIST SUBQUERY 1
> 1     0              0     SCAN TABLE x USING COVERING INDEX col1index 
> (~500000 rows)

So now you are scanning the table, and also searching on top of that. How is 
this an improvement?

> Performance probably depends on how many "not null" things there are...if not 
> many of them this may not be any faster.

In fact, this will always be strictly slower than a straightforward table scan.

> A count() could be a lot faster though I'd think.

How so? I'm not even sure how you would use count() here, let alone use it in a 
way that leads to performance gains.
-- 
Igor Tandetnik

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to