Yes, I can see that there is no point in SQLite having this optimization. Of course I can do this optimization myself (just run select count(*) from table, instead of using distinct) if I know all conditions are met. This wasn't anything practical I needed, just a fleeting thought.
RBS On Fri, Oct 2, 2015 at 6:14 PM, R.Smith <rsmith at rsweb.co.za> wrote: > > > On 2015-10-02 05:41 PM, Bart Smissaert wrote: > >> you're just throwing random terms around and hoping something sticks. >>> >> Not sure where you got that idea from, but let me explain better: >> >> Say we have a table: >> >> CREATE TABLE TABLE1 >> ([FIELD1] INTEGER, >> [FIELD2] TEXT, >> [FIELD3] TEXT, >> [FIELD4] REAL) >> >> and we have a unique index on all 4 fields: >> >> CREATE UNIQUE INDEX IDX1 >> ON TABLE1 >> (FIELD1, FIELD2, FIELD3, FIELD4) >> >> Now I want to count all the unique rows of this table (without any >> limiting >> where or join or whatever). >> >> Unless I am mistaken here this is done with a SQL like this: >> >> SELECT COUNT(*) AS UNIQUE_ROWS >> FROM (SELECT DISTINCT * FROM TABLE1) >> >> But if we take advantage of the fact that this table has a unique index on >> all the fields of the table >> we can simply do this SQL: >> >> SELECT COUNT(*) FROM TABLE1 >> >> It will have the same result and that is a lot faster. >> Hope this makes it clear. >> > > This explanation is much more clear - thanks. I think the initial source > of confusion is the original explanation being hard to follow, but now that > it is clear and with SQL, we can see exactly what you mean. > > In the case where where the table has rows that are not unique, there is > no gain. > In the case where we could use the optimization, all the following has to > be true: > > A - The Table must NOT have a Primary Key, but > B - The table must have a complete covering Index and > C - The covering Index must be UNIQUE and > D - All fields must be NOT NULL > E - The Query must be a COUNT() aggregate query > F - There must NOT be any JOINs > G - There must NOT be a WHERE clause. > H - There must be a correlated sub-query which accesses the table, > I - The sub-query MUST force a DISTINCT Query output constraint, and > J - The constraint MUST be on the * wildcard identifier (or alternatively, > the complete field list). > > The only way this optimization can ever be accommodated is if ALL of the > above are true, and the odds of having all those things true are even less > than Lotto-winning odds. > > One might look at the sub-query alone and say "What if the sub query (or > any query) that Uses 'DISTINCT *' in a table that has a Unique covering > index without a where clause or joins can be optimized" - which it probably > can, but the odds of this are only very slightly less low than the first > case. > > Add to that the code that needs to be added to check for all those things > - it would probably make the query planner slower by a (very) insignificant > amount, but more still than the odds of running into such a query+schema > combination. > > That said, if you do have this situation sometimes and it does slow down > things for you - Perhaps a design change in the schemata would help? > > > _______________________________________________ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >