On Fri, Oct 2, 2015 at 9:41 AM, Bart Smissaert <bart.smissaert at gmail.com> wrote:
> > you're just throwing random terms around and hoping something
> > sticks.
>
> Not sure where you got that idea from, but let me explain better:
>
> Say we have a table:
>
> CREATE TABLE TABLE1
> ([FIELD1] INTEGER,
> [FIELD2] TEXT,
> [FIELD3] TEXT,
> [FIELD4] REAL)
>
> and we have a unique index on all 4 fields:
>
> CREATE UNIQUE INDEX IDX1
> ON TABLE1
> (FIELD1, FIELD2, FIELD3, FIELD4)
>
> Now I want to count all the unique rows of this table (without any
> limiting where or join or whatever).
>
> Unless I am mistaken here, this is done with SQL like this:
>
> SELECT COUNT(*) AS UNIQUE_ROWS
> FROM (SELECT DISTINCT * FROM TABLE1)
>
> But if we take advantage of the fact that this table has a unique
> index on all the fields of the table, we can simply do this SQL:
>
> SELECT COUNT(*) FROM TABLE1
>
> It will have the same result and it is a lot faster.
> Hope this makes it clear.

I think you're already getting the "fastest" way with your structure.
You have a row-wide unique index that ensures two duplicate rows can't
be inserted into the table. Each row of that index is going to be
essentially (if not exactly) the same size as each row of the table,
since they include all the same columns. There is no real speed
difference between reading each row of the index and reading each row
of the table.

Additionally, SQLite already has an optimization in place for the query
"SELECT COUNT(*) FROM sometable". It still has to walk the entire
B-tree, but it only has to count how many rows are in each page; it
doesn't have to actually decode each row to check conditions. (A quick
way to see this for yourself is sketched in the P.S. below.)

Now, I am not an expert on SQLite, but based on my understanding of
things, SQLite is going about as fast as it can (without changes to the
file format), and the relative size of the table vs the index means
there isn't really any size difference to exploit.

-- 
Scott Robison
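
P.S. One quick way to check what the query planner is actually doing
(an untested sketch; the table and index names are just the ones from
your example) is to run both forms under EXPLAIN QUERY PLAN in the
sqlite3 shell:

EXPLAIN QUERY PLAN SELECT COUNT(*) FROM TABLE1;

EXPLAIN QUERY PLAN
SELECT COUNT(*) AS UNIQUE_ROWS
FROM (SELECT DISTINCT * FROM TABLE1);

The exact output varies by SQLite version, but the first form should
show a single scan of the table (or of the covering index IDX1), while
the second may add a "USE TEMP B-TREE FOR DISTINCT" step unless the
planner can prove the rows are already distinct.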