Re: [PATCH] Use the `WITHOUT ROWID` SQLite optimization for rep-cache.db

Branko Čibej Fri, 08 Dec 2017 03:47:20 -0800

On 08.12.2017 12:40, Evgeny Kotkov wrote:
> Bert Huijben <b...@qqmail.nl> writes:
>
>>> The recent SQLite versions (starting from 3.8.2, released in December 2013)
>>> feature a `WITHOUT ROWID` optimization [1] that can be enabled when
>>> creating a table.  In short, it works well for tables that have non-integer
>>> primary keys, such as
>>>
>>>     name TEXT PRIMARY KEY
>>>
>>> by not maintaining the hidden rowid values and an another B-Tree to match
>>> between a primary key value and its rowid.  This reduces the on-disk size
>>> and makes the lookups faster (a key → rowid → data lookup is replaced with
>>> a key → data lookup).
>> It doesn't add another B-tree for the primary key and its rowids. For the
>> primary key the main table is used as the index.
>>
>> The case where things differ is when there are multiple indexes. In this
>> case normal table will always refer to the primary key using the rowed,
>> while for 'WITHOUT ROWID' there will be referred to the primary key,
>> which in general is larger.
> For the sake of finding out the truth :), I think that this contradicts the
> explanation in https://sqlite.org/withoutrowid.html :
>
>     CREATE TABLE IF NOT EXISTS wordcount(
>       word TEXT PRIMARY KEY,
>       cnt INTEGER
>     );
>
>     As an ordinary SQLite table, "wordcount" is implemented as two separate
>     B-Trees. The main table uses the hidden rowid value as the key and stores
>     the "word" and "cnt" columns as data. The "TEXT PRIMARY KEY" phrase of
>     the CREATE TABLE statement causes the creation of an unique index on
>     the "word" column. This index is a separate B-Tree that uses "word" and
>     the "rowid" as the key and stores no data at all. Note that the
>     complete text of every "word" is stored twice: once in the main
>     table and again in the index.
>
> Although I didn't check if the most recent SQLite version still behaves in
> the described way internally, I have witnessed the described close-to-2x
> reduction in the size of rep-cache.db — which, unless I am missing
> something, follows the described idea of this optimization.


I suspect most of the performance improvement is a consequence of the
reduced database size, since it needs less page-in operations for the
same number of lookups. The B-tree search should really be trivial
compared to uncached I/O.

-- Brane

Re: [PATCH] Use the `WITHOUT ROWID` SQLite optimization for rep-cache.db

Reply via email to