On 08.12.2017 12:40, Evgeny Kotkov wrote: > Bert Huijben <b...@qqmail.nl> writes: > >>> The recent SQLite versions (starting from 3.8.2, released in December 2013) >>> feature a `WITHOUT ROWID` optimization [1] that can be enabled when >>> creating a table. In short, it works well for tables that have non-integer >>> primary keys, such as >>> >>> name TEXT PRIMARY KEY >>> >>> by not maintaining the hidden rowid values and an another B-Tree to match >>> between a primary key value and its rowid. This reduces the on-disk size >>> and makes the lookups faster (a key → rowid → data lookup is replaced with >>> a key → data lookup). >> It doesn't add another B-tree for the primary key and its rowids. For the >> primary key the main table is used as the index. >> >> The case where things differ is when there are multiple indexes. In this >> case normal table will always refer to the primary key using the rowed, >> while for 'WITHOUT ROWID' there will be referred to the primary key, >> which in general is larger. > For the sake of finding out the truth :), I think that this contradicts the > explanation in https://sqlite.org/withoutrowid.html : > > CREATE TABLE IF NOT EXISTS wordcount( > word TEXT PRIMARY KEY, > cnt INTEGER > ); > > As an ordinary SQLite table, "wordcount" is implemented as two separate > B-Trees. The main table uses the hidden rowid value as the key and stores > the "word" and "cnt" columns as data. The "TEXT PRIMARY KEY" phrase of > the CREATE TABLE statement causes the creation of an unique index on > the "word" column. This index is a separate B-Tree that uses "word" and > the "rowid" as the key and stores no data at all. Note that the > complete text of every "word" is stored twice: once in the main > table and again in the index. > > Although I didn't check if the most recent SQLite version still behaves in > the described way internally, I have witnessed the described close-to-2x > reduction in the size of rep-cache.db — which, unless I am missing > something, follows the described idea of this optimization.
I suspect most of the performance improvement is a consequence of the reduced database size, since it needs less page-in operations for the same number of lookups. The B-tree search should really be trivial compared to uncached I/O. -- Brane