Re: [sqlite] Boosting insert and indexing performance for 10 billion rows (?)

Keith Medcalf Sun, 02 Dec 2018 10:03:02 -0800

Well if it is unique and not null, then why not just make it the rowid?  In 
either case, you would still have to permute the storage tree at insert time if 
the inserts were not in-order.  So let us compare them shall we:


sqlite> create table x(value INTEGER PRIMARY KEY);
sqlite> insert into x select random() from generate_series where start=1 and 
stop=100000000;
Run Time: real 185.795 user 184.265625 sys 0.343750

sqlite> create table x(value INTEGER PRIMARY KEY) WITHOUT ROWID;
sqlite> insert into x select random() from generate_series where start=1 and 
stop=100000000;
Run Time: real 174.661 user 173.890625 sys 0.000000

sqlite> create table x(value INTEGER NOT NULL);
sqlite> insert into x select random() from generate_series where start=1 and 
stop=100000000;
Run Time: real 20.287 user 20.265625 sys 0.000000
sqlite> create unique index ix on x(value);
Run Time: real 211.556 user 207.562500 sys 2.562500

sqlite> create table x(value INTEGER NOT NULL);
sqlite> create unique index ix on x(value);
sqlite> insert into x select random() from generate_series where start=1 and 
stop=100000000;
Run Time: real 196.719 user 195.437500 sys 0.000000

So, the winner is (in this case, with no other payload) that the INTEGER 
PRIMARY KEY on a WITHOUT ROWID table is the most time efficient, followed by 
being the INTEGER PRIMARY KEY of a ROWID table.  Those two also happen to be 
the most space-efficient as well.  Interestingly it was overall faster to build 
and maintain the separate index at insert time in this example than to build 
the index seperately, but not by much.  Note this is for 100,000,000 records 
processed entitely in RAM in a single transaction ... 

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-----Original Message-----
>From: sqlite-users [mailto:sqlite-users-
>[email protected]] On Behalf Of E.Pasma
>Sent: Sunday, 2 December, 2018 03:38
>To: SQLite mailing list
>Subject: Re: [sqlite] Boosting insert and indexing performance for 10
>billion rows (?)
>
>
>>  30 nov. 2018, AJ Miles:
>>
>> Ah, this tool seems very handy. For those curious, I'll paste the
>results
>> below. The index approximately doubles the storage size, but I am
>> intentionally making that tradeoff to avoid the slow down when
>enforcing a
>> unique/primary key on the Reference table while inserting.
>>
>> -AJ
>
>Hello,
>
>is the "integer" column unique and a potential primary key? Then it
>would be surprising that defining it as such causes a slowdown.
>Assuming the table is then defined WITHOUT ROWID. And that the
>multithreading pragma is set.
>
>About sqlite_analyzer: on macOS 10.13.6 I got
>..Tcl.framework/Versions/8.4.. image not found
>After editing the binary file as below, it is alright.
>vi sqlite3_analyzer
>:%s/8\.4/8.5/
>:x
>
>Thanks for the intriguing subject. E. Pasma.
>_______________________________________________
>sqlite-users mailing list
>[email protected]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Boosting insert and indexing performance for 10 billion rows (?)

Reply via email to