Brannon King wrote:
Or is an Index sufficient?

Brandon,

Having a primary key is the same thing as adding a UNIQUE index on the desired columns except for the case where the key is on a single column with integer type (i.e. integer primary key). The later case is optimized in sqlite, but must be done before the table is populated.

I have a table that looks like this:

CREATE TABLE IF NOT EXISTS results (qi INTEGER NOT NULL, ri INTEGER NOT NULL, run INTEGER NOT NULL, drl INTEGER NOT NULL, score INTEGER NOT NULL, qis INTEGER NOT NULL, ris INTEGER NOT NULL );

I originally had this line as well: , PRIMARY KEY (qi, ri, run)

I need to do billions (no joke) of consecutive inserts into this table. Writing binary out with std::ofstream was using 26% of my code execution time according to the profiler. I changed to sqlite3, and now the sqlite3_step function uses 50% of my execution time. After reading the news groups, I removed the primary key. That dropped it down to about 41%. That was significant. However, I was still significantly slower than binary writes with ofstream. Then, I tried the PRAGMA temp_store = 2. That made absolutely no difference. I'll assume that's what it was to begin with, though it reports 0. Also, from the profiler, it seems that the step function does a lot of memory allocations and deallocations, yet I cannot find them in the code.

If it helps, I was testing 600k inserts in transactions of 0xFFF inserts and my current settings:
                               PRAGMA auto_vacuum = 0; \
                               PRAGMA case_sensitive_like = 1; \
                               PRAGMA cache_size = 32768; \
                               PRAGMA default_cache_size = 32768; \
                               PRAGMA count_changes = 0; \
                               PRAGMA synchronous = 0; \
                               PRAGMA page_size = 4096; \
                               PRAGMA temp_store = 2;

What else can I do to speed up my inserts?


There are a couple of things to look at. First, you don't say exactly how you are executing the inserts. You should use sqlite3_prepare() to compile an insert statement with a your seven columns as parameters.

   insert into results values (?, ?, ?, ?, ?, ?, ?)

Then inside your insert loop you should bind values to each of these parameters. If you have known patterns in your input data, and some of the parameter values repeat, you can eliminate re-binding the same value to a parameter for each loop since sqlite will retain the last bound value. Finally, step the statement and then reset it for the next loop iteration.

You should add your index after you have inserted all the data.

You can also speed the execution time of the insert statement somewhat by dropping the NOT NULL constraints on your data columns. Each one of those is checked for every insert. For comparison here is a dump of the code generated for a simple two column table with and without the not null constraints.

   SQLite version 3.3.5
   Enter ".help" for instructions
   sqlite> create table t1 (a integer, b integer);
   sqlite> create table t2 (a integer not null, b integer not null);
   sqlite> .explain on
   sqlite> explain insert into t1 values (?, ?);
   addr  opcode          p1          p2          p3
---- -------------- ---------- ---------- ---------------------------------
   0     Goto            0           11
   1     Integer         0           0
   2     OpenWrite       0           2
   3     SetNumColumns   0           2
   4     NewRowid        0           0
   5     Variable        1           0
   6     Variable        2           0
   7     MakeRecord      2           0           dd
   8     Insert          0           3           t1
   9     Close           0           0
   10    Halt            0           0
   11    Transaction     0           1
   12    VerifyCookie    0           2
   13    Goto            0           1
   14    Noop            0           0
   sqlite> explain insert into t2 values (?, ?);
   addr  opcode          p1          p2          p3
---- -------------- ---------- ---------- ---------------------------------
   0     Goto            0           17
   1     Integer         0           0
   2     OpenWrite       0           3
   3     SetNumColumns   0           2
   4     NewRowid        0           0
   5     Variable        1           0
   6     Variable        2           0
   7     Dup             1           1
   8     NotNull         1           10
   9     Halt            19          2           t2.a may not be NULL
   10    Dup             0           1
   11    NotNull         1           13
   12    Halt            19          2           t2.b may not be NULL
   13    MakeRecord      2           0           dd
   14    Insert          0           3           t2
   15    Close           0           0
   16    Halt            0           0
   17    Transaction     0           1
   18    VerifyCookie    0           2
   19    Goto            0           1
   20    Noop            0           0
   sqlite>

Dropping the not null constraints makes perfect sense if you are always using sqlite3_bind_int() to assign the column values, since there is no way for your application to insert a null value (i.e. you need to call sqlite3_bind_null() to insert a null value).

HTH
Dennis Cote

Reply via email to