Hi,

OK, I am impressed that you can insert 2000 records/second on indexed columns. I have an application that inserts CSV data into an SQLite database, and inserting 15 million records takes about 2 hours with no indices (this is on a PowerMac Dual G5 2.0 GHz, 1.5 GB RAM), though two fields each require a simple sub-select to populate. My attempts to improve performance include the following (a rough sketch of the setup follows the list):
        * performing all the inserts inside one transaction
        * using prepared statements
        * dropping all the indices prior to running
        * setting sync mode to Normal
        * setting cache_size to huge values, like 300,000
        * creating temp tables in memory
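
For reference, that setup looks roughly like this through the C API (an untested sketch; the database file name is invented and error checking is omitted):

    #include "sqlite3.h"

    sqlite3 *db;
    sqlite3_open("import.db", &db);   /* hypothetical file name */
    sqlite3_exec(db, "PRAGMA synchronous = NORMAL", 0, 0, 0);
    sqlite3_exec(db, "PRAGMA cache_size = 300000", 0, 0, 0);
    sqlite3_exec(db, "PRAGMA temp_store = MEMORY", 0, 0, 0);
    sqlite3_exec(db, "BEGIN", 0, 0, 0);
    /* ... prepared-statement INSERT loop over the CSV rows ... */
    sqlite3_exec(db, "COMMIT", 0, 0, 0);
    sqlite3_close(db);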

Do my times sound reasonable or should I continue to look for optimizations? Any other suggestions?

Thanks,

Aaron

On Sep 2, 2005, at 3:05 PM, D. Richard Hipp wrote:

On Fri, 2005-09-02 at 11:59 +0200, Michael Schoen wrote:

(1)     Multiple Insert Statements
We need to insert around 300-500 datasets/sec constantly (24/7) with 8 to 16 fields indexed. So far we are using MySQL, not only because of its general DBMS speed, but mainly because it has a CSV import interface. We found that we cannot insert data at that rate using plain SQL statements, because the SQL parser just takes too much time. If we use the CSV import methods, we gain at least a 30% performance boost. While reading through the SQLite wiki, I found no mention of any multiple-insert statement that would cut down the SQL parsing time.
What would you guys propose?


1.  Call sqlite3_prepare() to parse a generic INSERT statement.
2.  Call sqlite3_bind() to bind values to the generic statement.
3.  Call sqlite3_step() to do the INSERT.
4.  Call sqlite3_reset() to reset the statement so that it can
    run again.
5.  Go back and repeat steps 2-4 as many times as you like.

This allows you to insert many different rows while running the
parser only once; a sketch of the loop is below.  You do get about
a 30% speed boost doing this.
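
In code, the loop looks roughly like this (a minimal sketch, untested; the table t1 and its columns are invented, and error checking is omitted):

    #include "sqlite3.h"

    /* Insert nRows rows into a hypothetical table t1(a,b), parsing the
    ** INSERT statement only once and reusing it for every row. */
    void bulk_insert(sqlite3 *db, int nRows, int *aInt, char **azText){
      sqlite3_stmt *stmt;
      int i;
      sqlite3_prepare(db, "INSERT INTO t1(a,b) VALUES(?,?)", -1, &stmt, 0);
      sqlite3_exec(db, "BEGIN", 0, 0, 0);
      for(i=0; i<nRows; i++){
        sqlite3_bind_int(stmt, 1, aInt[i]);                        /* step 2 */
        sqlite3_bind_text(stmt, 2, azText[i], -1, SQLITE_STATIC);
        sqlite3_step(stmt);                                        /* step 3 */
        sqlite3_reset(stmt);                                       /* step 4 */
      }
      sqlite3_exec(db, "COMMIT", 0, 0, 0);
      sqlite3_finalize(stmt);
    }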

Out of curiosity, I created a table with 17 columns and 8 indices
just to see how fast I could insert into it using SQLite.  On
my laptop, I could easily do 2000 inserts/second (all in a single
transaction) even when parsing each INSERT separately.  I would
expect better performance on a workstation and a 30% or better
performance improvement using the technique described above.

On the other hand, performance will fall off logarithmically
as the size of the database increases, since each insert must
descend B-tree indices whose depth grows with the log of the
row count.


(2)     Indexing:
What kind of index implementations are you guys supporting/using?
B-tree, R-tree, full-text? And are multiple-column indexes
supported?


B-tree indices. You can have as many columns in an index as
you want.  But only the first 31 columns will be used to
help speed searches.  (On the other hand, who ever creates
an index with 32 or more columns?)
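
For example, a multi-column index is declared with a simple column list (illustrative table and column names):

    /* Two-column index on a hypothetical table t1.  A search can use it
    ** to the extent it constrains the leading columns, left to right. */
    sqlite3_exec(db, "CREATE INDEX idx_t1_ab ON t1(a,b)", 0, 0, 0);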


