I love email lists like this when people ask questions that make me want to
test stuff and confirm my hypotheses. Gives me some much needed practice...
Dropping your index and recreating every time you add rows is NOT the best
thing to do. Quite obviously your time to create an index is directly related
to how many rows there are. So if you add 1000 rows to a 1000000 row database
it's going to take a LOT longer to rebuild the index for the entire thing than
for just 1000 rows.
And...as you add more and more sets the time to build the index keeps increase
linearly (or so) with the total number of rows in you database....as opposed to
just taking the time needed for the additional rows would be approximately
linear with the number of rows being inserted.
So for the below compiled with MS 32-bit compiler and 3.6.23.1
Insert 1,000,000 rows with index created before:
18 seconds
Add another 1,000,000 rows
22 seconds
Add another 1,000,000 rows
21 seconds
Add another 1,000,000 rows
23 seconds
Add another 1,000,000 rows
23 seconds
Insert 1,000,000 rows with index dropped and created after
16 seconds
Add another 1,000,000 rows the same way
20 seconds
Add another 1,000,000 rows the same way
22 seconds
Add another 1,000,000 rows the same way
24 seconds
Note that the first two runs are a bit faster...but then we catch up to just
keeping the index around and actually take longer on the 4th run.
If you run this code with any argument you'll get the "drop/recreate" behavior.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include "sqlite3.h"
void makeindex(sqlite3 *db)
{
int rc;
char *errmsg=NULL;
rc=sqlite3_exec(db, "CREATE index i1 on t(c1)",NULL,NULL,&errmsg);
if (rc != SQLITE_OK) {
puts(errmsg);
sqlite3_free(errmsg);
}
rc=sqlite3_exec(db, "CREATE index i2 on t(c2)",NULL,NULL,&errmsg);
if (rc != SQLITE_OK) {
puts(errmsg);
sqlite3_free(errmsg);
}
}
int main(int argc,char *argv[])
{
sqlite3 *db;
char *errmsg=NULL;
int rc;
int i;
int indexbefore=0;
time_t t1=time(NULL);
if (argc > 1) indexbefore=1;
puts(sqlite3_libversion());
sqlite3_open("large.db",&db);
rc=sqlite3_exec(db, "PRAGMA cache_size=100000",NULL,NULL,&errmsg);
rc=sqlite3_exec(db, "PRAGMA synchronous=OFF",NULL,NULL,&errmsg);
if (rc != SQLITE_OK) {
puts(errmsg);
sqlite3_free(errmsg);
}
rc=sqlite3_exec(db, "CREATE TABLE t (c1 integer, c2 integer, c3 integer, c4
integer, c5 integer, c6 integer)",NULL,NULL,&errmsg);
if (rc != SQLITE_OK) {
puts(errmsg);
sqlite3_free(errmsg);
}
if (indexbefore) {
makeindex(db);
}
else {
rc=sqlite3_exec(db, "DROP index t1",NULL,NULL,&errmsg);
rc=sqlite3_exec(db, "DROP index t2",NULL,NULL,&errmsg);
}
sqlite3_exec(db,"BEGIN",NULL,NULL,&errmsg);
for(i=0;i<1000000;i++) {
int j=i*10;
char sql[4096];
sprintf(sql,"INSERT INTO t VALUES (%d,%d,%d,%d,%d,%d)",i,j);
rc=sqlite3_exec(db, sql,NULL,NULL,&errmsg);
if (rc != SQLITE_OK) {
puts(errmsg);
sqlite3_free(errmsg);
exit(-1);
}
}
if (!indexbefore) {
makeindex(db);
}
sqlite3_exec(db,"COMMIT",NULL,NULL,&errmsg);
printf("%d seconds\n",time(NULL)-t1);
sqlite3_close(db);
return 0;
}
Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Northrop Grumman Information Systems
________________________________
From: [email protected] on behalf of Paul Sanderson
Sent: Tue 8/3/2010 5:24 PM
To: [email protected]
Subject: EXTERNAL:[sqlite] Inserting a large amount of data into a large
indexed table
Hi
I have a table that contains 6 columns of integers, 2 of these columns
are indexed. There are about 10 Million rows of data in the table.
every now and again I need to add more rows, between about a thousand
and a million at a time. I want the process to be as quiick as
possible (although I know it will take minutes). The process at the
moment is to drop the indexes, add the new rows and then reindex.
Is this the best/fastest way of achieving this - is there a faster way?
P
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users