I love email lists like this when people ask questions that make me want to 
test stuff and confirm my hypotheses.  Gives me some much needed practice...
 
Dropping your index and recreating every time you add rows is NOT the best 
thing to do.  Quite obviously your time to create an index is directly related 
to how many rows there are.  So if you add 1000 rows to a 1000000 row database 
it's going to take a LOT longer to rebuild the index for the entire thing than 
for just 1000 rows.
 
And...as you add more and more sets the time to build the index keeps increase 
linearly (or so) with the total number of rows in you database....as opposed to 
just taking the time needed for the additional rows would be approximately 
linear with the number of rows being inserted.
 
So for the below compiled with MS 32-bit compiler and 3.6.23.1
Insert 1,000,000 rows with index created before:
18 seconds
Add another 1,000,000 rows
22 seconds
Add another 1,000,000 rows
21 seconds
Add another 1,000,000 rows
23 seconds
Add another 1,000,000 rows
23 seconds
 
 
Insert 1,000,000 rows with index dropped and created after
16 seconds
Add another 1,000,000 rows the same way
20 seconds
Add another 1,000,000 rows the same way
22 seconds
Add another 1,000,000 rows the same way
24 seconds
Note that the first two runs are a bit faster...but then we catch up to just 
keeping the index around and actually take longer on the 4th run.
 
If you run this code with any argument you'll get the "drop/recreate" behavior.
 
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include "sqlite3.h"
void makeindex(sqlite3 *db)
{
 int rc;
 char *errmsg=NULL;
 rc=sqlite3_exec(db, "CREATE index i1 on t(c1)",NULL,NULL,&errmsg);
 if (rc != SQLITE_OK) {
  puts(errmsg);
  sqlite3_free(errmsg);
 }
 rc=sqlite3_exec(db, "CREATE index i2 on t(c2)",NULL,NULL,&errmsg);
 if (rc != SQLITE_OK) {
  puts(errmsg);
  sqlite3_free(errmsg);
 }
}
int main(int argc,char *argv[])
{
 sqlite3 *db;
 char *errmsg=NULL;
 int rc;
 int i;
 int indexbefore=0;
 time_t t1=time(NULL);
 if (argc > 1) indexbefore=1;
 puts(sqlite3_libversion());
 sqlite3_open("large.db",&db);
 rc=sqlite3_exec(db, "PRAGMA cache_size=100000",NULL,NULL,&errmsg);
 rc=sqlite3_exec(db, "PRAGMA synchronous=OFF",NULL,NULL,&errmsg);
 if (rc != SQLITE_OK) {
  puts(errmsg);
  sqlite3_free(errmsg);
 }
 rc=sqlite3_exec(db, "CREATE TABLE t (c1 integer, c2 integer, c3 integer, c4 
integer, c5 integer, c6 integer)",NULL,NULL,&errmsg);
 if (rc != SQLITE_OK) {
  puts(errmsg);
  sqlite3_free(errmsg);
 }
 if (indexbefore) {
  makeindex(db);
 }
 else {
  rc=sqlite3_exec(db, "DROP index t1",NULL,NULL,&errmsg);
  rc=sqlite3_exec(db, "DROP index t2",NULL,NULL,&errmsg);
 }
 sqlite3_exec(db,"BEGIN",NULL,NULL,&errmsg);
 for(i=0;i<1000000;i++) {
  int j=i*10;
  char sql[4096];
  sprintf(sql,"INSERT INTO t VALUES (%d,%d,%d,%d,%d,%d)",i,j);
  rc=sqlite3_exec(db, sql,NULL,NULL,&errmsg);
  if (rc != SQLITE_OK) {
   puts(errmsg);
   sqlite3_free(errmsg);
   exit(-1);
  }
 }
 if (!indexbefore) {
  makeindex(db);
 }
 sqlite3_exec(db,"COMMIT",NULL,NULL,&errmsg);
 printf("%d seconds\n",time(NULL)-t1);
 sqlite3_close(db);
 return 0;
}

 
Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Northrop Grumman Information Systems
 

________________________________

From: sqlite-users-boun...@sqlite.org on behalf of Paul Sanderson
Sent: Tue 8/3/2010 5:24 PM
To: sqlite-users@sqlite.org
Subject: EXTERNAL:[sqlite] Inserting a large amount of data into a large 
indexed table



Hi

I have a table that contains 6 columns of integers, 2 of these columns
are indexed. There are about 10 Million rows of data in the table.
every now and again I need to add more rows, between about a thousand
and a million at a time. I want the process to be as quiick as
possible (although I know it will take minutes). The process at the
moment is to drop the indexes, add the new rows and then reindex.

Is this the best/fastest way of achieving this - is there a faster way?

P
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to