i have a simple join table containing two ids from two other tables.  i have an 
index on each of the ids in the join table.

CREATE TABLE ContentWordItem (word_id INT, item_id INT);

CREATE INDEX idx_ContentWordItem_word ON ContentWordItem(word_id);      // 
index to perform fast queries by word_id

CREATE INDEX idx_ContentWordItem_item ON ContentWordItem(item_id);      // 
index to perform fast deletes by item_id

i have a large amount of data to insert into this join table on a slow embedded 
device.  i need to avoid locking this join table for more than a second or two 
at a time so that i can make queries on this table.

so here's the question:  how do i insert small chunks of data into this table 
w/o taking a hit each time i commit?

what i'm doing is:

*         read a chunk of data from flat data file into vector of id pairs

*         begin transaction

*         loop thru vector of id pairs binding and inserting

*         commit transaction

*         repeat until data is exhausted

i'm seeing that the reading, binding, and inserting is very fast (300 ms) but 
the commit is taking upwards of 3 seconds.  when i increase my chunk size by a 
factor of 10 the insert doesn't appear to take 10x longer but the commit still 
takes upwards of 3 seconds.  the point is that the commit hit appears to be 
much greater than the insert hit but doesn't appear to scale directly.

it appears that the commit is updating the indexes and taking a long time.  is 
this a correct evaluation?

it also appears that the commit takes longer as the size of the table grows 
(i.e. the index is getting bigger).  is this expected?

what i'm worried about is that by reducing the chunk size (to avoid locking the 
db for a long time) i add a significant amount of time to the insert process 
because the commits are costing several seconds.  however, locking the db for a 
long time is not desirable.

i'm also concerned about the commit time increasing over time as the amount of 
data in the table increases.

is there a better approach?

thanks
tom

______________________________________________________________________
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email 
______________________________________________________________________
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to