Jay A. Kreibich wrote: 

> Yes.  Hence the "and this is the important part" comment.  Most of 
> the time when people are building billion-row files, they're building 
> a new DB by importing a static source of data.  If things go wrong, 
> you just throw out the database and try again.  

That's kinda like doing it all in a single big transaction, which I 
wanted to avoid.  :) 

But my take-away from this conversation is generally "you're SOL": 
we are backed by a tree, so we have to update existing nodes to point 
to the new ones, so we have to touch existing pages on INSERT.  Is that 
about right?  

It's not in any way a result of my schema?  My primary key is a pair of
integers (A, B).  In this particular use case the first column is in the
range A = [0, 2 million) and the second is in the range B = [0, infinity).
We

insert records A = 0 -> 2 million, B = 0, then
insert records A = 0 -> 2 million, B = 1,

etc.

Could this have an impact on how many pages need to be touched on
INSERT?
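For concreteness, the load amounts to something like this rough sketch
(the table and column names are made up, and the batch size is just an
example; I commit in batches since I wanted to avoid one giant
transaction):

    import sqlite3

    # Hypothetical schema: the real one differs, but the key shape is the same.
    conn = sqlite3.connect("big.db")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS data (
            a INTEGER NOT NULL,
            b INTEGER NOT NULL,
            payload BLOB,
            PRIMARY KEY (a, b)
        )
    """)

    A_MAX = 2_000_000   # A ranges over [0, 2 million)
    BATCH = 50_000      # commit every 50k rows instead of one big transaction

    def insert_pass(b):
        """Insert one full pass: A = 0 -> 2 million for a fixed B."""
        cur = conn.cursor()
        for a in range(A_MAX):
            cur.execute("INSERT INTO data (a, b, payload) VALUES (?, ?, ?)",
                        (a, b, None))
            if (a + 1) % BATCH == 0:
                conn.commit()
        conn.commit()

    # Passes run in B order: all of B = 0, then all of B = 1, and so on.
    for b in range(2):   # in reality B is unbounded
        insert_pass(b)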

> It would also help to bump the cache up...  

That works great until the db size blows through the total RAM on the 
system, at which point we're of course disk-bound again.  At the moment
I'm only inserting about 4k rows/second. :/
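For reference, the cache bump amounts to something like this (the 2 GiB
figure is only an example):

    import sqlite3

    conn = sqlite3.connect("big.db")
    # A negative cache_size is interpreted by SQLite as a size in KiB,
    # so this asks for roughly 2 GiB of page cache.  The figure is just
    # an example; the real setting depends on available RAM.
    conn.execute("PRAGMA cache_size = -2097152")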

Eric

-- 
Eric A. Smith

I have always wished that my computer would be as easy to use as 
my telephone. My wish has come true. I no longer know how to use 
my telephone.
    -- Bjarne Stroustrup
