--- Andrew McCollum <[EMAIL PROTECTED]> wrote:
> I am building a very large (> 500M rows, ~6 GB) sqlite database that has
> three integer columns.  I find that inserting all the rows takes only a
> couple hours, but when I try to create an index on any of the columns the
> process will run for days without finishing.
> 
> If the entire operation can be done in RAM, then it is possible for it to
> finish in a few hours, but the second the sqlite process has to go to disk,
> it grinds to a slow crawl.
> 
> Is it possible to split the index operation into multiple phases?  Or better
> yet, flush the data when the cache is full and continue on?  Any other ideas
> on how to speed up the indexing other than reducing the size of the data so
> that the operation can be done in RAM?  ACID restrictions are unimportant
> during this construction phase of the database.

Create the index prior to the data being populated and then insert
a few thousand rows per transaction. Experiment with different sized 
batches. It will run much faster if the rows to be inserted are
pre-sorted in index order.

However, the problem with this incremental approach is that the 
database pages of the index will be scattered across the 6 Gig file 
leading to poor performance if the file is not pre-cached in RAM 
by the OS.


__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 

-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------

Reply via email to