--- Andrew McCollum <[EMAIL PROTECTED]> wrote: > I am building a very large (> 500M rows, ~6 GB) sqlite database that has > three integer columns. I find that inserting all the rows takes only a > couple hours, but when I try to create an index on any of the columns the > process will run for days without finishing. > > If the entire operation can be done in RAM, then it is possible for it to > finish in a few hours, but the second the sqlite process has to go to disk, > it grinds to a slow crawl. > > Is it possible to split the index operation into multiple phases? Or better > yet, flush the data when the cache is full and continue on? Any other ideas > on how to speed up the indexing other than reducing the size of the data so > that the operation can be done in RAM? ACID restrictions are unimportant > during this construction phase of the database.
Create the index prior to the data being populated and then insert a few thousand rows per transaction. Experiment with different sized batches. It will run much faster if the rows to be inserted are pre-sorted in index order. However, the problem with this incremental approach is that the database pages of the index will be scattered across the 6 Gig file leading to poor performance if the file is not pre-cached in RAM by the OS. __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com ----------------------------------------------------------------------------- To unsubscribe, send email to [EMAIL PROTECTED] -----------------------------------------------------------------------------