On Wed, 26 May 2010 10:46:43 +1000, Owen wrote:

>So I read the main database row by row, tested the validity of one
>column, and inserted the good rows into a new database.
>
>This process took 27 minutes on a fairly recent desktop computer.
>( about 10000 rows a minute or 170 a second ) 
>
>I thought the whole process would take a few seconds, so I wonder if 27
>minutes is a reasonable time for this database creation.

Well, I timed some actions using SQLite a while ago, and I am led to
believe that every commit produces an fsync() on the database file,
which takes roughly 100 ms on my old 1 GHz PC running Linux, and which
you might expect to take around 50 ms on a more modern computer.

Two possible solutions are: 

 1) don't commit on every single row; keep an eye on the clock and
commit every N seconds (I'm thinking of half a minute), plus once at the
very end, of course (see the sketch after this list)

 2) edit the C source of SQLite and replace the fsync() with fflush(),
which ought to be around 100 times faster, but which isn't guaranteed to
withstand a computer crash if the machine goes down before the data is
finally stored on disk - not ideal in the eyes of ACID perfectionists.
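
For solution 1, here's a minimal sketch of what I mean, assuming
Python's sqlite3 module (you didn't say which language you're using);
the file names, the table layout and the validity test are all made up
for illustration:

import sqlite3
import time

src = sqlite3.connect("main.db")        # made-up source database name
dst = sqlite3.connect("filtered.db")    # made-up destination database name
dst.execute("CREATE TABLE IF NOT EXISTS good (id INTEGER, value TEXT)")

COMMIT_INTERVAL = 30.0                  # seconds between commits
last_commit = time.time()

for row in src.execute("SELECT id, value FROM main_table"):
    if row[1] is not None:              # stand-in for your real validity test
        dst.execute("INSERT INTO good (id, value) VALUES (?, ?)", row)
    # Commit on the clock, not per row, so the fsync() cost is paid
    # a handful of times instead of once per inserted row.
    if time.time() - last_commit >= COMMIT_INTERVAL:
        dst.commit()
        last_commit = time.time()

dst.commit()                            # final commit for the last batch
src.close()
dst.close()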

Anyway, with current cache sizes inside the disk, nothing is guaranteed
even after an fsync(): the data might be sitting in the disk's own write
cache but not yet committed to the physical platters.

>As the database was being created, I noticed a journal file was in use
>though I haven't worked out what that did, yet.

It's SQLite's scratchpad, the rollback journal, where it records what it
needs to undo each transaction. If the computer goes down before a
transaction is finalized, the journal is used the next time the database
is opened to roll the partial transaction back, so the file returns to
its last consistent state.
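
If you want to see the journal in action, here's a tiny sketch (again
assuming Python, the default rollback-journal mode, and a made-up file
name; this is how I understand the behaviour, not gospel):

import os
import sqlite3

db = "demo.db"                             # made-up file name
con = sqlite3.connect(db)
con.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
con.commit()

con.execute("INSERT INTO t VALUES (1)")    # a write transaction is now open
print(os.path.exists(db + "-journal"))     # True: journal holds rollback data

con.commit()                               # transaction finalized
print(os.path.exists(db + "-journal"))     # False: journal is deleted again

con.close()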

-- 
        Bart Lateur
        bart.lat...@telenet.be
