On Wed, 26 May 2010 10:46:43 +1000, Owen wrote:

>So I read the main database row by row, tested the validity of one
>column, and inserted the good rows into a new database.
>
>This process took 27 minutes on a fairly recent desktop computer
>(about 10000 rows a minute, or 170 a second).
>
>I thought the whole process would take a few seconds, so I wonder if 27
>minutes is a reasonable time for this database creation.
Well, I timed actions using SQLite some time ago, and I am led to believe that every commit produces an fsync() in the database, which took roughly 100 ms on my old 1 GHz PC running Linux, and which you might expect to take 50 ms on a more modern computer. Two possible solutions are:

1) Don't commit on every single row; keep an eye on the clock and commit every N seconds (I'm thinking of 1/2 minute), plus once at the very end, of course.

2) Edit the C source of SQLite and replace the fsync() with fflush(), which ought to be around 100 times faster, but which isn't guaranteed to withstand computer crashes if the machine goes down before the data is finally stored on disk -- not ideal in the eyes of ACID perfectionists. Then again, with current cache sizes inside the disk, nothing is guaranteed even after an fsync(): the data might be in the disk's own cache but not yet committed to the physical disk.

>As the database was being created, I noticed a journal file was in use
>though I haven't worked out what that did, yet.

It's SQLite's scratchpad, where it stores its intentions for each transaction. If the computer goes down before a transaction is finalized, the journal can likely be used either to redo the action and complete it this time, or to roll it back properly.

-- 
Bart Lateur
bart.lat...@telenet.be
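P.S. Solution 1 can be sketched with Python's sqlite3 module. Everything below is illustrative: the table names, columns, sample data, and the validity test are all made up, and in-memory databases stand in for the real files (a real run on disk is where the per-commit fsync() cost shows up).

```python
import sqlite3
import time

# Commit every N seconds instead of once per row, so the expensive
# fsync() happens per interval rather than per inserted row.
COMMIT_INTERVAL = 30  # seconds; the "1/2 minute" suggested above

src = sqlite3.connect(":memory:")  # stands in for the main database
src.execute("CREATE TABLE data (col1, col2)")
src.executemany("INSERT INTO data VALUES (?, ?)",
                [("ok", 1), (None, 2), ("ok", 3)])
src.commit()

dst = sqlite3.connect(":memory:")  # stands in for the new database
dst.execute("CREATE TABLE good (col1, col2)")

last_commit = time.time()
for row in src.execute("SELECT col1, col2 FROM data"):
    if row[0] is not None:         # placeholder validity test
        dst.execute("INSERT INTO good VALUES (?, ?)", row)
    if time.time() - last_commit >= COMMIT_INTERVAL:
        dst.commit()               # one fsync() per interval, not per row
        last_commit = time.time()

dst.commit()                       # final commit catches the tail rows
print(dst.execute("SELECT COUNT(*) FROM good").fetchone()[0])  # prints 2
```

The key point is simply that rows inserted between commits sit in one open transaction, so the journal and fsync() overhead is paid once per interval instead of 170 times a second.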