On Wed, Sep 1, 2010 at 12:46 PM, Jay A. Kreibich wrote:
> There is no reason to assume the filesystem
> will over-write the existing allocations, rather than just create new
> ones, especially if the pages are shuffled in groups...
Actually there's no reason to do the opposite, as it would fragment files that
were contiguous in the first place. The only filesystems that do this that I'm
aware of are those that do copy-on-write (and some move the old page, not the
new one). It certainly doesn't help in EXT2/3/4, or NTFS on Windows.
> Maybe there would be some way to pre-populate the rollback journal
> with the full contents of the original database. Then the file could
> be truncated before the copy-back procedure. That would make it
> clear to the OS that it is free to allocate whatever file blocks it
> wants, hopefully in better patterns. The copy back could also be
> done in very large chunks.
This is a fantastic idea! Not only truncate - since you know the new size, you
could also set the size beforehand before you start copying the pages (similar
to SQLITE_FCNTL_CHUNK_SIZE). Most filesystems will try very hard to place it
contiguously.
A more involved idea that would improve efficiency (two copies instead of
three, and twice the database size instead of three times) would be to use the
journal file directly as the new database (which right now it's created in the
temp folder). This involves writing an invalid journal header (typically 512
bytes of zeros), and then treating it as a normal database file, only starting
at offset 512, and with pages with are 8 bytes bigger, preceded by their number
(sequntial in the case) and followed by the fast checksum. Once the new
database is written (in journal format), the journal header can be made valid.
At this point the transaction has commited, and you can truncate and copy to
the original file as above (which is conceptually a rollback, only you're
rolling back to the vacuumed database).
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users