On Thu, Sep 02, 2010 at 05:42:17AM +0200, Ben Danper scratched on the wall:
>
> On Wed, Sep 1, 2010 at 12:46 PM, Jay A. Kreibich <j...@...> wrote:
> > There is no reason to assume the filesystem
> > will over-write the existing allocations, rather than just create new
> > ones, especially if the pages are shuffled in groups...
>
> Actually there's no reason to do the opposite,
> as it would fragment files that were contiguous in the first place.

  If a filesystem is asked to write out a buffer that is large enough
  to span several allocation blocks, it makes sense to write that
  buffer out as a contiguous set of blocks and reassign the allocations
  if the existing blocks are already fragmented. That wouldn't change
  a file that is already reasonably contiguous, but it would tend to
  bring a fragmented file back together.

  Such behavior would almost never fully defragment the file, but it
  would help... assuming at least a few of the writes are large,
  logically contiguous writes. I'm guessing SQLite doesn't tend to do
  that.

  There are also flash-optimized filesystems that relocate just about
  every write in an attempt to even out the flash write cycles.

> > Maybe there would be some way to pre-populate the rollback journal
> > with the full contents of the original database. Then the file could
> > be truncated before the copy-back procedure. That would make it
> > clear to the OS that it is free to allocate whatever file blocks it
> > wants, hopefully in better patterns. The copy back could also be
> > done in very large chunks.
>
> This is a fantastic idea! Not only truncate - since you know the new
> size, you could also set the size beforehand before you start
> copying the pages (similar to SQLITE_FCNTL_CHUNK_SIZE). Most
> filesystems will try very hard to place it contiguously.

  Good idea.

> A more involved idea that would improve efficiency (two copies instead
> of three, and twice the database size instead of three times) would
> be to use the journal file directly as the new database

  That does sound a lot more involved. You would more or less need to
  rewrite the whole pager to deal with two different file formats.

  The VACUUM copy process is not write-only; it logically rebuilds the
  database from the ground up using SQL commands. That means it does
  things like issue "CREATE INDEX..." commands on fresh tables. You
  would need full read/write/update support for the "journal pager."

   -j

--
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable." -- Angela Johnson
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
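
For illustration only, here is a rough sketch of the truncate /
pre-size / large-chunk copy-back idea discussed in the message above.
It is not SQLite code and not how VACUUM works. The function name
copy_back, the CHUNK_SIZE value, and the use of plain files in place of
a real rollback journal are all assumptions made for the example, and
it ignores crash safety entirely. Python is used for brevity; SQLite
itself is written in C.

```python
import os

# Hypothetical "very large chunk" size; the thread does not name a figure.
CHUNK_SIZE = 4 * 1024 * 1024


def copy_back(journal_path, db_path, chunk_size=CHUNK_SIZE):
    """Copy a full database image from journal_path back into db_path.

    Assumes journal_path already holds the complete new contents.
    Returns the number of bytes the rebuilt file should contain.
    """
    total = os.path.getsize(journal_path)
    with open(journal_path, "rb") as src, open(db_path, "r+b") as dst:
        fd = dst.fileno()

        # Truncate first, so the filesystem may release the old
        # (possibly fragmented) blocks and allocate fresh ones.
        os.ftruncate(fd, 0)

        # Announce the final size before writing any data. Where
        # posix_fallocate exists it asks for real block allocation;
        # otherwise fall back to simply setting the length.
        if total > 0 and hasattr(os, "posix_fallocate"):
            try:
                os.posix_fallocate(fd, 0, total)
            except OSError:
                os.ftruncate(fd, total)
        else:
            os.ftruncate(fd, total)

        # Copy back in large, logically contiguous writes.
        dst.seek(0)
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)

        dst.flush()
        os.fsync(fd)
    return total
```

Whether the resulting file actually ends up contiguous is up to the
filesystem; the sketch only gives it the hints mentioned in the thread
(fresh allocation, known final size, large sequential writes).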