On Wed, Sep 01, 2010 at 11:41:00AM -0700, Taras Glek scratched on the wall: > Hi, > Currently VACUUM takes care of sqlite-level fragmentation. Unfortunately > it does little for fs-level fragmentation since the same file is being > reused. It would be really beneficial for Mozilla performance if we > could get a vacuum/hotcopy mode. > > As I understand it, currently vacuum works by copying out the data to a > new file, then copying it back in and truncating the original db file. > > It would be really nice to instead do something like: > > copy the data to a new file
You're good up to here. This is basically the "VACUUM TO <file>" idea I threw out some months ago, and Max referenced. > swap the underlying filehandles to point at new file > > remove old file > > rename the new file to old name. ...and watch every other database connection that was accessing the database file get very, very confused. You're also breaking transactional integrity. You need the option of backing-out of the operation right up until the moment it works, and this procedure can't do that. For example, if you lose power right after "remove old file", your database is no longer there. ...And before someone says something like "but the application can just open the other one", I agree-- that it is an application problem. > Currently we can copy the data to a new file, but it is a pretty > invasive change to swap all of the current sqlite connections to the new > file. You'd have to do that anyways, for all connections other than the one that issued the VACUUM command. Coordinating FD swaps across multiple connections in the same process would be confusing enough-- there would be no possible way to do it across database connections in multiple processes. And I really doubt you could do any of this in a transactional-safe way. It's application level stuff. > Things like prepared statements, etc need to be updated for every > single db consumer. Thus it would make sense to have this feature on the > sqlite side. I'm not sure I agree that it makes sense. As soon as you start to talk about filesystem operations (such as rename) you're way outside of the scope of SQLite. Again, doing this kind of thing in a transactional-safe way is nearly impossible. Even if we stick with the existing procedure, I'm wondering if it is really that bad (that's an honest question; I don't have data one way or the other). In the last step of the VACUUM process, the data is copied back to the original file, page by page. If I understand it correctly, this is more or less a blanket write, blasting pages from the temp DB to the original DB, front to back (intermixed with copies to the journal file). There is no reason to assume the filesystem will over-write the existing allocations, rather than just create new ones, especially if the pages are shuffled in groups... although if the goal is to defragment the file, we're kind of depending on that, which is unwise. Maybe there would be some way to pre-populate the rollback journal with the full contents of the original database. Then the file could be truncated before the copy-back procedure. That would make it clear to the OS that it is free to allocate whatever file blocks it wants, hopefully in better patterns. The copy back could also be done in very large chunks. I'm not totally sure, but I'm guessing that would require several modifications and enhancements to the rollback and B-tree code. You would need the ability to force a full rollback copy and the ability do a blind, non-journaled copy, but it seems like it could be done without modifications to the existing VACUUM semantics, all while keeping the transactional integrity of the VACUUM command. On a personal level, I don't think it is worth it. In the end, you're still hoping the OS and filesystem will make smart choices about block allocations. An application shouldn't need to be spending a lot of time worrying about this level of filesystem performance. No matter what, you're just hinting and setting up conditions that should allow the filesystem driver to do something smart and fast. It may, or it may not, actually do so. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Intelligence is like underwear: it is important that you have it, but showing it to the wrong people has the tendency to make them feel uncomfortable." -- Angela Johnson _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users