On Wed, Jul 07, 2010 at 01:03:26PM +0400, Igor Sereda scratched on the wall:
> > It's never time to VACUUM a database.
> 
> This is an interesting statement. In our application, all tables get
> heavily fragmented with time (99% or more). I was considering VACUUM
> as a means to defragment, and, presumably, improve search performance.
> Was I misguided, and search performance does not degrade significantly
> with the increased fragmentation of the DB?

  No, you're not misguided, but the full picture is deeper.

  VACUUM repacks the pages, putting more user data in a
  smaller footprint.  This improves I/O performance since you
  pull in more usable data with a given number of reads (at least,
  until things get mixed up again).  The gain typically only shows up
  when doing large scans of tables that have been really thrashed,
  however (rows added and deleted, over and over).
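  You can watch the repacking happen with a small sketch using
  Python's built-in sqlite3 module.  The file path and table are
  made up for illustration; the PRAGMAs (freelist_count, etc.) are
  standard SQLite.  Deleting a large contiguous run of rows frees
  whole pages onto the freelist, and VACUUM then rebuilds the file
  without them:

```python
# Hypothetical demo: churn a table, then VACUUM and observe the
# freelist drain and the file shrink.  Uses only the stdlib.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "thrash.db")  # example path
db = sqlite3.connect(path)
db.execute("CREATE TABLE t(x INTEGER, pad TEXT)")
db.executemany("INSERT INTO t VALUES (?, ?)",
               [(i, "x" * 100) for i in range(10000)])
db.commit()

# Churn: drop a large contiguous run of rows, freeing whole pages.
db.execute("DELETE FROM t WHERE x < 9000")
db.commit()

free_before = db.execute("PRAGMA freelist_count").fetchone()[0]
size_before = os.path.getsize(path)

db.execute("VACUUM")  # must run outside any open transaction

free_after = db.execute("PRAGMA freelist_count").fetchone()[0]
size_after = os.path.getsize(path)

print("free pages:", free_before, "->", free_after)
print("file bytes:", size_before, "->", size_after)
```

  After the VACUUM the freelist is empty and the file is a fraction
  of its former size; the surviving rows sit packed at the front.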

  VACUUM also puts all the pages of a given database object in the same
  spot in the file.  That will help with reads, but only if the low
  level block assignments on the disk are contiguous.  Just because the
  blocks are next to each other in the file doesn't mean they're next to
  each other on the disk.

  In theory, VACUUM also gives the filesystem an opportunity
  to defragment the file itself, but it is less clear how often this
  actually happens.  VACUUM rebuilds the database into a temp file and
  then copies it back, but the copy is done as a low-level page-by-page
  overwrite -- it does not delete the old file and simply rename the
  rebuilt one into place.  This makes the whole operation
  transaction-safe (complete with rollback ability right up to the
  final moment of success), but it means the original database file
  stays more or less intact on the filesystem and is simply
  overwritten one section at a time.  Some filesystems will take
  that chance to merge the file blocks; some will not.

  If you really want the best performance, your best bet is to copy the
  file after it has been VACUUMed.  Or defragment the filesystem.  Or
  use an auto-defragmenting filesystem.
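  The vacuum-then-copy idea can be sketched like this (again with the
  stdlib sqlite3 module; paths and the table are made-up examples).
  The copy forces the filesystem to allocate fresh blocks for the
  file, which is your best shot at a contiguous on-disk layout:

```python
# Hypothetical sketch: VACUUM the database, close every handle,
# then copy the file and swap the copy into place.
import os
import shutil
import sqlite3
import tempfile

src = os.path.join(tempfile.mkdtemp(), "app.db")  # example path
db = sqlite3.connect(src)
db.execute("CREATE TABLE t(x)")
db.execute("INSERT INTO t VALUES (1)")
db.commit()
db.execute("VACUUM")
db.close()                       # no open handles while copying!

# The copy gets freshly allocated blocks from the filesystem.
dst = src + ".copy"
shutil.copyfile(src, dst)
os.replace(dst, src)             # swap the copy into place
```

  Note the close() before the copy: swapping the file out from under
  a live connection (or a hot journal) is a good way to corrupt it.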

> (I guess it well might not on an SSD disk, but on a conventional
> rotational disk, pager could read several pages ahead with one seek -
> but does it?)

  No, the pager does not.  Among other things, my feeling is that the
  locality of pages is not very strong, unless the database was just
  VACUUMed.  Memory for the page cache also tends to be very guarded.

  On a modern desktop system, it is likely the filesystem and/or the
  disk controller will do some level of read-ahead, however.  Any
  benefit will likely require contiguous physical layout.



  Overall, I agree with most others... if you're just starting
  application development, this should be way, way down on your list of
  concerns.  It may even drop off the list entirely, unless you know
  your files are significant in size, the database lifespan is long,
  and the database will be subjected to an extremely high amount of
  flux and row thrashing.

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable." -- Angela Johnson
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
