Dear all, After some time spent better understanding how the VACUUM process works, what problems we had in production and how to improve our maintenance policy[1], I've come up with a little documentation patch - basically, I think the documentation under estimates (or sometimes misses) the benefit of VACUUM FULL for scans, and the needs of VACUUM FULL if the routine VACUUM hasn't been done properly since the database was put in production. Find the patch against snapshot attached (text not filled, to ease reading). It might help others in my situation in the future.
--- doc/src/sgml/ref/vacuum.sgml 2007-02-01 00:26:04.000000000 +0100 +++ /tmp/vacuum.sgml 2007-05-15 18:32:14.000000000 +0200 @@ -164,8 +164,8 @@ <para> The <option>FULL</option> option is not recommended for routine use, but might be useful in special cases. An example is when you have deleted - most of the rows in a table and would like the table to physically shrink - to occupy less disk space. <command>VACUUM FULL</command> will usually + or updated most of the rows in a table and would like the table to physically shrink + to occupy less disk space and allow faster table scans. <command>VACUUM FULL</command> will usually shrink the table more than a plain <command>VACUUM</command> would. The <option>FULL</option> option does not shrink indexes; a periodic <command>REINDEX</> is still recommended. In fact, it is often faster --- doc/src/sgml/maintenance.sgml 2007-05-03 17:47:48.000000000 +0200 +++ /tmp/maintenance.sgml 2007-05-15 18:29:29.000000000 +0200 @@ -157,7 +157,8 @@ command. This uses a more aggressive algorithm for reclaiming the space consumed by dead row versions. Any space that is freed by <command>VACUUM FULL</command> is immediately returned to the - operating system. Unfortunately, this variant of the + operating system, and the table data is physically compacted on + the disk. Unfortunately, this variant of the <command>VACUUM</command> command acquires an exclusive lock on each table while <command>VACUUM FULL</command> is processing it. Therefore, frequently using <command>VACUUM FULL</command> can @@ -168,12 +169,15 @@ <para> The standard form of <command>VACUUM</> is best used with the goal of maintaining a fairly level steady-state usage of disk space. If - you need to return disk space to the operating system you can use + you need to return disk space to the operating system, you can use <command>VACUUM FULL</> — but what's the point of releasing disk space that will only have to be allocated again soon? Moderately frequent standard <command>VACUUM</> runs are a better approach than infrequent <command>VACUUM FULL</> runs for maintaining - heavily-updated tables. + heavily-updated tables. However, if some heavily-updated tables + have gone too long with infrequent <command>VACUUM</>, you can + use <command>VACUUM FULL</> to get performance back (it is much + slower to scan a table containing almost only dead rows). </para> <para>
Ref: [1] http://archives.postgresql.org/pgsql-performance/2006-08/msg00419.php http://archives.postgresql.org/pgsql-performance/2007-05/msg00112.php -- Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36
---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend