Re: [PERFORM] [doc patch] a slight VACUUM / VACUUM FULL doc improvement proposal
Michael Stone mstone+postgres 'at' mathom.us writes: On Tue, May 15, 2007 at 06:43:50PM +0200, Guillaume Cottenceau wrote: patch - basically, I think the documentation under estimates (or sometimes misses) the benefit of VACUUM FULL for scans, and the needs of VACUUM FULL if the routine VACUUM hasn't been done properly since the database was put in production. It's also possible to overestimate the benefit of vacuum full, leading to people vacuum full'ing almost constantly, then complaining about performance due to the associated overhead. I think there have been more people on this list whose performance problems were caused by unnecessary full vacs than by those whose performance problems were caused by insufficient full vacs. Come on, I don't suggest to remove several bold warnings about it, the best one being Therefore, frequently using VACUUM FULL can have an extremely negative effect on the performance of concurrent database queries. My point is to add the few additional mentions; I don't think the claims that VACUUM FULL physically compacts the data, and might be useful in case of too long time with infrequent VACUUM are incorrect, are they? -- Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] [doc patch] a slight VACUUM / VACUUM FULL doc improvement proposal
On Wed, May 16, 2007 at 09:41:46AM +0200, Guillaume Cottenceau wrote: Michael Stone mstone+postgres 'at' mathom.us writes: On Tue, May 15, 2007 at 06:43:50PM +0200, Guillaume Cottenceau wrote: patch - basically, I think the documentation under estimates (or sometimes misses) the benefit of VACUUM FULL for scans, and the needs of VACUUM FULL if the routine VACUUM hasn't been done properly since the database was put in production. It's also possible to overestimate the benefit of vacuum full, leading to people vacuum full'ing almost constantly, then complaining about performance due to the associated overhead. I think there have been more people on this list whose performance problems were caused by unnecessary full vacs than by those whose performance problems were caused by insufficient full vacs. Come on, I don't suggest to remove several bold warnings about it, the best one being Therefore, frequently using VACUUM FULL can have an extremely negative effect on the performance of concurrent database queries. My point is to add the few additional mentions; I don't think the claims that VACUUM FULL physically compacts the data, and might be useful in case of too long time with infrequent VACUUM are incorrect, are they? Unfortunately they are, to a degree. VACUUM FULL can create a substantial amount of churn in the indexes, resulting in bloated indexes. So often you have to REINDEX after you VACUUM FULL. Long term I think we should ditch 'VACUUM FULL' altogether and create a COMPACT command (it's very easy for users to get confused between vacuum all the databases in the cluster or vacuum the entire database and VACUUM FULL). -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] [doc patch] a slight VACUUM / VACUUM FULL doc improvement proposal
Jim C. Nasby decibel 'at' decibel.org writes: On Wed, May 16, 2007 at 09:41:46AM +0200, Guillaume Cottenceau wrote: [...] Come on, I don't suggest to remove several bold warnings about it, the best one being Therefore, frequently using VACUUM FULL can have an extremely negative effect on the performance of concurrent database queries. My point is to add the few additional mentions; I don't think the claims that VACUUM FULL physically compacts the data, and might be useful in case of too long time with infrequent VACUUM are incorrect, are they? Unfortunately they are, to a degree. VACUUM FULL can create a substantial amount of churn in the indexes, resulting in bloated indexes. So often you have to REINDEX after you VACUUM FULL. Ok, VACUUM FULL does his job (it physically compacts the data and might be useful in case of too long time with infrequent VACUUM), but we are going to not talk about it because we often needs a REINDEX after it? The natural conclusion would rather be to document the fact than REINDEX is needed after VACUUM FULL, isn't it? -- Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] [doc patch] a slight VACUUM / VACUUM FULL doc improvement proposal
Guillaume Cottenceau wrote: Jim C. Nasby decibel 'at' decibel.org writes: On Wed, May 16, 2007 at 09:41:46AM +0200, Guillaume Cottenceau wrote: [...] Come on, I don't suggest to remove several bold warnings about it, the best one being Therefore, frequently using VACUUM FULL can have an extremely negative effect on the performance of concurrent database queries. My point is to add the few additional mentions; I don't think the claims that VACUUM FULL physically compacts the data, and might be useful in case of too long time with infrequent VACUUM are incorrect, are they? Unfortunately they are, to a degree. VACUUM FULL can create a substantial amount of churn in the indexes, resulting in bloated indexes. So often you have to REINDEX after you VACUUM FULL. Ok, VACUUM FULL does his job (it physically compacts the data and might be useful in case of too long time with infrequent VACUUM), but we are going to not talk about it because we often needs a REINDEX after it? The natural conclusion would rather be to document the fact than REINDEX is needed after VACUUM FULL, isn't it? Maybe, but we should also mention that CLUSTER is a likely faster workaround. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] [doc patch] a slight VACUUM / VACUUM FULL doc improvement proposal
On Wed, May 16, 2007 at 12:09:26PM -0400, Alvaro Herrera wrote: Guillaume Cottenceau wrote: Jim C. Nasby decibel 'at' decibel.org writes: On Wed, May 16, 2007 at 09:41:46AM +0200, Guillaume Cottenceau wrote: [...] Come on, I don't suggest to remove several bold warnings about it, the best one being Therefore, frequently using VACUUM FULL can have an extremely negative effect on the performance of concurrent database queries. My point is to add the few additional mentions; I don't think the claims that VACUUM FULL physically compacts the data, and might be useful in case of too long time with infrequent VACUUM are incorrect, are they? Unfortunately they are, to a degree. VACUUM FULL can create a substantial amount of churn in the indexes, resulting in bloated indexes. So often you have to REINDEX after you VACUUM FULL. Ok, VACUUM FULL does his job (it physically compacts the data and might be useful in case of too long time with infrequent VACUUM), but we are going to not talk about it because we often needs a REINDEX after it? The natural conclusion would rather be to document the fact than REINDEX is needed after VACUUM FULL, isn't it? Maybe, but we should also mention that CLUSTER is a likely faster workaround. What this boils down to is that there should probably be a separate subsection that deals with Oh noes! My tables are too big! -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] [doc patch] a slight VACUUM / VACUUM FULL doc improvement proposal
On Wed, May 16, 2007 at 12:09:26PM -0400, Alvaro Herrera wrote: Maybe, but we should also mention that CLUSTER is a likely faster workaround. Unless, of course, you don't particularly care about the order of the items in your table; you might end up wasting vastly more time rewriting tables due to unnecessary clustering than for full vacuums on a table that doesn't need it. Mike Stone ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] [doc patch] a slight VACUUM / VACUUM FULL doc improvement proposal
[EMAIL PROTECTED] (Michael Stone) writes: On Wed, May 16, 2007 at 12:09:26PM -0400, Alvaro Herrera wrote: Maybe, but we should also mention that CLUSTER is a likely faster workaround. Unless, of course, you don't particularly care about the order of the items in your table; you might end up wasting vastly more time rewriting tables due to unnecessary clustering than for full vacuums on a table that doesn't need it. Actually, this is irrelevant. If CLUSTER is faster than VACUUM FULL (and if it isn't, in all cases, it *frequently* is, and probably will be, nearly always, soon), then it's a faster workaround. -- output = (cbbrowne @ linuxfinances.info) http://cbbrowne.com/info/oses.html What if you slept? And what if, in your sleep, you dreamed? And what if, in your dream, you went to heaven and there plucked a strange and beautiful flower? And what if, when you awoke, you had the flower in your hand? Ah, what then? --Coleridge ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] [doc patch] a slight VACUUM / VACUUM FULL doc improvement proposal
On Wed, May 16, 2007 at 03:34:42PM -0400, Chris Browne wrote: [EMAIL PROTECTED] (Michael Stone) writes: Unless, of course, you don't particularly care about the order of the items in your table; you might end up wasting vastly more time rewriting tables due to unnecessary clustering than for full vacuums on a table that doesn't need it. Actually, this is irrelevant. I think it's perfectly relevant. If CLUSTER is faster than VACUUM FULL (and if it isn't, in all cases, it *frequently* is, and probably will be, nearly always, soon), then it's a faster workaround. Cluster reorders the table. If a table doesn't have any dead rows and you tell someone to run cluster or vacuum full, the vaccuum basically won't do anything and the cluster will reorder the whole table. Cluster is great for certain access patterns, but I've been noticing this odd tendency lately to treat it like a silver bullet. Mike Stone ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] [doc patch] a slight VACUUM / VACUUM FULL doc improvement proposal
Michael Stone wrote: On Wed, May 16, 2007 at 03:34:42PM -0400, Chris Browne wrote: [EMAIL PROTECTED] (Michael Stone) writes: Unless, of course, you don't particularly care about the order of the items in your table; you might end up wasting vastly more time rewriting tables due to unnecessary clustering than for full vacuums on a table that doesn't need it. Actually, this is irrelevant. I think it's perfectly relevant. If CLUSTER is faster than VACUUM FULL (and if it isn't, in all cases, it *frequently* is, and probably will be, nearly always, soon), then it's a faster workaround. Cluster reorders the table. If a table doesn't have any dead rows and you tell someone to run cluster or vacuum full, the vaccuum basically won't do anything and the cluster will reorder the whole table. Cluster is great for certain access patterns, but I've been noticing this odd tendency lately to treat it like a silver bullet. Well, it's certainly not a silver bullet; you would use VACUUM (not full) for most of your needs, and CLUSTER for the rare other cases. Of course you would not pick an index at random each time, but rather keep using the same one, which would supposedly be faster. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] [doc patch] a slight VACUUM / VACUUM FULL doc improvement proposal
Michael Stone [EMAIL PROTECTED] writes: On Wed, May 16, 2007 at 03:34:42PM -0400, Chris Browne wrote: If CLUSTER is faster than VACUUM FULL (and if it isn't, in all cases, it *frequently* is, and probably will be, nearly always, soon), then it's a faster workaround. Cluster reorders the table. If a table doesn't have any dead rows and you tell someone to run cluster or vacuum full, the vaccuum basically won't do anything and the cluster will reorder the whole table. Cluster is great for certain access patterns, but I've been noticing this odd tendency lately to treat it like a silver bullet. Sure, but VACUUM FULL looks even less like a silver bullet. There's been talk of providing an operation that uses the same infrastructure as CLUSTER, but doesn't make any attempt to re-order the table: just seqscan the old heap, transfer still-live tuples into a new heap, then rebuild indexes from scratch. This is clearly going to be a lot faster than a VACUUM FULL under conditions in which the latter would have to move most of the tuples. Heikki just fixed one of the major objections to it (ie, CLUSTER not being MVCC-safe). The other objection is that peak transient disk space usage could be much higher than VACUUM FULL's, but still for a lot of scenarios this'd be better. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] [doc patch] a slight VACUUM / VACUUM FULL doc improvement proposal
Dear all, After some time spent better understanding how the VACUUM process works, what problems we had in production and how to improve our maintenance policy[1], I've come up with a little documentation patch - basically, I think the documentation under estimates (or sometimes misses) the benefit of VACUUM FULL for scans, and the needs of VACUUM FULL if the routine VACUUM hasn't been done properly since the database was put in production. Find the patch against snapshot attached (text not filled, to ease reading). It might help others in my situation in the future. --- doc/src/sgml/ref/vacuum.sgml 2007-02-01 00:26:04.0 +0100 +++ /tmp/vacuum.sgml 2007-05-15 18:32:14.0 +0200 @@ -164,8 +164,8 @@ para The optionFULL/option option is not recommended for routine use, but might be useful in special cases. An example is when you have deleted -most of the rows in a table and would like the table to physically shrink -to occupy less disk space. commandVACUUM FULL/command will usually +or updated most of the rows in a table and would like the table to physically shrink +to occupy less disk space and allow faster table scans. commandVACUUM FULL/command will usually shrink the table more than a plain commandVACUUM/command would. The optionFULL/option option does not shrink indexes; a periodic commandREINDEX/ is still recommended. In fact, it is often faster --- doc/src/sgml/maintenance.sgml 2007-05-03 17:47:48.0 +0200 +++ /tmp/maintenance.sgml 2007-05-15 18:29:29.0 +0200 @@ -157,7 +157,8 @@ command. This uses a more aggressive algorithm for reclaiming the space consumed by dead row versions. Any space that is freed by commandVACUUM FULL/command is immediately returned to the -operating system. Unfortunately, this variant of the +operating system, and the table data is physically compacted on +the disk. Unfortunately, this variant of the commandVACUUM/command command acquires an exclusive lock on each table while commandVACUUM FULL/command is processing it. Therefore, frequently using commandVACUUM FULL/command can @@ -168,12 +169,15 @@ para The standard form of commandVACUUM/ is best used with the goal of maintaining a fairly level steady-state usage of disk space. If -you need to return disk space to the operating system you can use +you need to return disk space to the operating system, you can use commandVACUUM FULL/ mdash; but what's the point of releasing disk space that will only have to be allocated again soon? Moderately frequent standard commandVACUUM/ runs are a better approach than infrequent commandVACUUM FULL/ runs for maintaining -heavily-updated tables. +heavily-updated tables. However, if some heavily-updated tables +have gone too long with infrequent commandVACUUM/, you can +use commandVACUUM FULL/ to get performance back (it is much +slower to scan a table containing almost only dead rows). /para para Ref: [1] http://archives.postgresql.org/pgsql-performance/2006-08/msg00419.php http://archives.postgresql.org/pgsql-performance/2007-05/msg00112.php -- Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] [doc patch] a slight VACUUM / VACUUM FULL doc improvement proposal
On Tue, May 15, 2007 at 06:43:50PM +0200, Guillaume Cottenceau wrote: patch - basically, I think the documentation under estimates (or sometimes misses) the benefit of VACUUM FULL for scans, and the needs of VACUUM FULL if the routine VACUUM hasn't been done properly since the database was put in production. It's also possible to overestimate the benefit of vacuum full, leading to people vacuum full'ing almost constantly, then complaining about performance due to the associated overhead. I think there have been more people on this list whose performance problems were caused by unnecessary full vacs than by those whose performance problems were caused by insufficient full vacs. Mike Stone ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly