On Thu, 19 Mar 2020 at 18:45, Laurenz Albe <laurenz.a...@cybertec.at> wrote: > > On Tue, 2020-03-17 at 18:02 -0700, Andres Freund wrote: > > I don't think a default scale factor of 0 is going to be ok. For > > large-ish tables this will basically cause permanent vacuums. And it'll > > sometimes trigger for tables that actually coped well so far. 10 million > > rows could be a few seconds, not more. > > > > I don't think that the argument that otherwise a table might not get > > vacuumed before autovacuum_freeze_max_age is convincing enough. > > > > a) if that's indeed the argument, we should increase the default > > autovacuum_freeze_max_age - now that there's insert triggered vacuums, > > the main argument against that from before isn't valid anymore. > > > > b) there's not really a good arguments for vacuuming more often than > > autovacuum_freeze_max_age for such tables. It'll not be not frequent > > enough to allow IOS for new data, and you're not preventing > > anti-wraparound vacuums from happening. > > According to my reckoning, that is the remaining objection to the patch > as it is (with ordinary freezing behavior). > > How about a scale_factor od 0.005? That will be high enough for large > tables, which seem to be the main concern here.
I agree with that, however, I'd thought 0.01, just so we're still close to having about 100 times less work to do for huge insert-only tables when it comes to having to perform an anti-wraparound vacuum. > I fully agree with your point a) - should that be part of the patch? I think it will be a good idea to increase this, but I really don't think this patch should be touching it. It's something to put on the issues list for after the CF so more people have the bandwidth to chip in their thoughts. > I am not sure about b). In my mind, the objective is not to prevent > anti-wraparound vacuums, but to see that they have less work to do, > because previous autovacuum runs already have frozen anything older than > vacuum_freeze_min_age. So, assuming linear growth, the number of tuples > to freeze during any run would be at most one fourth of today's number > when we hit autovacuum_freeze_max_age. I hear what Andres is saying about proactive freezing for already dirty pages. I think that's worth looking into, but don't feel like we need to do it for this patch. The patch is worthy without it and such a change affects more than insert-vacuums, so should be a separate commit. If people really do have an insert-only table then we can recommend that they set the table's autovacuum_freeze_min_age to 0. > I am still sorry to see more proactive freezing go, which would > reduce the impact for truly insert-only tables. > After sleeping on it, here is one last idea. > > Granted, freezing with vacuum_freeze_min_age = 0 poses a problem > for those parts of the table that will receive updates or deletes. > But what if insert-triggered vacuum operates with - say - > one tenth of vacuum_freeze_min_age (unless explicitly overridden > for the table)? That might still be high enough not to needlessly > freeze too many tuples that will still be modified, but it will > reduce the impact on insert-only tables. I think that might be a bit too magical and may not be what some people want. I know that most people won't set autovacuum_freeze_min_age to 0 for insert-only tables, but we can at least throw something in the documents to mention it's a good idea, however, looking over the docs I'm not too sure the best place to note that down. I've attached a small fix which I'd like to apply to your v8 patch. With that, and pending one final look, I'd like to push this during my Monday (New Zealand time). So if anyone strongly objects to that, please state their case before then. David
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 7befc63860..6cad079132 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -7342,7 +7342,7 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; Specifies a fraction of the table size to add to <varname>autovacuum_vacuum_insert_threshold</varname> when deciding whether to trigger a <command>VACUUM</command>. - The default is 0.0, which means that the table size has no effect. + The default is 0.01 (1% of table size). This parameter can only be set in the <filename>postgresql.conf</filename> file or on the server command line; but the setting can be overridden for individual tables by diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml index dbf418c62a..904fbffd94 100644 --- a/doc/src/sgml/maintenance.sgml +++ b/doc/src/sgml/maintenance.sgml @@ -777,20 +777,28 @@ vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuple <xref linkend="guc-autovacuum-vacuum-scale-factor"/>, and the number of tuples is <structname>pg_class</structname>.<structfield>reltuples</structfield>. - The number of obsolete tuples is obtained from the statistics - collector; it is a semi-accurate count updated by each - <command>UPDATE</command> and <command>DELETE</command> operation. (It - is only semi-accurate because some information might be lost under heavy - load.) If the <structfield>relfrozenxid</structfield> value of the table is more + The table is also vacuumed if the number of tuples inserted since the last + vacuum have exceeded the defined insert threshold, which is defined as: +<programlisting> +vacuum insert threshold = vacuum base insert threshold + vacuum insert scale factor * number of tuples +</programlisting> + where the vacuum insert base threshold is + <xref linkend="guc-autovacuum-vacuum-insert-threshold"/>, + and vacuum insert scale factor is + <xref linkend="guc-autovacuum-vacuum-insert-scale-factor"/>. + Such vacuums may allow the table's <structfield>relfrozenxid</structfield> + to be advanced and will reduce the chances of and the impact of any + aggressive vacuums. + The number of obsolete tuples and the number of inserted tuples are + obtained from the statistics collector; it is a semi-accurate count updated by each + <command>UPDATE</command>, <command>DELETE</command> and + <command>INSERT</command> operation. (It is only semi-accurate because + some information might be lost under heavy load.) If the + <structfield>relfrozenxid</structfield> value of the table is more than <varname>vacuum_freeze_table_age</varname> transactions old, an aggressive vacuum is performed to freeze old tuples and advance <structfield>relfrozenxid</structfield>; otherwise, only pages that have been modified since the last vacuum are scanned. - Finally, a threshold similar to the above is calculated from - <xref linkend="guc-autovacuum-vacuum-insert-threshold"/> and - <xref linkend="guc-autovacuum-vacuum-insert-scale-factor"/>. - Tables that have received more inserts than the calculated threshold - since the last <command>VACUUM</command> will also be vacuumed. </para> <para> diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index b4974a5f20..b7c526abc3 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -3545,7 +3545,7 @@ static struct config_real ConfigureNamesReal[] = NULL }, &autovacuum_vac_ins_scale, - 0.0, 0.0, 1e10, + 0.01, 0.0, 1e10, NULL, NULL, NULL }, diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index cc3df73a82..9988fe40f5 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -604,7 +604,7 @@ #autovacuum_analyze_threshold = 50 # min number of row updates before # analyze #autovacuum_vacuum_scale_factor = 0.2 # fraction of table size before vacuum -#autovacuum_vacuum_insert_scale_factor = 0.0 # fraction of table size before +#autovacuum_vacuum_insert_scale_factor = 0.01 # fraction of table size before # vacuum #autovacuum_analyze_scale_factor = 0.1 # fraction of table size before analyze #autovacuum_freeze_max_age = 200000000 # maximum XID age before forced vacuum