I had the opportunity to bring this patch up for discussion at the
developer meeting at FOSDEM PGDay last week [0].  We discussed a subset
of the topics folks have already written about in this thread, and AFAICT
there was general approval among the attendees for proceeding with the
"hard cap" approach due to its user-friendliness.  Given that, I am
planning to commit the attached patch in the near future (although I may
fiddle with the commit message a bit more).

On Tue, Jan 14, 2025 at 11:35:17PM +0300, Alena Rybakina wrote:
> #autovacuum_vacuum_max_threshold = 100000000    # max number of row updates
>                         # before vacuum; -1 disables max
>                         # threshold
> 
> I think instead of "# threshold" should be "#vacuum"?

That would more closely match the description of
autovacuum_vacuum_insert_threshold, which refers to "insert vacuums," but I
felt it would be weird to refer to "max vacuums."  IMHO it is clearer to
say that -1 disables the maximum threshold here.

> There is a typo:
> 
> * if (threshold > vac_max_thresh)
> * threshold = vac_max_thres; - here

Fixed.

> I think you should add more information to the description of the
> Relations_needs_vacanalyze function: what is vac_max_thresh and how is it
> calculated. It is not clear what the below condition means.
> 
> /* -1 is used to disable max threshold */
> vac_max_thresh= (relopts&& relopts->vacuum_max_threshold>= -1)
> ? relopts->vacuum_max_threshold
> : autovacuum_vac_max_thresh;

I looked at the commentary for this function and felt that the comments for
this new parameter are in line with the comments for all the adjacent
parameters.  There may be an opportunity to improve this commentary, but
IMHO that would be better handled in a separate patch that improved it for
all these parameters.

[0] https://2025.fosdempgday.org/devmeeting

-- 
nathan
>From 392b07ea4d1a14d045937894d29e6679199a513f Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nat...@postgresql.org>
Date: Mon, 3 Feb 2025 13:20:46 -0600
Subject: [PATCH v7 1/1] Introduce autovacuum_vacuum_max_threshold.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

One way we trigger an autovacuum on a table is to compare the
number of updated or deleted tuples with a value calculated using
autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor.
The threshold provides the base value for comparison, and the scale
factor provides the fraction of the table size to add to that
value.  This strategy ensures that smaller tables are vacuumed
after fewer updates/deletes than larger tables, which is reasonable
in many cases but can result in infrequent vacuums of very large
tables.  This is undesirable for a couple of reasons, such as very
large tables acquiring a huge amount of bloat between vacuums.

This new parameter provides a way to set a cap on the value
calculated with autovacuum_vacuum_threshold and
autovacuum_vacuum_scale_factor, which is intended to help trigger
more frequent vacuums on very large tables.  By default, it is set
to 100,000,000 tuples, but it can be disabled by setting it to -1.
It can also be adjusted for individual tables by changing storage
parameters.

Co-authored-by: Frédéric Yhuel <frederic.yh...@dalibo.com>
Reviewed-by: Melanie Plageman <melanieplage...@gmail.com>
Reviewed-by: Robert Haas <robertmh...@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.a...@cybertec.at>
Reviewed-by: Michael Banck <mba...@gmx.net>
Reviewed-by: Joe Conway <m...@joeconway.com>
Reviewed-by: Sami Imseih <samims...@gmail.com>
Reviewed-by: David Rowley <dgrowle...@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhu...@gmail.com>
Reviewed-by: Robert Treat <r...@xzilla.net>
Reviewed-by: Alena Rybakina <a.rybak...@postgrespro.ru>
Discussion: 
https://postgr.es/m/956435f8-3b2f-47a6-8756-8c54ded61802%40dalibo.com
---
 doc/src/sgml/config.sgml                      | 24 +++++++++++++++++++
 doc/src/sgml/maintenance.sgml                 |  6 +++--
 doc/src/sgml/ref/create_table.sgml            | 15 ++++++++++++
 src/backend/access/common/reloptions.c        | 11 +++++++++
 src/backend/postmaster/autovacuum.c           | 12 ++++++++++
 src/backend/utils/misc/guc_tables.c           |  9 +++++++
 src/backend/utils/misc/postgresql.conf.sample |  3 +++
 src/bin/psql/tab-complete.in.c                |  2 ++
 src/include/postmaster/autovacuum.h           |  1 +
 src/include/utils/rel.h                       |  1 +
 10 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a782f109982..1e4bb613a98 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8580,6 +8580,30 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH 
csv;
        </listitem>
       </varlistentry>
 
+      <varlistentry id="guc-autovacuum-vacuum-max-threshold" 
xreflabel="autovacuum_vacuum_max_threshold">
+       <term><varname>autovacuum_vacuum_max_threshold</varname> 
(<type>integer</type>)
+       <indexterm>
+        <primary><varname>autovacuum_vacuum_max_threshold</varname></primary>
+        <secondary>configuration parameter</secondary>
+       </indexterm>
+       </term>
+       <listitem>
+        <para>
+         Specifies the maximum number of updated or deleted tuples needed to
+         trigger a <command>VACUUM</command> in any one table, i.e., a cap on
+         the value calculated with
+         <varname>autovacuum_vacuum_threshold</varname> and
+         <varname>autovacuum_vacuum_scale_factor</varname>.  The default is
+         100,000,000 tuples.  If -1 is specified, autovacuum will not enforce a
+         maximum number of updated or deleted tuples that will trigger a
+         <command>VACUUM</command> operation.  This parameter can only be set
+         in the <filename>postgresql.conf</filename> file or on the server
+         command line; but the setting can be overridden for individual tables
+         by changing storage parameters.
+        </para>
+       </listitem>
+      </varlistentry>
+
       <varlistentry id="guc-autovacuum-vacuum-insert-threshold" 
xreflabel="autovacuum_vacuum_insert_threshold">
        <term><varname>autovacuum_vacuum_insert_threshold</varname> 
(<type>integer</type>)
        <indexterm>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 0be90bdc7ef..f84ad7557d9 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -895,9 +895,11 @@ HINT:  Execute a database-wide VACUUM in that database.
     <command>VACUUM</command> exceeds the <quote>vacuum threshold</quote>, the
     table is vacuumed.  The vacuum threshold is defined as:
 <programlisting>
-vacuum threshold = vacuum base threshold + vacuum scale factor * number of 
tuples
+vacuum threshold = Minimum(vacuum max threshold, vacuum base threshold + 
vacuum scale factor * number of tuples)
 </programlisting>
-    where the vacuum base threshold is
+    where the vacuum max threshold is
+    <xref linkend="guc-autovacuum-vacuum-max-threshold"/>,
+    the vacuum base threshold is
     <xref linkend="guc-autovacuum-vacuum-threshold"/>,
     the vacuum scale factor is
     <xref linkend="guc-autovacuum-vacuum-scale-factor"/>,
diff --git a/doc/src/sgml/ref/create_table.sgml 
b/doc/src/sgml/ref/create_table.sgml
index 2237321cb4f..417498f71db 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1712,6 +1712,21 @@ WITH ( MODULUS <replaceable 
class="parameter">numeric_literal</replaceable>, REM
     </listitem>
    </varlistentry>
 
+   <varlistentry id="reloption-autovacuum-vacuum-max-threshold" 
xreflabel="autovacuum_vacuum_max_threshold">
+    <term><literal>autovacuum_vacuum_max_threshold</literal>, 
<literal>toast.autovacuum_vacuum_max_threshold</literal> (<type>integer</type>)
+    <indexterm>
+     <primary><varname>autovacuum_vacuum_max_threshold</varname></primary>
+     <secondary>storage parameter</secondary>
+    </indexterm>
+    </term>
+   <listitem>
+    <para>
+     Per-table value for <xref linkend="guc-autovacuum-vacuum-max-threshold"/>
+     parameter.
+    </para>
+   </listitem>
+  </varlistentry>
+
    <varlistentry id="reloption-autovacuum-vacuum-scale-factor" 
xreflabel="autovacuum_vacuum_scale_factor">
     <term><literal>autovacuum_vacuum_scale_factor</literal>, 
<literal>toast.autovacuum_vacuum_scale_factor</literal> (<type>floating 
point</type>)
     <indexterm>
diff --git a/src/backend/access/common/reloptions.c 
b/src/backend/access/common/reloptions.c
index e587abd9990..5731cf42f54 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -231,6 +231,15 @@ static relopt_int intRelOpts[] =
                },
                -1, 0, INT_MAX
        },
+       {
+               {
+                       "autovacuum_vacuum_max_threshold",
+                       "Maximum number of tuple updates or deletes prior to 
vacuum",
+                       RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
+                       ShareUpdateExclusiveLock
+               },
+               -2, -1, INT_MAX
+       },
        {
                {
                        "autovacuum_vacuum_insert_threshold",
@@ -1843,6 +1852,8 @@ default_reloptions(Datum reloptions, bool validate, 
relopt_kind kind)
                offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, 
enabled)},
                {"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
                offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, 
vacuum_threshold)},
+               {"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
+               offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, 
vacuum_max_threshold)},
                {"autovacuum_vacuum_insert_threshold", RELOPT_TYPE_INT,
                offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, 
vacuum_ins_threshold)},
                {"autovacuum_analyze_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/postmaster/autovacuum.c 
b/src/backend/postmaster/autovacuum.c
index 0ab921a169b..09ec9bb6990 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -120,6 +120,7 @@ int                 autovacuum_max_workers;
 int                    autovacuum_work_mem = -1;
 int                    autovacuum_naptime;
 int                    autovacuum_vac_thresh;
+int                    autovacuum_vac_max_thresh;
 double         autovacuum_vac_scale;
 int                    autovacuum_vac_ins_thresh;
 double         autovacuum_vac_ins_scale;
@@ -2895,6 +2896,8 @@ recheck_relation_needs_vacanalyze(Oid relid,
  * threshold.  This threshold is calculated as
  *
  * threshold = vac_base_thresh + vac_scale_factor * reltuples
+ * if (threshold > vac_max_thresh)
+ *     threshold = vac_max_thresh;
  *
  * For analyze, the analysis done is that the number of tuples inserted,
  * deleted and updated since the last analyze exceeds a threshold calculated
@@ -2933,6 +2936,7 @@ relation_needs_vacanalyze(Oid relid,
 
        /* constants from reloptions or GUC variables */
        int                     vac_base_thresh,
+                               vac_max_thresh,
                                vac_ins_base_thresh,
                                anl_base_thresh;
        float4          vac_scale_factor,
@@ -2974,6 +2978,11 @@ relation_needs_vacanalyze(Oid relid,
                ? relopts->vacuum_threshold
                : autovacuum_vac_thresh;
 
+       /* -1 is used to disable max threshold */
+       vac_max_thresh = (relopts && relopts->vacuum_max_threshold >= -1)
+               ? relopts->vacuum_max_threshold
+               : autovacuum_vac_max_thresh;
+
        vac_ins_scale_factor = (relopts && relopts->vacuum_ins_scale_factor >= 
0)
                ? relopts->vacuum_ins_scale_factor
                : autovacuum_vac_ins_scale;
@@ -3047,6 +3056,9 @@ relation_needs_vacanalyze(Oid relid,
                        reltuples = 0;
 
                vacthresh = (float4) vac_base_thresh + vac_scale_factor * 
reltuples;
+               if (vac_max_thresh >= 0 && vacthresh > (float4) vac_max_thresh)
+                       vacthresh = (float4) vac_max_thresh;
+
                vacinsthresh = (float4) vac_ins_base_thresh + 
vac_ins_scale_factor * reltuples;
                anlthresh = (float4) anl_base_thresh + anl_scale_factor * 
reltuples;
 
diff --git a/src/backend/utils/misc/guc_tables.c 
b/src/backend/utils/misc/guc_tables.c
index 71448bb4fdd..b887d3e5983 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3426,6 +3426,15 @@ struct config_int ConfigureNamesInt[] =
                50, 0, INT_MAX,
                NULL, NULL, NULL
        },
+       {
+               {"autovacuum_vacuum_max_threshold", PGC_SIGHUP, 
VACUUM_AUTOVACUUM,
+                       gettext_noop("Maximum number of tuple updates or 
deletes prior to vacuum."),
+                       NULL
+               },
+               &autovacuum_vac_max_thresh,
+               100000000, -1, INT_MAX,
+               NULL, NULL, NULL
+       },
        {
                {"autovacuum_vacuum_insert_threshold", PGC_SIGHUP, 
VACUUM_AUTOVACUUM,
                        gettext_noop("Minimum number of tuple inserts prior to 
vacuum, or -1 to disable insert vacuums."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample 
b/src/backend/utils/misc/postgresql.conf.sample
index 079efa1baa7..d38112f3943 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -661,6 +661,9 @@ autovacuum_worker_slots = 16        # autovacuum worker 
slots to allocate
 #autovacuum_naptime = 1min             # time between autovacuum runs
 #autovacuum_vacuum_threshold = 50      # min number of row updates before
                                        # vacuum
+#autovacuum_vacuum_max_threshold = 100000000   # max number of row updates
+                                               # before vacuum; -1 disables max
+                                               # threshold
 #autovacuum_vacuum_insert_threshold = 1000     # min number of row inserts
                                                # before vacuum; -1 disables 
insert
                                                # vacuums
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 81cbf10aa28..5f6897c8486 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1368,6 +1368,7 @@ static const char *const table_storage_parameters[] = {
        "autovacuum_vacuum_cost_limit",
        "autovacuum_vacuum_insert_scale_factor",
        "autovacuum_vacuum_insert_threshold",
+       "autovacuum_vacuum_max_threshold",
        "autovacuum_vacuum_scale_factor",
        "autovacuum_vacuum_threshold",
        "fillfactor",
@@ -1384,6 +1385,7 @@ static const char *const table_storage_parameters[] = {
        "toast.autovacuum_vacuum_cost_limit",
        "toast.autovacuum_vacuum_insert_scale_factor",
        "toast.autovacuum_vacuum_insert_threshold",
+       "toast.autovacuum_vacuum_max_threshold",
        "toast.autovacuum_vacuum_scale_factor",
        "toast.autovacuum_vacuum_threshold",
        "toast.log_autovacuum_min_duration",
diff --git a/src/include/postmaster/autovacuum.h 
b/src/include/postmaster/autovacuum.h
index 54e01c81d68..06d4a593575 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -33,6 +33,7 @@ extern PGDLLIMPORT int autovacuum_max_workers;
 extern PGDLLIMPORT int autovacuum_work_mem;
 extern PGDLLIMPORT int autovacuum_naptime;
 extern PGDLLIMPORT int autovacuum_vac_thresh;
+extern PGDLLIMPORT int autovacuum_vac_max_thresh;
 extern PGDLLIMPORT double autovacuum_vac_scale;
 extern PGDLLIMPORT int autovacuum_vac_ins_thresh;
 extern PGDLLIMPORT double autovacuum_vac_ins_scale;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 33d1e4a4e2e..48b95f211f3 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -309,6 +309,7 @@ typedef struct AutoVacOpts
 {
        bool            enabled;
        int                     vacuum_threshold;
+       int                     vacuum_max_threshold;
        int                     vacuum_ins_threshold;
        int                     analyze_threshold;
        int                     vacuum_cost_limit;
-- 
2.39.5 (Apple Git-154)

Reply via email to