On Mon, Apr 3, 2023 at 8:37 PM David Rowley <dgrowle...@gmail.com> wrote:
>
> On Tue, 4 Apr 2023 at 02:49, Melanie Plageman <melanieplage...@gmail.com> 
> wrote:
> > v9 attached.
>
> I've made a pass on the v9-0001 patch only.  Here's what I noted down:

Thanks for the review!

Attached v10 addresses the review feedback below.

Remaining TODOs:
- tests
- do something about config reload changing GUC

> v9-0001:
>
> 1. In the documentation and comments, generally we always double-space
> after a period.  I see quite often you're not following this.

I've gone through and done this. I noticed after building the docs that
it doesn't seem to affect how many spaces are after a period in the
rendered docs, but I suppose it affects readability when editing the
sgml files.

> 2. Doc: We could generally seem to break tags within paragraphs into
> multiple lines. You're doing that quite a bit, e.g:
>
>         linkend="glossary-buffer-access-strategy">Buffer Access
>         Strategy</glossterm>. <literal>0</literal> will disable use of a

I've updated all of the ones I could find that I did this with.

> 2. This is not a command
>
>         <command>BUFFER_USAGE_LIMIT</command> parameter.
>
> <option> is probably what you want.

I have gone through and attempted to correct all
option/command/application tag usages.

> 3. I'm not sure I agree that it's a good idea to refer to the strategy
> with multiple different names. Here you've called it a "ring buffer",
> but in the next sentence, you're calling it a Buffer Access Strategy.
>
>       Specifies the ring buffer size for <command>VACUUM</command>. This size
>       is used to calculate the number of shared buffers which will be reused 
> as
>       part of a <glossterm linkend="glossary-buffer-access-strategy">Buffer
>       Access Strategy</glossterm>. <literal>0</literal> disables use of a

I've updated this to always prefix any use of ring with "Buffer Access
Strategy". I don't know how you'll feel about it. It felt awkward in
some places to use Buffer Access Strategy as a complete stand-in for
ring buffer.

> 4. Can you explain your choice in not just making < 128 a hard error
> rather than clamping?
>
> I guess it means checks like this are made more simple, but that does
> not seem like a good enough reason:
>
> /* check for out-of-bounds */
> if (result < -1 || result > MAX_BAS_RING_SIZE_KB)
>
> postgres=# vacuum (parallel -1) pg_class;
> ERROR:  parallel workers for vacuum must be between 0 and 1024
>
> Maybe the above is a good guide to follow.
>
> To allow you to get rid of the clamping code, you'd likely need an
> assign hook function for vacuum_buffer_usage_limit.

I've added a check hook and replicated the same restrictions in
ExecVacuum() where it parses the limit. I have included enforcement of
the conditional limit that the ring cannot occupy more than 1/8 of
shared buffers. The immediate consequence of this was that my tests were
no longer stable (except for the integer overflow one).
I have removed them for now until I can come up with a better testing
strategy.

On the topic of testing, I also thought we should remove the
VACUUM(BUFFER_USAGE_LIMIT X, PARALLEL X) test. Though the parallel
workers do make their own strategy rings and such a test would be
covering some code, I am hesitant to write a test that would never
really fail. The observable behavior of not using a strategy will be
either 1) basically nothing or 2) the same for parallel and
non-parallel. What do you think?

> 5. I see vacuum.sgml is full of inconsistencies around the use of
> <literal> vs <option>. I was going to complain about your:
>
>       <literal>ONLY_DATABASE_STATS</literal> option. If
>       <literal>ANALYZE</literal> is also specified, the
>       <literal>BUFFER_USAGE_LIMIT</literal> value is used for both the vacuum
>
> but I see you've likely just copied what's nearby.
>
> There are also plenty of usages of <option> in that file.  I'd rather
> see you use <option>. Maybe there can be some other patch that sweeps
> the entire docs to look for <literal>OPTION_NAME</literal> and
> replaces them to use <option>.

I haven't done the separate sweep patch, but I have updated my own
usages in this set.

> 6. I was surprised to see you've added both
> GetAccessStrategyWithSize() and GetAccessStrategyWithNBuffers(). I
> think the former is suitable for both. GetAccessStrategyWithNBuffers()
> seems to be just used once outside of freelist.c

This has been updated and reorganized.

> 7. I don't think bas_nbuffers() is a good name for an external
> function.  StrategyGetBufferCount() seems better.

I've used this name.

> 8. I don't quite follow this comment:
>
> /*
> * TODO: should this be 0 so that we are sure that vacuum() never
> * allocates a new bstrategy for us, even if we pass in NULL for that
> * parameter? maybe could change how failsafe NULLs out bstrategy if
> * so?
> */
>
> Can you explain under what circumstances would vacuum() allocate a
> bstrategy when do_autovacuum() would not? Is this a case of a config
> reload where someone changes vacuum_buffer_usage_limit from 0 to
> something non-zero? If so, perhaps do_autovacuum() needs to detect
> this and allocate a strategy rather than having vacuum() do it once
> per table (wastefully).

Hmm. Yes, I started hacking on this, but I think it might be a bit
tricky to get right. I think it would make sense to check if
vacuum_buffer_usage_limit goes from 0 to not 0 or from not 0 to 0 and
allow disabling and enabling the buffer access strategy, however, I'm
not sure we want to allow changing the size during an autovacuum
worker's run. I started writing code to just allow enabling and
disabling, but I'm a little concerned that the distinction will be
difficult to understand for the user with no obvious indication of what
is happening. That is, you change the size and it silently does nothing,
but you set it to/from 0 and it silently does something?

One alternative for now is to save the ring size before looping through
the relations in do_autovacuum() and always restore that value in
tab->at_params.ring_size in table_recheck_autovac().

I'm not sure what to do.

> 9. buffer/README.  I think it might be overkill to document details
> about how the new vacuum option works in a section talking about
> Buffer Ring Replacement Strategy.  Perhaps it just worth something
> like:
>
> "In v16, the 256KB ring was made configurable by way of the
> vacuum_buffer_usage_limit GUC and the BUFFER_USAGE_LIMIT VACUUM
> option."

I've made the change you suggested.

> 10. I think if you do #4 then you can get rid of all the range checks
> and DEBUG1 elogs in GetAccessStrategyWithSize().

Done.

> 11. This seems a bit badly done:
>
> int vacuum_buffer_usage_limit = -1;
>
> int VacuumCostPageHit = 1; /* GUC parameters for vacuum */
> int VacuumCostPageMiss = 2;
> int VacuumCostPageDirty = 20;
>
> I'd class vacuum_buffer_usage_limit as a "GUC parameters for vacuum"
> too.  Probably the CamelCase naming should be followed too.

I've made this change.

> 12. ANALYZE too?
>
> {"vacuum_buffer_usage_limit", PGC_USERSET, RESOURCES_MEM,
> gettext_noop("Sets the buffer pool size for VACUUM and autovacuum."),

I've mentioned this here and also added the option for ANALYZE.

> 13. VacuumParams.ring_size has no comments explaining what it is.

I've added one.

> 14. vacuum_buffer_usage_limit seems to be lumped in with unrelated GUCs
>
> extern PGDLLIMPORT int maintenance_work_mem;
> extern PGDLLIMPORT int max_parallel_maintenance_workers;
> +extern PGDLLIMPORT int vacuum_buffer_usage_limit;
>
> extern PGDLLIMPORT int VacuumCostPageHit;
> extern PGDLLIMPORT int VacuumCostPageMiss;

I've moved it down a line.

> 15. No comment explaining what these are:
>
> #define MAX_BAS_RING_SIZE_KB (16 * 1024 * 1024)
> #define MIN_BAS_RING_SIZE_KB 128

I've added one.

> 16. Parameter names in function declaration and definition don't match in:
>
> extern BufferAccessStrategy
> GetAccessStrategyWithNBuffers(BufferAccessStrategyType btype, int
> nbuffers);
> extern BufferAccessStrategy
> GetAccessStrategyWithSize(BufferAccessStrategyType btype, int
> nbuffers);

I've fixed this.

> Also, line wraps at 79 chars. (80 including line feed)

I've fixed that function prototype instance of it.

In general line wrap limit + pgindent can be quite challenging. I often
break something onto multiple lines to appease the line limit and then
pgindent will add an absurd number of tabs to align the second line in a
way that looks truly awful. I try to make local variables when this is a
problem, but it is often quite annoying to do that. I wish there was
some way to make pgindent do something different in these cases.

> 17. If you want to test the 16GB upper limit, maybe going 1KB (or
> 8KB?) rather than 1GB over 16GB is better? 2097153kB, I think.
>
> VACUUM (BUFFER_USAGE_LIMIT '17 GB') vac_option_tab;

I've removed this test for now until I figure out a way to actually hit
this reliably with different-sized shared buffers.

- Melanie
From bacae5fac52f64a903f2d35758d443112a8e30cd Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Sun, 19 Mar 2023 18:00:28 -0400
Subject: [PATCH v10 2/2] Add buffer-usage-limit option to vacuumdb

---
 doc/src/sgml/ref/vacuumdb.sgml | 13 +++++++++++++
 src/bin/scripts/vacuumdb.c     | 23 +++++++++++++++++++++++
 src/include/commands/vacuum.h  |  3 +++
 3 files changed, 39 insertions(+)

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 74bac2d4ba..8280cf0fb0 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -278,6 +278,19 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry>
+      <term><option>--buffer-usage-limit <replaceable class="parameter">buffer_usage_limit</replaceable></option></term>
+      <listitem>
+       <para>
+        Specifies the
+        <glossterm linkend="glossary-buffer-access-strategy">Buffer Access Strategy</glossterm>
+        ring buffer size for a given invocation of <application>vacuumdb</application>.
+        This size is used to calculate the number of shared buffers which will
+        be reused as part of this strategy.  See <xref linkend="sql-vacuum"/>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-n <replaceable class="parameter">schema</replaceable></option></term>
       <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term>
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 39be265b5b..941eac7727 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -46,6 +46,7 @@ typedef struct vacuumingOptions
 	bool		process_main;
 	bool		process_toast;
 	bool		skip_database_stats;
+	char	   *buffer_usage_limit;
 } vacuumingOptions;
 
 /* object filter options */
@@ -123,6 +124,7 @@ main(int argc, char *argv[])
 		{"no-truncate", no_argument, NULL, 10},
 		{"no-process-toast", no_argument, NULL, 11},
 		{"no-process-main", no_argument, NULL, 12},
+		{"buffer-usage-limit", required_argument, NULL, 13},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -147,6 +149,7 @@ main(int argc, char *argv[])
 	/* initialize options */
 	memset(&vacopts, 0, sizeof(vacopts));
 	vacopts.parallel_workers = -1;
+	vacopts.buffer_usage_limit = NULL;
 	vacopts.no_index_cleanup = false;
 	vacopts.force_index_cleanup = false;
 	vacopts.do_truncate = true;
@@ -266,6 +269,9 @@ main(int argc, char *argv[])
 			case 12:
 				vacopts.process_main = false;
 				break;
+			case 13:
+				vacopts.buffer_usage_limit = pg_strdup(optarg);
+				break;
 			default:
 				/* getopt_long already emitted a complaint */
 				pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -343,6 +349,11 @@ main(int argc, char *argv[])
 		pg_fatal("cannot use the \"%s\" option with the \"%s\" option",
 				 "no-index-cleanup", "force-index-cleanup");
 
+	/* buffer-usage-limit doesn't make sense with VACUUM FULL */
+	if (vacopts.buffer_usage_limit && vacopts.full)
+		pg_fatal("cannot use the \"%s\" option with the \"%s\" option",
+				 "buffer-usage-limit", "full");
+
 	/* fill cparams except for dbname, which is set below */
 	cparams.pghost = host;
 	cparams.pgport = port;
@@ -550,6 +561,10 @@ vacuum_one_database(ConnParams *cparams,
 		pg_fatal("cannot use the \"%s\" option on server versions older than PostgreSQL %s",
 				 "--parallel", "13");
 
+	if (vacopts->buffer_usage_limit && PQserverVersion(conn) < 160000)
+		pg_fatal("cannot use the \"%s\" option on server versions older than PostgreSQL %s",
+				 "--buffer-usage-limit", "16");
+
 	/* skip_database_stats is used automatically if server supports it */
 	vacopts->skip_database_stats = (PQserverVersion(conn) >= 160000);
 
@@ -1048,6 +1063,13 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion,
 								  vacopts->parallel_workers);
 				sep = comma;
 			}
+			if (vacopts->buffer_usage_limit)
+			{
+				Assert(serverVersion >= 160000);
+				appendPQExpBuffer(sql, "%sBUFFER_USAGE_LIMIT '%s'", sep,
+								  vacopts->buffer_usage_limit);
+				sep = comma;
+			}
 			if (sep != paren)
 				appendPQExpBufferChar(sql, ')');
 		}
@@ -1111,6 +1133,7 @@ help(const char *progname)
 	printf(_("      --force-index-cleanup       always remove index entries that point to dead tuples\n"));
 	printf(_("  -j, --jobs=NUM                  use this many concurrent connections to vacuum\n"));
 	printf(_("      --min-mxid-age=MXID_AGE     minimum multixact ID age of tables to vacuum\n"));
+	printf(_("      --buffer-usage-limit=BUFSIZE size of ring buffer used for vacuum\n"));
 	printf(_("      --min-xid-age=XID_AGE       minimum transaction ID age of tables to vacuum\n"));
 	printf(_("      --no-index-cleanup          don't remove index entries that point to dead tuples\n"));
 	printf(_("      --no-process-main           skip the main relation\n"));
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 2ee5147e97..f027d3ed71 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -213,6 +213,9 @@ typedef enum VacOptValue
  *
  * Note that at least one of VACOPT_VACUUM and VACOPT_ANALYZE must be set
  * in options.
+ *
+ * When adding a new VacuumParam member, consider adding it to vacuumdb as
+ * well.
  */
 typedef struct VacuumParams
 {
-- 
2.37.2

From bab9269bde542fdfe82bce0171a5031a6c5afa83 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Sun, 19 Mar 2023 18:00:08 -0400
Subject: [PATCH v10 1/2] Add VACUUM BUFFER_USAGE_LIMIT option and GUC

Add GUC, vacuum_buffer_usage_limit, and VACUUM option,
BUFFER_USAGE_LIMIT, through which the user can specify the maximum size
to use for buffers for VACUUM, ANALYZE, and autovacuum. The size is
converted into a number of shared buffers which are tracked in a
BufferAccessStrategyData object. The explicit VACUUM option, when
specified, overrides the GUC value, unless it is specified as -1.

Reviewed-by: David Rowley <dgrowle...@gmail.com>
Reviewed-by: Justin Pryzby <pry...@telsasoft.com>
Discussion: https://www.postgresql.org/message-id/flat/20230111182720.ejifsclfwymw2reb%40awork3.anarazel.de
---
 doc/src/sgml/config.sgml                      | 30 ++++++
 doc/src/sgml/ref/analyze.sgml                 | 26 ++++-
 doc/src/sgml/ref/vacuum.sgml                  | 22 +++++
 src/backend/commands/vacuum.c                 | 85 +++++++++++++++-
 src/backend/commands/vacuumparallel.c         | 14 ++-
 src/backend/postmaster/autovacuum.c           | 19 +++-
 src/backend/storage/buffer/README             |  4 +
 src/backend/storage/buffer/freelist.c         | 98 +++++++++++++++++--
 src/backend/utils/init/globals.c              |  5 +-
 src/backend/utils/misc/guc_tables.c           | 11 +++
 src/backend/utils/misc/postgresql.conf.sample |  4 +
 src/bin/psql/tab-complete.c                   |  4 +-
 src/include/commands/vacuum.h                 |  6 ++
 src/include/miscadmin.h                       |  2 +
 src/include/storage/bufmgr.h                  | 35 +++++++
 src/include/utils/guc_hooks.h                 |  3 +
 src/test/regress/expected/vacuum.out          |  4 +
 src/test/regress/sql/vacuum.sql               |  3 +
 18 files changed, 357 insertions(+), 18 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index bcc49aec45..7e6af71af3 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2001,6 +2001,36 @@ include_dir 'conf.d'
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-vacuum-buffer-usage-limit" xreflabel="vacuum_buffer_usage_limit">
+      <term>
+       <varname>vacuum_buffer_usage_limit</varname> (<type>integer</type>)
+       <indexterm>
+        <primary><varname>vacuum_buffer_usage_limit</varname> configuration parameter</primary>
+       </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Specifies the size of <varname>shared_buffers</varname> to be reused
+        for each backend participating in a given invocation of
+        <command>VACUUM</command> or <command>ANALYZE</command> or in
+        autovacuum.  This size is converted to the number of shared buffers
+        which will be reused as part of a
+        <glossterm linkend="glossary-buffer-access-strategy">Buffer Access Strategy</glossterm>.
+        <literal>0</literal> will disable use of a <literal>Buffer Access Strategy</literal>.
+        <literal>-1</literal> will set the size to a default of
+        <literal>256 KB</literal>.  The maximum size is
+        <literal>16 GB</literal>.  The minimum size is the lesser
+        of 1/8 the size of shared buffers and <literal>128 KB</literal>.  The
+        default value is <literal>-1</literal>. If this value is specified
+        without units, it is taken as kilobytes. This parameter can be set at
+        any time. It can be overridden for
+        <link linkend="sql-vacuum"><command>VACUUM</command></link> and
+        <link linkend="sql-analyze"><command>ANALYZE</command></link>
+        when passing the <option>BUFFER_USAGE_LIMIT</option> option.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-logical-decoding-work-mem" xreflabel="logical_decoding_work_mem">
       <term><varname>logical_decoding_work_mem</varname> (<type>integer</type>)
       <indexterm>
diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 2f94e89cb0..5b1fe1d860 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -28,6 +28,7 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
 
     VERBOSE [ <replaceable class="parameter">boolean</replaceable> ]
     SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
+    BUFFER_USAGE_LIMIT [ <replaceable class="parameter">string</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -51,9 +52,14 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
    Without a <replaceable class="parameter">table_and_columns</replaceable>
    list, <command>ANALYZE</command> processes every table and materialized view
    in the current database that the current user has permission to analyze.
-   With a list, <command>ANALYZE</command> processes only those table(s).
-   It is further possible to give a list of column names for a table,
-   in which case only the statistics for those columns are collected.
+   With a list, <command>ANALYZE</command> processes only those table(s). It is
+   further possible to give a list of column names for a table, in which case
+   only the statistics for those columns are collected.
+   <command>ANALYZE</command> uses a
+   <glossterm linkend="glossary-buffer-access-strategy">Buffer Access Strategy</glossterm>
+   when reading in the sample data. The number of buffers consumed for this can
+   be controlled by <xref linkend="guc-vacuum-buffer-usage-limit"/> or by using
+   the <option>BUFFER_USAGE_LIMIT</option> option.
   </para>
 
   <para>
@@ -95,6 +101,20 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>BUFFER_USAGE_LIMIT</literal></term>
+    <listitem>
+     <para>
+      Specifies the
+      <glossterm linkend="glossary-buffer-access-strategy">Buffer Access Strategy</glossterm>
+      ring buffer size for <command>ANALYZE</command>.  See the
+      <link linkend="sql-vacuum"><command>VACUUM</command></link> option with
+      the same name.
+     </para>
+    </listitem>
+   </varlistentry>
+
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index b6d30b5764..7ebac56498 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -39,6 +39,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     PARALLEL <replaceable class="parameter">integer</replaceable>
     SKIP_DATABASE_STATS [ <replaceable class="parameter">boolean</replaceable> ]
     ONLY_DATABASE_STATS [ <replaceable class="parameter">boolean</replaceable> ]
+    BUFFER_USAGE_LIMIT [ <replaceable class="parameter">string</replaceable> ]
 
 <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
 
@@ -345,6 +346,27 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><literal>BUFFER_USAGE_LIMIT</literal></term>
+    <listitem>
+     <para>
+      Specifies the
+      <glossterm linkend="glossary-buffer-access-strategy">Buffer Access Strategy</glossterm>
+      ring buffer size for <command>VACUUM</command>.  This size is used to
+      calculate the number of shared buffers which will be reused as part of
+      this strategy. <literal>0</literal> disables use of a
+      <literal>Buffer Access Strategy</literal>.  <literal>-1</literal>
+      indicates that <command>VACUUM</command> should fall back to the value
+      specified by <xref linkend="guc-vacuum-buffer-usage-limit"/>.
+      This option can't be used with the <option>FULL</option> option or
+      <option>ONLY_DATABASE_STATS</option> option.  If
+      <option>ANALYZE</option> is also specified, the
+      <option>BUFFER_USAGE_LIMIT</option> value is used for both the vacuum
+      and analyze stages.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">boolean</replaceable></term>
     <listitem>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index da85330ef4..22365ee473 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -124,6 +124,9 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	/* By default parallel vacuum is enabled */
 	params.nworkers = 0;
 
+	/* by default use buffer access strategy with default size */
+	params.ring_size = -1;
+
 	/* Parse options list */
 	foreach(lc, vacstmt->options)
 	{
@@ -134,6 +137,65 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 			verbose = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "skip_locked") == 0)
 			skip_locked = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "buffer_usage_limit") == 0)
+		{
+			const char *hintmsg;
+			int			result;
+			int			sb_limit_kb;
+			char	   *vac_buffer_size;
+			int			blcksz_kb = BLCKSZ / 1024;
+
+			if (opt->arg == NULL)
+			{
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("buffer_usage_limit option requires a valid value"),
+						 parser_errposition(pstate, opt->location)));
+			}
+
+			vac_buffer_size = defGetString(opt);
+
+			if (!parse_int(vac_buffer_size, &result, GUC_UNIT_KB, &hintmsg))
+			{
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("value: \"%s\": is invalid for buffer_usage_limit",
+								vac_buffer_size),
+						 hintmsg ? errhint("%s", _(hintmsg)) : 0));
+			}
+
+			/* 0 and -1 are okay */
+			if (result != -1 && result != 0)
+			{
+				sb_limit_kb = StrategyGetClampedBufsize(result);
+
+				/*
+				 * Check that specified size does not exceed allowed fraction
+				 * of shared buffers.
+				 */
+				if (result != sb_limit_kb)
+				{
+					ereport(ERROR,
+							(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+							 errmsg("buffer_usage_limit for vacuum cannot exceed %d KB because shared buffers is %d KB",
+									sb_limit_kb, NBuffers * blcksz_kb)));
+				}
+
+				/*
+				 * Check that specified size falls within the hard upper and
+				 * lower limits.
+				 */
+				if (result < MIN_BAS_RING_SIZE_KB || result > MAX_BAS_RING_SIZE_KB)
+				{
+					ereport(ERROR,
+							(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+							 errmsg("buffer_usage_limit for a vacuum must be between %d KB and %d KB",
+									MIN_BAS_RING_SIZE_KB, MAX_BAS_RING_SIZE_KB)));
+				}
+			}
+
+			params.ring_size = result;
+		}
 		else if (!vacstmt->is_vacuumcmd)
 			ereport(ERROR,
 					(errcode(ERRCODE_SYNTAX_ERROR),
@@ -238,6 +300,16 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("VACUUM FULL cannot be performed in parallel")));
 
+	if ((params.options & VACOPT_FULL) && params.ring_size > -1)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("BUFFER_USAGE_LIMIT cannot be specified for VACUUM FULL")));
+
+	if ((params.options & VACOPT_ONLY_DATABASE_STATS) && params.ring_size > -1)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("BUFFER_USAGE_LIMIT cannot be specified for VACUUM ONLY_DATABASE_STATS")));
+
 	/*
 	 * Make sure VACOPT_ANALYZE is specified if any column lists are present.
 	 */
@@ -401,7 +473,18 @@ vacuum(List *relations, VacuumParams *params,
 	{
 		MemoryContext old_context = MemoryContextSwitchTo(vac_context);
 
-		bstrategy = GetAccessStrategy(BAS_VACUUM);
+		Assert(params->ring_size >= -1);
+
+		if (params->ring_size == -1)
+		{
+			if (VacuumBufferUsageLimit == -1)
+				bstrategy = GetAccessStrategy(BAS_VACUUM);
+			else
+				bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, VacuumBufferUsageLimit);
+		}
+		else
+			bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, params->ring_size);
+
 		MemoryContextSwitchTo(old_context);
 	}
 
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 563117a8f6..614a35779a 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -87,6 +87,12 @@ typedef struct PVShared
 	 */
 	int			maintenance_work_mem_worker;
 
+	/*
+	 * The number of buffers each worker's Buffer Access Strategy ring should
+	 * contain.
+	 */
+	int			ring_nbuffers;
+
 	/*
 	 * Shared vacuum cost balance.  During parallel vacuum,
 	 * VacuumSharedCostBalance points to this value and it accumulates the
@@ -365,6 +371,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
 		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
 		maintenance_work_mem;
 
+	/* Use the same buffer size for all workers */
+	shared->ring_nbuffers = StrategyGetBufferCount(bstrategy);
+
 	pg_atomic_init_u32(&(shared->cost_balance), 0);
 	pg_atomic_init_u32(&(shared->active_nworkers), 0);
 	pg_atomic_init_u32(&(shared->idx), 0);
@@ -1018,8 +1027,9 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	pvs.indname = NULL;
 	pvs.status = PARALLEL_INDVAC_STATUS_INITIAL;
 
-	/* Each parallel VACUUM worker gets its own access strategy */
-	pvs.bstrategy = GetAccessStrategy(BAS_VACUUM);
+	/* Each parallel VACUUM worker gets its own access strategy. */
+	pvs.bstrategy = GetAccessStrategyWithSize(BAS_VACUUM,
+											  shared->ring_nbuffers * (BLCKSZ / 1024));
 
 	/* Setup error traceback support for ereport() */
 	errcallback.callback = parallel_vacuum_error_callback;
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 585d28148c..9b5ee32cb0 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2290,9 +2290,15 @@ do_autovacuum(void)
 	/*
 	 * Create a buffer access strategy object for VACUUM to use.  We want to
 	 * use the same one across all the vacuum operations we perform, since the
-	 * point is for VACUUM not to blow out the shared cache.
+	 * point is for VACUUM not to blow out the shared cache. If we later enter
+	 * failsafe mode, we will cease use of the BufferAccessStrategy. Either
+	 * way, we clean up the BufferAccessStrategy object at the end of this
+	 * function.
 	 */
-	bstrategy = GetAccessStrategy(BAS_VACUUM);
+	if (VacuumBufferUsageLimit == -1)
+		bstrategy = GetAccessStrategy(BAS_VACUUM);
+	else
+		bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, VacuumBufferUsageLimit);
 
 	/*
 	 * create a memory context to act as fake PortalContext, so that the
@@ -2884,6 +2890,15 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
+
+		/*
+		 * TODO: should this be 0 so that we are sure that vacuum() never
+		 * allocates a new bstrategy for us, even if we pass in NULL for that
+		 * parameter? maybe could change how failsafe NULLs out bstrategy if
+		 * so?
+		 */
+		tab->at_params.ring_size = VacuumBufferUsageLimit;
+
 		tab->at_vacuum_cost_limit = vac_cost_limit;
 		tab->at_vacuum_cost_delay = vac_cost_delay;
 		tab->at_relname = NULL;
diff --git a/src/backend/storage/buffer/README b/src/backend/storage/buffer/README
index a775276ff2..d7c9d99af2 100644
--- a/src/backend/storage/buffer/README
+++ b/src/backend/storage/buffer/README
@@ -236,6 +236,10 @@ buffers were sent to the freelist, which was effectively a buffer ring of 1
 buffer, resulting in excessive WAL flushing.  Allowing VACUUM to update
 256KB between WAL flushes should be more efficient.
 
+In v16, the 256KB ring was made configurable by way of the
+vacuum_buffer_usage_limit GUC and the BUFFER_USAGE_LIMIT option to VACUUM and
+ANALYZE.
+
 Bulk writes work similarly to VACUUM.  Currently this applies only to
 COPY IN and CREATE TABLE AS SELECT.  (Might it be interesting to make
 seqscan UPDATE and DELETE use the bulkwrite strategy?)  For bulk writes
diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c
index f122709fbe..ae777e9cd5 100644
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -20,6 +20,7 @@
 #include "storage/buf_internals.h"
 #include "storage/bufmgr.h"
 #include "storage/proc.h"
+#include "utils/guc_hooks.h"
 
 #define INT_ACCESS_ONCE(var)	((int)(*((volatile int *)&(var))))
 
@@ -531,6 +532,42 @@ StrategyInitialize(bool init)
  * ----------------------------------------------------------------
  */
 
+/*
+ * GUC check function to ensure GUC value specified is within the allowable
+ * range. This should match the checking in ExecVacuum().
+ */
+bool
+check_vacuum_buffer_usage_limit(int *newval, void **extra,
+								GucSource source)
+{
+	int			sb_limit_kb;
+
+	int			blcksz_kb = BLCKSZ / 1024;
+
+	/* Allow specifying the default or disabling Buffer Access Strategy */
+	if (*newval == -1 || *newval == 0)
+		return true;
+
+	/* Make sure ring isn't an undue fraction of shared buffers */
+	sb_limit_kb = StrategyGetClampedBufsize(*newval);
+
+	if (*newval != sb_limit_kb)
+	{
+		GUC_check_errdetail("\"vacuum_buffer_usage_limit\" value cannot exceed %d KB because shared_buffers is %d KB",
+							sb_limit_kb, NBuffers * blcksz_kb);
+		return false;
+	}
+
+	/* Value upper and lower hard limits are inclusive */
+	if (*newval >= MIN_BAS_RING_SIZE_KB && *newval <= MAX_BAS_RING_SIZE_KB)
+		return true;
+
+	/* Value does not fall within any allowable range */
+	GUC_check_errdetail("\"vacuum_buffer_usage_limit\" must be between %d KB and %d KB",
+						MIN_BAS_RING_SIZE_KB, MAX_BAS_RING_SIZE_KB);
+
+	return false;
+}
 
 /*
  * GetAccessStrategy -- create a BufferAccessStrategy object
@@ -540,8 +577,7 @@ StrategyInitialize(bool init)
 BufferAccessStrategy
 GetAccessStrategy(BufferAccessStrategyType btype)
 {
-	BufferAccessStrategy strategy;
-	int			nbuffers;
+	int			ring_size_kb;
 
 	/*
 	 * Select ring size to use.  See buffer/README for rationales.
@@ -556,13 +592,13 @@ GetAccessStrategy(BufferAccessStrategyType btype)
 			return NULL;
 
 		case BAS_BULKREAD:
-			nbuffers = 256 * 1024 / BLCKSZ;
+			ring_size_kb = 256;
 			break;
 		case BAS_BULKWRITE:
-			nbuffers = 16 * 1024 * 1024 / BLCKSZ;
+			ring_size_kb = 16 * 1024;
 			break;
 		case BAS_VACUUM:
-			nbuffers = 256 * 1024 / BLCKSZ;
+			ring_size_kb = 256;
 			break;
 
 		default:
@@ -571,8 +607,45 @@ GetAccessStrategy(BufferAccessStrategyType btype)
 			return NULL;		/* keep compiler quiet */
 	}
 
-	/* Make sure ring isn't an undue fraction of shared buffers */
-	nbuffers = Min(NBuffers / 8, nbuffers);
+
+	return GetAccessStrategyWithSize(btype,
+									 StrategyGetClampedBufsize(ring_size_kb));
+}
+
+
+/*
+ * Helper to convert a size to a number of buffers.
+ */
+static inline int
+bufsize_limit_to_nbuffers(int bufsize_limit_kb)
+{
+	int			blcksz_kb = BLCKSZ / 1024;
+
+	Assert(blcksz_kb > 0);
+
+	return bufsize_limit_kb / blcksz_kb;
+}
+
+/*
+ * GetAccessStrategyWithSize -- create a BufferAccessStrategy object with a
+ * 						number of buffers equivalent to the passed in size
+ */
+BufferAccessStrategy
+GetAccessStrategyWithSize(BufferAccessStrategyType btype, int ring_size_kb)
+{
+	int			nbuffers;
+	BufferAccessStrategy strategy;
+
+	/* Default nbuffers should have resulted in calling GetAccessStrategy() */
+	Assert(ring_size_kb >= 0);
+
+	if (ring_size_kb == 0)
+		return NULL;
+
+	nbuffers = bufsize_limit_to_nbuffers(ring_size_kb);
+
+	Assert(nbuffers <= NBuffers / 8 &&
+		   nbuffers > 0);
 
 	/* Allocate the object and initialize all elements to zeroes */
 	strategy = (BufferAccessStrategy)
@@ -586,6 +659,17 @@ GetAccessStrategy(BufferAccessStrategyType btype)
 	return strategy;
 }
 
+
+/*
+ * StrategyGetBufferCount -- an accessor for the number of buffers in the ring
+ */
+int
+StrategyGetBufferCount(BufferAccessStrategy strategy)
+{
+	return strategy->nbuffers;
+}
+
+
 /*
  * FreeAccessStrategy -- release a BufferAccessStrategy object
  *
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1b1d814254..059a097bef 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -139,7 +139,10 @@ int			max_worker_processes = 8;
 int			max_parallel_workers = 8;
 int			MaxBackends = 0;
 
-int			VacuumCostPageHit = 1;	/* GUC parameters for vacuum */
+/* GUC parameters for vacuum */
+int			VacuumBufferUsageLimit = -1;
+
+int			VacuumCostPageHit = 1;
 int			VacuumCostPageMiss = 2;
 int			VacuumCostPageDirty = 20;
 int			VacuumCostLimit = 200;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8062589efd..75e67cd74c 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2224,6 +2224,17 @@ struct config_int ConfigureNamesInt[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"vacuum_buffer_usage_limit", PGC_USERSET, RESOURCES_MEM,
+			gettext_noop("Sets the buffer pool size for VACUUM, ANALYZE, and autovacuum."),
+			NULL,
+			GUC_UNIT_KB
+		},
+		&VacuumBufferUsageLimit,
+		-1, -1, MAX_BAS_RING_SIZE_KB,
+		check_vacuum_buffer_usage_limit, NULL, NULL
+	},
+
 	{
 		{"shared_memory_size", PGC_INTERNAL, PRESET_OPTIONS,
 			gettext_noop("Shows the size of the server's main shared memory area (rounded up to the nearest MB)."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index ee49ca3937..b962f500dc 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -157,6 +157,10 @@
 					#   mmap
 					# (change requires restart)
 #min_dynamic_shared_memory = 0MB	# (change requires restart)
+#vacuum_buffer_usage_limit = -1 # size of vacuum and analyze buffer access strategy ring.
+				# -1 to use default,
+				# 0 to disable vacuum buffer access strategy
+				# > 0 to specify size
 
 # - Disk -
 
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e38a49e8bd..6614fd2e62 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2662,7 +2662,7 @@ psql_completion(const char *text, int start, int end)
 		 * one word, so the above test is correct.
 		 */
 		if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
-			COMPLETE_WITH("VERBOSE", "SKIP_LOCKED");
+			COMPLETE_WITH("VERBOSE", "SKIP_LOCKED", "BUFFER_USAGE_LIMIT");
 		else if (TailMatches("VERBOSE|SKIP_LOCKED"))
 			COMPLETE_WITH("ON", "OFF");
 	}
@@ -4620,7 +4620,7 @@ psql_completion(const char *text, int start, int end)
 						  "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED",
 						  "INDEX_CLEANUP", "PROCESS_MAIN", "PROCESS_TOAST",
 						  "TRUNCATE", "PARALLEL", "SKIP_DATABASE_STATS",
-						  "ONLY_DATABASE_STATS");
+						  "ONLY_DATABASE_STATS", "BUFFER_USAGE_LIMIT");
 		else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|PROCESS_MAIN|PROCESS_TOAST|TRUNCATE|SKIP_DATABASE_STATS|ONLY_DATABASE_STATS"))
 			COMPLETE_WITH("ON", "OFF");
 		else if (TailMatches("INDEX_CLEANUP"))
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index bdfd96cfec..2ee5147e97 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -236,6 +236,12 @@ typedef struct VacuumParams
 	 * disabled.
 	 */
 	int			nworkers;
+
+	/*
+	 * The size in KB of the Buffer Access Strategy ring to be used for VACUUM
+	 * and ANALYZE.
+	 */
+	int			ring_size;
 } VacuumParams;
 
 /*
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 06a86f9ac1..03f4786fb2 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -263,6 +263,8 @@ extern PGDLLIMPORT double hash_mem_multiplier;
 extern PGDLLIMPORT int maintenance_work_mem;
 extern PGDLLIMPORT int max_parallel_maintenance_workers;
 
+extern PGDLLIMPORT int VacuumBufferUsageLimit;
+
 extern PGDLLIMPORT int VacuumCostPageHit;
 extern PGDLLIMPORT int VacuumCostPageMiss;
 extern PGDLLIMPORT int VacuumCostPageDirty;
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 73762cb1ec..f3a12b9277 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -101,6 +101,14 @@ extern PGDLLIMPORT int32 *LocalRefCount;
 /* upper limit for effective_io_concurrency */
 #define MAX_IO_CONCURRENCY 1000
 
+/*
+ * Upper and lower hard limits for the Buffer Access Strategy ring size
+ * specified by the vacuum_buffer_usage_limit GUC and BUFFER_USAGE_LIMIT option
+ * to VACUUM and ANALYZE.
+ */
+#define MAX_BAS_RING_SIZE_KB (16 * 1024 * 1024)
+#define MIN_BAS_RING_SIZE_KB 128
+
 /* special block number for ReadBuffer() */
 #define P_NEW	InvalidBlockNumber	/* grow the file to get a new page */
 
@@ -194,7 +202,13 @@ extern Size BufferShmemSize(void);
 extern void AtProcExit_LocalBuffers(void);
 
 /* in freelist.c */
+
 extern BufferAccessStrategy GetAccessStrategy(BufferAccessStrategyType btype);
+extern BufferAccessStrategy GetAccessStrategyWithSize(
+													  BufferAccessStrategyType btype,
+													  int ring_size_kb);
+extern int	StrategyGetBufferCount(BufferAccessStrategy strategy);
+
 extern void FreeAccessStrategy(BufferAccessStrategy strategy);
 
 
@@ -286,6 +300,27 @@ BufferGetPage(Buffer buffer)
 	return (Page) BufferGetBlock(buffer);
 }
 
+/*
+ * StrategyGetClampedBufsize
+ * 		Returns a clamped buffer size in KB
+ *
+ * Buffer Access Strategies should not consume more than the 1/8 of shared
+ * buffers. Given a size in KB, this helper calculates an acceptable clamped
+ * value, given the current size of shared buffers.
+ */
+static inline int
+StrategyGetClampedBufsize(int bufsize_kb)
+{
+	int			sb_limit_kb;
+	int			blcksz_kb = BLCKSZ / 1024;
+
+	Assert(blcksz_kb > 0);
+
+	sb_limit_kb = NBuffers / 8 * blcksz_kb;
+
+	return Min(sb_limit_kb, bufsize_kb);
+}
+
 /*
  * Check whether the given snapshot is too old to have safely read the given
  * page from the given table.  If so, throw a "snapshot too old" error.
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index aeb3663071..1893292634 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -33,6 +33,9 @@ extern bool check_autovacuum_max_workers(int *newval, void **extra,
 										 GucSource source);
 extern bool check_autovacuum_work_mem(int *newval, void **extra,
 									  GucSource source);
+
+extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
+											GucSource source);
 extern bool check_backtrace_functions(char **newval, void **extra,
 									  GucSource source);
 extern void assign_backtrace_functions(const char *newval, void *extra);
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index e5a312182e..fe675c5a2e 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -350,6 +350,10 @@ SELECT t.relfilenode = :toast_filenode AS is_same_toast_filenode
  f
 (1 row)
 
+-- BUFFER_USAGE_LIMIT integer overflow error
+VACUUM (BUFFER_USAGE_LIMIT 10000000000) vac_option_tab;
+ERROR:  value: "10000000000": is invalid for buffer_usage_limit
+HINT:  Value exceeds integer range.
 -- SKIP_DATABASE_STATS option
 VACUUM (SKIP_DATABASE_STATS) vactst;
 -- ONLY_DATABASE_STATS option
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index a1fad43657..fc068b7705 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -272,6 +272,9 @@ SELECT t.relfilenode = :toast_filenode AS is_same_toast_filenode
   FROM pg_class c, pg_class t
   WHERE c.reltoastrelid = t.oid AND c.relname = 'vac_option_tab';
 
+-- BUFFER_USAGE_LIMIT integer overflow error
+VACUUM (BUFFER_USAGE_LIMIT 10000000000) vac_option_tab;
+
 -- SKIP_DATABASE_STATS option
 VACUUM (SKIP_DATABASE_STATS) vactst;
 
-- 
2.37.2

Reply via email to