Hi,

On Sat, Nov 1, 2025 at 3:03 AM Masahiko Sawada <[email protected]> wrote:
>
> On Tue, Oct 28, 2025 at 6:10 AM Daniil Davydov <[email protected]> wrote:
> >
> > I'll allow it for a/v leader. I've also thought about 
> > "compute_parallel_delay".
> > The simplest solution that I see is to move cost-based delay parameters to
> > shared state (PVShared) and create some variables such a
> > VacuumSharedCostBalance, so we can use them inside vacuum_delay_point.
> > What do you think about this idea?
>
> I think that we need to somehow have parallel workers use the new
> vacuum delay parameters (e.g., VacuumCostPageHit and
> VacuumCostPageMiss) after the leader reloads the configuration file.
> The leader shares the initial parameters with the parallel workers
> (via DSM) before starting the workers but doesn't propagate the
> updates during the parallel operations. And the worker doesn't reload
> the configuration file.

I'm still working on it.
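
For the record, the direction I'm experimenting with looks roughly like this
(just a sketch - the PVShared fields below are hypothetical and not part of
v15): the leader republishes the reloaded cost-based delay parameters into
PVShared, and each worker re-reads them from vacuum_delay_point():

```
/* Hypothetical sketch - these PVShared fields don't exist in v15. */
static void
parallel_vacuum_refresh_delay_params(ParallelVacuumState *pvs)
{
	/* Worker side: adopt the latest values published by the leader. */
	VacuumCostDelay = pvs->shared->cost_delay;
	VacuumCostLimit = pvs->shared->cost_limit;
	VacuumCostPageHit = pvs->shared->cost_page_hit;
	VacuumCostPageMiss = pvs->shared->cost_page_miss;
}
```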

> Here are some review comments for 0001 patch:
>
> +static void
> +autovacuum_worker_before_shmem_exit(int code, Datum arg)
> +{
> +        if (code != 0)
> +                AutoVacuumReleaseAllParallelWorkers();
> +}
> +
>
> AutoVacuumReleaseAllParallelWorkers() calls
> AutoVacuumReleaseParallelWorkers() only when av_nworkers_reserved > 0,
> so I think we don't need the condition 'if (code != 0)' here.

Yeah, I wrote it more like a hint for the reader - "we should call this
function only if the process is exiting due to an error". But actually the
condition is not necessary.
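
So the callback can simply become something like this:

```
static void
autovacuum_worker_before_shmem_exit(int code, Datum arg)
{
	/*
	 * AutoVacuumReleaseAllParallelWorkers() already checks
	 * av_nworkers_reserved > 0, so this is a no-op on a clean exit.
	 */
	AutoVacuumReleaseAllParallelWorkers();
}
```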

>
> ---
>  +extern void AutoVacuumReleaseAllParallelWorkers(void);
>
> There is no caller of this function outside of autovacuum.h.
>

I will fix it.

> ---
> { name => 'autovacuum_max_parallel_workers', type => 'int', context =>
> 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
>   short_desc => 'Maximum number of parallel autovacuum workers, that
> can be taken from bgworkers pool.',
>   long_desc => 'This parameter is capped by "max_worker_processes"
> (not by "autovacuum_max_workers"!).',
>   variable => 'autovacuum_max_parallel_workers',
>   boot_val => '0',
>   min => '0',
>   max => 'MAX_BACKENDS',
> },
>
> Parallel vacuum in autovacuum can be used only when users set the
> autovacuum_parallel_workers storage parameter. How about using the
> default value 2 for autovacuum_max_parallel_workers GUC parameter?
>

Sounds reasonable, +1 for it.
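
I've changed the boot value to 2 in v15 (see the guc_parameters.dat hunk in
the 0001 patch).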


On Fri, Nov 21, 2025 at 2:31 AM Sami Imseih <[email protected]> wrote:
>
> Hi,
>
> I started to review this patch set again, and it needed rebasing, so I
> went ahead and did that.

Thanks for the review and rebasing the patch!

>
> I also have some comments:
>
> #1
> In AutoVacuumReserveParallelWorkers()
> I think here we should assert:
>
> ```
>     Assert(nworkers <= AutoVacuumShmem->av_freeParallelWorkers);
> ```
> prior to:
> ```
> +       AutoVacuumShmem->av_freeParallelWorkers -= nworkers;
> ```
>
> We are capping nworkers earlier in parallel_vacuum_compute_workers()
>
> ```
>   /* Cap by GUC variable */
>   parallel_workers = Min(parallel_workers, max_workers);
> ```
>
> so the assert will safe-guard against someone making a faulty change
> in parallel_vacuum_compute_workers()
>

Hm, I guess it is just a bug. We should reduce av_freeParallelWorkers by the
computed 'nreserved' value (thus, we don't need any assertion). I'll fix it.
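
For reference, AutoVacuumReserveParallelWorkers() in v15 now does:

```
	/* Provide as many workers as we can. */
	nreserved = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
	AutoVacuumShmem->av_freeParallelWorkers -= nreserved;

	/* Remember how many workers we have reserved. */
	av_nworkers_reserved += nreserved;
```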

>
> #2
> In
> parallel_vacuum_process_all_indexes()
>
> ```
> +       /*
> +        * Reserve workers in autovacuum global state. Note, that we
> may be given
> +        * fewer workers than we requested.
> +        */
> +       if (AmAutoVacuumWorkerProcess() && nworkers > 0)
> +               nworkers = AutoVacuumReserveParallelWorkers(nworkers);
> ```
>
> nworkers has a double meaning. The return value of
> AutoVacuumReserveParallelWorkers
> is nreserved. I think this should be
>
> ```
> nreserved = AutoVacuumReserveParallelWorkers(nworkers);
> ```
>
> and nreserved becomes the authoritative value for the number of parallel
> workers after that point.

Reserving parallel workers is specific to autovacuum. If we add an
'nreserved' variable, we would have to change all the conditions below in
order not to break maintenance parallel vacuum. I think it would be confusing:
***
if (nworkers > 0 || (AmAutoVacuumWorkerProcess() && nreserved > 0))
***

Moreover, 'nworkers' reflects how many workers will be involved in vacuuming,
and I think that capping it by 'nreserved' does not break that semantic.
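
That is, the call site in v15 keeps 'nworkers' as the single counter:

```
	/*
	 * Reserve workers in the autovacuum global state. Note that we may be
	 * given fewer workers than we requested.
	 */
	if (AmAutoVacuumWorkerProcess() && nworkers > 0)
		nworkers = AutoVacuumReserveParallelWorkers(nworkers);
```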

>
> #3
> I noticed in the logging:
>
> ```
> 2025-11-20 18:44:09.252 UTC [36787] LOG:  automatic vacuum of table
> "test.public.t": index scans: 0
>         workers usage statistics for all of index scans : launched in
> total = 3, planned in total = 3
>         pages: 0 removed, 503306 remain, 14442 scanned (2.87% of
> total), 0 eagerly scanned
>         tuples: 101622 removed, 7557074 remain, 0 are dead but not yet 
> removable
>         removable cutoff: 1711, which was 1 XIDs old when operation ended
>         frozen: 4793 pages from table (0.95% of total) had 98303 tuples frozen
>         visibility map: 4822 pages set all-visible, 4745 pages set
> all-frozen (0 were all-visible)
>         index scan bypassed: 8884 pages from table (1.77% of total)
> have 195512 dead item identifiers
> ```
>
> that even though index scan was bypased, we still launched parallel
> workers. I didn't dig deep into this,
> but that looks wrong. what do you think?
>

We can do both index vacuuming and index cleanup in parallel. I guess that
in your case index vacuuming was bypassed, but cleanup was still called.
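
Even when index vacuuming is bypassed, lazy_cleanup_all_indexes() still
outsources the cleanup pass to the parallel variant:

```
		/* Outsource everything to parallel variant */
		parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
											vacrel->num_index_scans,
											estimated_count,
											vacrel->workers_usage);
```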

> #4
> instead of:
>
> "workers usage statistics for all of index scans : launched in total =
> 0, planned in total = 0"
>
> how about:
>
> "parallel index scan : workers planned = 0, workers launched = 0"
>
> also log this after the "index scan needed:" line; so it looks like
> this. What do you think>
>
> ```
>   index scan needed: 13211 pages from table (2.63% of total) had
> 289482 dead item identifiers removed
>   parallel index scan : workers planned = 0, workers launched = 0
>   index "t_pkey": pages: 25234 in total, 0 newly deleted, 0 currently
> deleted, 0 reusable
>   index "t_c1_idx": pages: 10219 in total, 0 newly deleted, 0
> currently deleted, 0 reusable
> ```

Agree, it looks better.


Thanks everybody for the comments!
Please see the v15 patches.

--
Best regards,
Daniil Davydov
From a867a0ffb18549b493412d6bc079df6aef9b92a4 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 02:32:44 +0700
Subject: [PATCH v15 4/4] Documentation for parallel autovacuum

---
 doc/src/sgml/config.sgml           | 18 ++++++++++++++++++
 doc/src/sgml/maintenance.sgml      | 12 ++++++++++++
 doc/src/sgml/ref/create_table.sgml | 20 ++++++++++++++++++++
 3 files changed, 50 insertions(+)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 023b3f03ba9..0f7096c2b5f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2841,6 +2841,7 @@ include_dir 'conf.d'
         <para>
          When changing this value, consider also adjusting
          <xref linkend="guc-max-parallel-workers"/>,
+         <xref linkend="guc-autovacuum-max-parallel-workers"/>,
          <xref linkend="guc-max-parallel-maintenance-workers"/>, and
          <xref linkend="guc-max-parallel-workers-per-gather"/>.
         </para>
@@ -9264,6 +9265,23 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
        </listitem>
       </varlistentry>
 
+      <varlistentry id="guc-autovacuum-max-parallel-workers" xreflabel="autovacuum_max_parallel_workers">
+        <term><varname>autovacuum_max_parallel_workers</varname> (<type>integer</type>)
+        <indexterm>
+         <primary><varname>autovacuum_max_parallel_workers</varname></primary>
+         <secondary>configuration parameter</secondary>
+        </indexterm>
+        </term>
+        <listitem>
+         <para>
+          Sets the maximum number of parallel autovacuum workers that
+          can be used for parallel index vacuuming at one time. This value
+          is capped by <xref linkend="guc-max-worker-processes"/>. The
+          default is 2; 0 disables parallel index vacuuming in autovacuum.
+         </para>
+        </listitem>
+     </varlistentry>
+
      </variablelist>
     </sect2>
 
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index f4f0433ef6f..02f306bbb8a 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -897,6 +897,18 @@ HINT:  Execute a database-wide VACUUM in that database.
     autovacuum workers' activity.
    </para>
 
+   <para>
+    If an autovacuum worker process comes across a table with the
+    <xref linkend="reloption-autovacuum-parallel-workers"/> storage parameter
+    enabled, it will launch parallel workers to vacuum the indexes of this
+    table in parallel. Parallel workers are taken from the pool of processes
+    established by <xref linkend="guc-max-worker-processes"/>, limited by
+    <xref linkend="guc-max-parallel-workers"/>.
+    The total number of parallel autovacuum workers that can be active at one
+    time is limited by the <xref linkend="guc-autovacuum-max-parallel-workers"/>
+    configuration parameter.
+   </para>
+
    <para>
     If several large tables all become eligible for vacuuming in a short
     amount of time, all autovacuum workers might become occupied with
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 6557c5cffd8..e95a6488c5e 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1717,6 +1717,26 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
     </listitem>
    </varlistentry>
 
+  <varlistentry id="reloption-autovacuum-parallel-workers" xreflabel="autovacuum_parallel_workers">
+    <term><literal>autovacuum_parallel_workers</literal> (<type>integer</type>)
+    <indexterm>
+     <primary><varname>autovacuum_parallel_workers</varname> storage parameter</primary>
+    </indexterm>
+    </term>
+    <listitem>
+     <para>
+      Sets the maximum number of parallel autovacuum workers that can process
+      indexes of this table.
+      The default value is -1, which means no parallel index vacuuming for
+      this table. If the value is 0, the parallel degree is computed based
+      on the number of indexes.
+      Note that the computed number of workers may not actually be available at
+      run time. If this occurs, autovacuum will run with fewer workers
+      than expected.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry id="reloption-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold">
     <term><literal>autovacuum_vacuum_threshold</literal>, <literal>toast.autovacuum_vacuum_threshold</literal> (<type>integer</type>)
     <indexterm>
-- 
2.43.0

From d10e3e0edd1f17ceabe8b12f780827ae0c9b686d Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:07:47 +0700
Subject: [PATCH v15 2/4] Logging for parallel autovacuum

---
 src/backend/access/heap/vacuumlazy.c  | 27 +++++++++++++++++++++++++--
 src/backend/commands/vacuumparallel.c | 20 ++++++++++++++------
 src/include/commands/vacuum.h         | 16 ++++++++++++++--
 src/tools/pgindent/typedefs.list      |  1 +
 4 files changed, 54 insertions(+), 10 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 65bb0568a86..ea7a18d4d51 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -347,6 +347,12 @@ typedef struct LVRelState
 
 	/* Instrumentation counters */
 	int			num_index_scans;
+
+	/*
+	 * Number of planned and actually launched parallel workers for all index
+	 * scans, or NULL
+	 */
+	PVWorkersUsage *workers_usage;
 	/* Counters that follow are only for scanned_pages */
 	int64		tuples_deleted; /* # deleted from table */
 	int64		tuples_frozen;	/* # newly frozen */
@@ -700,6 +706,16 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
 		indnames = palloc(sizeof(char *) * vacrel->nindexes);
 		for (int i = 0; i < vacrel->nindexes; i++)
 			indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
+
+		/*
+		 * Allocate space for workers usage statistics, thereby making it
+		 * explicit that such statistics must be accumulated. For now, this
+		 * is used only by the autovacuum leader worker, which must log the
+		 * statistics at the end of table processing.
+		 */
+		vacrel->workers_usage = AmAutoVacuumWorkerProcess() ?
+			(PVWorkersUsage *) palloc0(sizeof(PVWorkersUsage)) :
+			NULL;
 	}
 
 	/*
@@ -1099,6 +1115,11 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
 							 orig_rel_pages == 0 ? 100.0 :
 							 100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
 							 vacrel->lpdead_items);
+			if (vacrel->workers_usage)
+				appendStringInfo(&buf,
+								 _("parallel index vacuum/cleanup : workers planned = %d, workers launched = %d\n"),
+								 vacrel->workers_usage->nplanned,
+								 vacrel->workers_usage->nlaunched);
 			for (int i = 0; i < vacrel->nindexes; i++)
 			{
 				IndexBulkDeleteResult *istat = vacrel->indstats[i];
@@ -2659,7 +2680,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	{
 		/* Outsource everything to parallel variant */
 		parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
-											vacrel->num_index_scans);
+											vacrel->num_index_scans,
+											vacrel->workers_usage);
 
 		/*
 		 * Do a postcheck to consider applying wraparound failsafe now.  Note
@@ -3091,7 +3113,8 @@ lazy_cleanup_all_indexes(LVRelState *vacrel)
 		/* Outsource everything to parallel variant */
 		parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
 											vacrel->num_index_scans,
-											estimated_count);
+											estimated_count,
+											vacrel->workers_usage);
 	}
 
 	/* Reset the progress counters */
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index acd53b85b1c..9a258238650 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -227,7 +227,7 @@ struct ParallelVacuumState
 static int	parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
 											bool *will_parallel_vacuum);
 static void parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
-												bool vacuum);
+												bool vacuum, PVWorkersUsage *wusage);
 static void parallel_vacuum_process_safe_indexes(ParallelVacuumState *pvs);
 static void parallel_vacuum_process_unsafe_indexes(ParallelVacuumState *pvs);
 static void parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
@@ -502,7 +502,7 @@ parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
  */
 void
 parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
-									int num_index_scans)
+									int num_index_scans, PVWorkersUsage *wusage)
 {
 	Assert(!IsParallelWorker());
 
@@ -513,7 +513,7 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
 	pvs->shared->reltuples = num_table_tuples;
 	pvs->shared->estimated_count = true;
 
-	parallel_vacuum_process_all_indexes(pvs, num_index_scans, true);
+	parallel_vacuum_process_all_indexes(pvs, num_index_scans, true, wusage);
 }
 
 /*
@@ -521,7 +521,8 @@ parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tup
  */
 void
 parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples,
-									int num_index_scans, bool estimated_count)
+									int num_index_scans, bool estimated_count,
+									PVWorkersUsage *wusage)
 {
 	Assert(!IsParallelWorker());
 
@@ -533,7 +534,7 @@ parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tup
 	pvs->shared->reltuples = num_table_tuples;
 	pvs->shared->estimated_count = estimated_count;
 
-	parallel_vacuum_process_all_indexes(pvs, num_index_scans, false);
+	parallel_vacuum_process_all_indexes(pvs, num_index_scans, false, wusage);
 }
 
 /*
@@ -618,7 +619,7 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
  */
 static void
 parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
-									bool vacuum)
+									bool vacuum, PVWorkersUsage *wusage)
 {
 	int			nworkers;
 	PVIndVacStatus new_status;
@@ -742,6 +743,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
 									 "launched %d parallel vacuum workers for index cleanup (planned: %d)",
 									 pvs->pcxt->nworkers_launched),
 							pvs->pcxt->nworkers_launched, nworkers)));
+
+		/* Remember these values, if we were asked to. */
+		if (wusage != NULL)
+		{
+			wusage->nlaunched += pvs->pcxt->nworkers_launched;
+			wusage->nplanned += nworkers;
+		}
 	}
 
 	/* Vacuum the indexes that can be processed by only leader process */
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1f3290c7fbf..90709ca3107 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -300,6 +300,16 @@ typedef struct VacDeadItemsInfo
 	int64		num_items;		/* current # of entries */
 } VacDeadItemsInfo;
 
+/*
+ * PVWorkersUsage stores the total number of launched and planned workers
+ * during a parallel vacuum.
+ */
+typedef struct PVWorkersUsage
+{
+	int			nlaunched;
+	int			nplanned;
+} PVWorkersUsage;
+
 /* GUC parameters */
 extern PGDLLIMPORT int default_statistics_target;	/* PGDLLIMPORT for PostGIS */
 extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -394,11 +404,13 @@ extern TidStore *parallel_vacuum_get_dead_items(ParallelVacuumState *pvs,
 extern void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs);
 extern void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs,
 												long num_table_tuples,
-												int num_index_scans);
+												int num_index_scans,
+												PVWorkersUsage *wusage);
 extern void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs,
 												long num_table_tuples,
 												int num_index_scans,
-												bool estimated_count);
+												bool estimated_count,
+												PVWorkersUsage *wusage);
 extern void parallel_vacuum_main(dsm_segment *seg, shm_toc *toc);
 
 /* in commands/analyze.c */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 27a4d131897..a838b0885c6 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2378,6 +2378,7 @@ PullFilterOps
 PushFilter
 PushFilterOps
 PushFunction
+PVWorkersUsage
 PyCFunction
 PyMethodDef
 PyModuleDef
-- 
2.43.0

From 267641b1832f011a32b8f870dd1794d0a82f0a7f Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:08:14 +0700
Subject: [PATCH v15 3/4] Tests for parallel autovacuum

---
 src/backend/commands/vacuumparallel.c         |   8 +
 src/backend/postmaster/autovacuum.c           |  14 +
 src/test/modules/Makefile                     |   1 +
 src/test/modules/meson.build                  |   1 +
 src/test/modules/test_autovacuum/.gitignore   |   2 +
 src/test/modules/test_autovacuum/Makefile     |  26 ++
 src/test/modules/test_autovacuum/meson.build  |  36 +++
 .../modules/test_autovacuum/t/001_basic.pl    | 165 ++++++++++++
 .../test_autovacuum/test_autovacuum--1.0.sql  |  34 +++
 .../modules/test_autovacuum/test_autovacuum.c | 255 ++++++++++++++++++
 .../test_autovacuum/test_autovacuum.control   |   3 +
 11 files changed, 545 insertions(+)
 create mode 100644 src/test/modules/test_autovacuum/.gitignore
 create mode 100644 src/test/modules/test_autovacuum/Makefile
 create mode 100644 src/test/modules/test_autovacuum/meson.build
 create mode 100644 src/test/modules/test_autovacuum/t/001_basic.pl
 create mode 100644 src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
 create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.c
 create mode 100644 src/test/modules/test_autovacuum/test_autovacuum.control

diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 9a258238650..0cfdf79cb6c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -39,6 +39,7 @@
 #include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 
@@ -752,6 +753,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
 		}
 	}
 
+	/*
+	 * To be able to verify that all reserved parallel workers are released
+	 * even on failure, allow injection points to trigger a failure at this
+	 * point.
+	 */
+	INJECTION_POINT("autovacuum-trigger-leader-failure", NULL);
+
 	/* Vacuum the indexes that can be processed by only leader process */
 	parallel_vacuum_process_unsafe_indexes(pvs);
 
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index e6a4aa99eae..37c8d268903 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3436,6 +3436,13 @@ AutoVacuumReserveParallelWorkers(int nworkers)
 	/* Remember how many workers we have reserved. */
 	av_nworkers_reserved += nreserved;
 
+	/*
+	 * Injection point to help examine the number of available parallel
+	 * autovacuum workers.
+	 */
+	INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+					&AutoVacuumShmem->av_freeParallelWorkers);
+
 	LWLockRelease(AutovacuumLock);
 	return nreserved;
 }
@@ -3466,6 +3473,13 @@ AutoVacuumReleaseParallelWorkers(int nworkers)
 	/* Don't have to remember these workers anymore. */
 	av_nworkers_reserved -= nworkers;
 
+	/*
+	 * Injection point to help examine the number of available parallel
+	 * autovacuum workers.
+	 */
+	INJECTION_POINT("autovacuum-set-free-parallel-workers-num",
+					&AutoVacuumShmem->av_freeParallelWorkers);
+
 	LWLockRelease(AutovacuumLock);
 }
 
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 902a7954101..f09d0060248 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -15,6 +15,7 @@ SUBDIRS = \
 		  plsample \
 		  spgist_name_ops \
 		  test_aio \
+		  test_autovacuum \
 		  test_binaryheap \
 		  test_bitmapset \
 		  test_bloomfilter \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 14fc761c4cf..ee7e855def0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -14,6 +14,7 @@ subdir('plsample')
 subdir('spgist_name_ops')
 subdir('ssl_passphrase_callback')
 subdir('test_aio')
+subdir('test_autovacuum')
 subdir('test_binaryheap')
 subdir('test_bitmapset')
 subdir('test_bloomfilter')
diff --git a/src/test/modules/test_autovacuum/.gitignore b/src/test/modules/test_autovacuum/.gitignore
new file mode 100644
index 00000000000..716e17f5a2a
--- /dev/null
+++ b/src/test/modules/test_autovacuum/.gitignore
@@ -0,0 +1,2 @@
+# Generated subdirectories
+/tmp_check/
diff --git a/src/test/modules/test_autovacuum/Makefile b/src/test/modules/test_autovacuum/Makefile
new file mode 100644
index 00000000000..4cf7344b2ac
--- /dev/null
+++ b/src/test/modules/test_autovacuum/Makefile
@@ -0,0 +1,26 @@
+# src/test/modules/test_autovacuum/Makefile
+
+PGFILEDESC = "test_autovacuum - test code for parallel autovacuum"
+
+MODULE_big = test_autovacuum
+OBJS = \
+	$(WIN32RES) \
+	test_autovacuum.o
+
+EXTENSION = test_autovacuum
+DATA = test_autovacuum--1.0.sql
+
+TAP_TESTS = 1
+
+export enable_injection_points
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/test_autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/test_autovacuum/meson.build b/src/test/modules/test_autovacuum/meson.build
new file mode 100644
index 00000000000..3441e5e49cf
--- /dev/null
+++ b/src/test/modules/test_autovacuum/meson.build
@@ -0,0 +1,36 @@
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+test_autovacuum_sources = files(
+  'test_autovacuum.c',
+)
+
+if host_system == 'windows'
+  test_autovacuum_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'test_autovacuum',
+    '--FILEDESC', 'test_autovacuum - test code for parallel autovacuum',])
+endif
+
+test_autovacuum = shared_module('test_autovacuum',
+  test_autovacuum_sources,
+  kwargs: pg_test_mod_args,
+)
+test_install_libs += test_autovacuum
+
+test_install_data += files(
+  'test_autovacuum.control',
+  'test_autovacuum--1.0.sql',
+)
+
+tests += {
+  'name': 'test_autovacuum',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'env': {
+       'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
+    },
+    'tests': [
+      't/001_basic.pl',
+    ],
+  },
+}
diff --git a/src/test/modules/test_autovacuum/t/001_basic.pl b/src/test/modules/test_autovacuum/t/001_basic.pl
new file mode 100644
index 00000000000..1271768ebd2
--- /dev/null
+++ b/src/test/modules/test_autovacuum/t/001_basic.pl
@@ -0,0 +1,165 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+
+# Configure postgres so that it can launch parallel autovacuum workers, logs
+# all the information we are interested in, and runs autovacuum frequently
+$node->append_conf('postgresql.conf', qq{
+	max_worker_processes = 20
+	max_parallel_workers = 20
+	max_parallel_maintenance_workers = 20
+	autovacuum_max_parallel_workers = 10
+	log_min_messages = debug2
+	log_autovacuum_min_duration = 0
+	autovacuum_naptime = '1s'
+	min_parallel_index_scan_size = 0
+	shared_preload_libraries=test_autovacuum
+});
+$node->start;
+
+my $indexes_num = 4;
+my $initial_rows_num = 10_000;
+my $autovacuum_parallel_workers = 2;
+
+# Create table with specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+	CREATE TABLE test_autovac (
+		id SERIAL PRIMARY KEY,
+		col_1 INTEGER,  col_2 INTEGER,  col_3 INTEGER,  col_4 INTEGER
+	) WITH (autovacuum_parallel_workers = $autovacuum_parallel_workers,
+			autovacuum_enabled = false);
+
+	DO \$\$
+	DECLARE
+		i INTEGER;
+	BEGIN
+		FOR i IN 1..$indexes_num LOOP
+			EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+		END LOOP;
+	END \$\$;
+});
+
+# Insert specified tuples num into the table
+$node->safe_psql('postgres', qq{
+	DO \$\$
+	DECLARE
+		i INTEGER;
+	BEGIN
+		FOR i IN 1..$initial_rows_num LOOP
+			INSERT INTO test_autovac VALUES (i, i + 1, i + 2, i + 3);
+		END LOOP;
+	END \$\$;
+});
+
+# Now, create some dead tuples and refresh table statistics
+$node->safe_psql('postgres', qq{
+	UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+	ANALYZE test_autovac;
+});
+
+# Create all functions needed for testing
+$node->safe_psql('postgres', qq{
+	CREATE EXTENSION test_autovacuum;
+	SELECT inj_set_free_workers_attach();
+	SELECT inj_leader_failure_attach();
+});
+
+# Test 1 :
+# Our table has enough indexes and appropriate reloptions, so autovacuum must
+# be able to process it in parallel mode. Just check if it can.
+# Also check whether all requested workers:
+# 	1) launched
+# 	2) correctly released
+
+$node->safe_psql('postgres', qq{
+	ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+# Wait until the parallel autovacuum on the table is completed. At the same
+# time, check that the required number of parallel workers has been launched.
+$node->wait_for_log(qr/parallel index vacuum\/cleanup : workers planned = 2, / .
+					qr/workers launched = 2/);
+
+$node->psql('postgres',
+	"SELECT get_parallel_autovacuum_free_workers();",
+	stdout => \$psql_out,
+);
+is($psql_out, 10, 'All parallel workers have been released by the leader');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+	ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+	UPDATE test_autovac SET col_2 = 0 WHERE (col_2 % 3) = 0;
+	ANALYZE test_autovac;
+});
+
+# Test 2:
+# We want parallel autovacuum workers to be released even if the leader gets
+# an error. First, simulate the situation where the leader exits due to an ERROR.
+
+$node->safe_psql('postgres', qq(
+	SELECT trigger_leader_failure('ERROR');
+));
+
+$node->safe_psql('postgres', qq{
+	ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/error, triggered by injection point/);
+
+$node->psql('postgres',
+	"SELECT get_parallel_autovacuum_free_workers();",
+	stdout => \$psql_out,
+);
+is($psql_out, 10,
+   'All parallel workers have been released by the leader after ERROR');
+
+# Disable autovacuum on table during preparation for the next test
+$node->safe_psql('postgres', qq{
+	ALTER TABLE test_autovac SET (autovacuum_enabled = false);
+});
+
+# Create more dead tuples
+$node->safe_psql('postgres', qq{
+	UPDATE test_autovac SET col_3 = 0 WHERE (col_3 % 3) = 0;
+	ANALYZE test_autovac;
+});
+
+# Test 3:
+# Same as Test 2, but simulate the situation where the leader exits due to FATAL.
+
+$node->safe_psql('postgres', qq(
+	SELECT trigger_leader_failure('FATAL');
+));
+
+$node->safe_psql('postgres', qq{
+	ALTER TABLE test_autovac SET (autovacuum_enabled = true);
+});
+
+$node->wait_for_log(qr/fatal, triggered by injection point/);
+
+$node->psql('postgres',
+	"SELECT get_parallel_autovacuum_free_workers();",
+	stdout => \$psql_out,
+);
+is($psql_out, 10,
+   'All parallel workers have been released by the leader after FATAL');
+
+# Cleanup
+$node->safe_psql('postgres', qq{
+	SELECT inj_set_free_workers_detach();
+	SELECT inj_leader_failure_detach();
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
new file mode 100644
index 00000000000..017d5da85ea
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum--1.0.sql
@@ -0,0 +1,34 @@
+/* src/test/modules/test_autovacuum/test_autovacuum--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION test_autovacuum" to load this file. \quit
+
+/*
+ * Functions for inspecting or interfering with autovacuum state
+ */
+CREATE FUNCTION get_parallel_autovacuum_free_workers()
+RETURNS INTEGER STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION trigger_leader_failure(failure_type text)
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+/*
+ * Injection point related functions
+ */
+CREATE FUNCTION inj_set_free_workers_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_set_free_workers_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_attach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
+
+CREATE FUNCTION inj_leader_failure_detach()
+RETURNS VOID STRICT
+AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.c b/src/test/modules/test_autovacuum/test_autovacuum.c
new file mode 100644
index 00000000000..7948f4858ae
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.c
@@ -0,0 +1,255 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_autovacuum.c
+ *		Helpers to write tests for parallel autovacuum
+ *
+ * Copyright (c) 2020-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/test/modules/test_autovacuum/test_autovacuum.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "postmaster/autovacuum.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "storage/lwlock.h"
+#include "utils/builtins.h"
+#include "utils/injection_point.h"
+
+PG_MODULE_MAGIC;
+
+typedef enum AVLeaderFailureType
+{
+	FAIL_NONE,
+	FAIL_ERROR,
+	FAIL_FATAL,
+}			AVLeaderFailureType;
+
+typedef struct InjPointState
+{
+	bool		enabled_set_free_workers;
+	uint32		free_parallel_workers;
+
+	bool		enabled_leader_failure;
+	AVLeaderFailureType ftype;
+}			InjPointState;
+
+static InjPointState * inj_point_state;
+
+/* Shared memory init callbacks */
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+
+static void
+test_autovacuum_shmem_request(void)
+{
+	if (prev_shmem_request_hook)
+		prev_shmem_request_hook();
+
+	RequestAddinShmemSpace(sizeof(InjPointState));
+}
+
+static void
+test_autovacuum_shmem_startup(void)
+{
+	bool		found;
+
+	if (prev_shmem_startup_hook)
+		prev_shmem_startup_hook();
+
+	/* Create or attach to the shared memory state */
+	LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+	inj_point_state = ShmemInitStruct("injection_points",
+									  sizeof(InjPointState),
+									  &found);
+
+	if (!found)
+	{
+		/* First time through, initialize */
+		inj_point_state->enabled_leader_failure = false;
+		inj_point_state->enabled_set_free_workers = false;
+		inj_point_state->ftype = FAIL_NONE;
+
+		/* Keep it in sync with AutoVacuumShmemInit */
+		inj_point_state->free_parallel_workers =
+			Min(autovacuum_max_parallel_workers, max_worker_processes);
+
+		InjectionPointAttach("autovacuum-set-free-parallel-workers-num",
+							 "test_autovacuum",
+							 "inj_set_free_workers",
+							 NULL,
+							 0);
+
+		InjectionPointAttach("autovacuum-trigger-leader-failure",
+							 "test_autovacuum",
+							 "inj_trigger_leader_failure",
+							 NULL,
+							 0);
+	}
+
+	LWLockRelease(AddinShmemInitLock);
+}
+
+void
+_PG_init(void)
+{
+	if (!process_shared_preload_libraries_in_progress)
+		return;
+
+	prev_shmem_request_hook = shmem_request_hook;
+	shmem_request_hook = test_autovacuum_shmem_request;
+	prev_shmem_startup_hook = shmem_startup_hook;
+	shmem_startup_hook = test_autovacuum_shmem_startup;
+}
+
+extern PGDLLEXPORT void inj_set_free_workers(const char *name,
+											 const void *private_data,
+											 void *arg);
+extern PGDLLEXPORT void inj_trigger_leader_failure(const char *name,
+												   const void *private_data,
+												   void *arg);
+
+/*
+ * Record the number of currently available parallel a/v workers. This value
+ * may change after reserving or releasing such workers.
+ *
+ * This callback is called from the parallel autovacuum leader.
+ */
+void
+inj_set_free_workers(const char *name, const void *private_data, void *arg)
+{
+	ereport(LOG,
+			errmsg("set parallel workers injection point called"),
+			errhidestmt(true), errhidecontext(true));
+
+	if (inj_point_state->enabled_set_free_workers)
+	{
+		Assert(arg != NULL);
+		inj_point_state->free_parallel_workers = *(uint32 *) arg;
+	}
+}
+
+/*
+ * Throw an ERROR or FATAL if somebody requested it.
+ *
+ * This callback is called from the parallel autovacuum leader.
+ */
+void
+inj_trigger_leader_failure(const char *name, const void *private_data,
+						   void *arg)
+{
+	int			elevel;
+	char	   *elevel_str;
+
+	ereport(LOG,
+			errmsg("trigger leader failure injection point called"),
+			errhidestmt(true), errhidecontext(true));
+
+	if (inj_point_state->ftype == FAIL_NONE ||
+		!inj_point_state->enabled_leader_failure)
+	{
+		return;
+	}
+
+	elevel = inj_point_state->ftype == FAIL_ERROR ? ERROR : FATAL;
+	elevel_str = elevel == ERROR ? "error" : "fatal";
+
+	ereport(elevel, errmsg("%s, triggered by injection point", elevel_str));
+}
+
+PG_FUNCTION_INFO_V1(get_parallel_autovacuum_free_workers);
+Datum
+get_parallel_autovacuum_free_workers(PG_FUNCTION_ARGS)
+{
+	uint32		nworkers;
+
+#ifndef USE_INJECTION_POINTS
+	elog(ERROR, "injection points not supported");
+#endif
+
+	LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+	nworkers = inj_point_state->free_parallel_workers;
+	LWLockRelease(AutovacuumLock);
+
+	PG_RETURN_UINT32(nworkers);
+}
+
+PG_FUNCTION_INFO_V1(trigger_leader_failure);
+Datum
+trigger_leader_failure(PG_FUNCTION_ARGS)
+{
+	const char *failure_type = text_to_cstring(PG_GETARG_TEXT_PP(0));
+
+#ifndef USE_INJECTION_POINTS
+	elog(ERROR, "injection points not supported");
+#endif
+
+	if (strcmp(failure_type, "NONE") == 0)
+		inj_point_state->ftype = FAIL_NONE;
+	else if (strcmp(failure_type, "ERROR") == 0)
+		inj_point_state->ftype = FAIL_ERROR;
+	else if (strcmp(failure_type, "FATAL") == 0)
+		inj_point_state->ftype = FAIL_FATAL;
+	else
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("invalid leader failure type : %s", failure_type)));
+
+	PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_attach);
+Datum
+inj_set_free_workers_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	inj_point_state->enabled_set_free_workers = true;
+	inj_point_state->ftype = FAIL_NONE;
+#else
+	elog(ERROR, "injection points not supported");
+#endif
+	PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_set_free_workers_detach);
+Datum
+inj_set_free_workers_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	inj_point_state->enabled_set_free_workers = false;
+#else
+	elog(ERROR, "injection points not supported");
+#endif
+	PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_attach);
+Datum
+inj_leader_failure_attach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	inj_point_state->enabled_leader_failure = true;
+#else
+	elog(ERROR, "injection points not supported");
+#endif
+	PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(inj_leader_failure_detach);
+Datum
+inj_leader_failure_detach(PG_FUNCTION_ARGS)
+{
+#ifdef USE_INJECTION_POINTS
+	inj_point_state->enabled_leader_failure = false;
+#else
+	elog(ERROR, "injection points not supported");
+#endif
+	PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_autovacuum/test_autovacuum.control b/src/test/modules/test_autovacuum/test_autovacuum.control
new file mode 100644
index 00000000000..1b7fad258f0
--- /dev/null
+++ b/src/test/modules/test_autovacuum/test_autovacuum.control
@@ -0,0 +1,3 @@
+comment = 'Test code for parallel autovacuum'
+default_version = '1.0'
+module_pathname = '$libdir/test_autovacuum'
-- 
2.43.0

From 6c6806211a364519150138be6aff9f749e708252 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <[email protected]>
Date: Sun, 23 Nov 2025 01:03:24 +0700
Subject: [PATCH v15 1/4] Parallel autovacuum

---
 src/backend/access/common/reloptions.c        |  11 ++
 src/backend/commands/vacuumparallel.c         |  42 ++++-
 src/backend/postmaster/autovacuum.c           | 164 +++++++++++++++++-
 src/backend/utils/init/globals.c              |   1 +
 src/backend/utils/misc/guc.c                  |   8 +-
 src/backend/utils/misc/guc_parameters.dat     |   9 +
 src/backend/utils/misc/postgresql.conf.sample |   2 +
 src/bin/psql/tab-complete.in.c                |   1 +
 src/include/miscadmin.h                       |   1 +
 src/include/postmaster/autovacuum.h           |   4 +
 src/include/utils/rel.h                       |   7 +
 11 files changed, 240 insertions(+), 10 deletions(-)

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 9e288dfecbf..3cc29d4454a 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -222,6 +222,15 @@ static relopt_int intRelOpts[] =
 		},
 		SPGIST_DEFAULT_FILLFACTOR, SPGIST_MIN_FILLFACTOR, 100
 	},
+	{
+		{
+			"autovacuum_parallel_workers",
+			"Maximum number of parallel autovacuum workers that can be used for processing this table.",
+			RELOPT_KIND_HEAP,
+			ShareUpdateExclusiveLock
+		},
+		-1, -1, 1024
+	},
 	{
 		{
 			"autovacuum_vacuum_threshold",
@@ -1881,6 +1890,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
 		{"autovacuum_enabled", RELOPT_TYPE_BOOL,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+		{"autovacuum_parallel_workers", RELOPT_TYPE_INT,
+		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, autovacuum_parallel_workers)},
 		{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
 		{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 0feea1d30ec..acd53b85b1c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,7 +1,9 @@
 /*-------------------------------------------------------------------------
  *
  * vacuumparallel.c
- *	  Support routines for parallel vacuum execution.
+ *	  Support routines for parallel vacuum and autovacuum execution. In the
+ *	  comments below, the word "vacuum" will refer to both vacuum and
+ *	  autovacuum.
  *
  * This file contains routines that are intended to support setting up, using,
  * and tearing down a ParallelVacuumState.
@@ -34,6 +36,7 @@
 #include "executor/instrument.h"
 #include "optimizer/paths.h"
 #include "pgstat.h"
+#include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
@@ -373,8 +376,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
 	shared->queryid = pgstat_get_my_query_id();
 	shared->maintenance_work_mem_worker =
 		(nindexes_mwm > 0) ?
-		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
-		maintenance_work_mem;
+		vac_work_mem / Min(parallel_workers, nindexes_mwm) :
+		vac_work_mem;
+
 	shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
 
 	/* Prepare DSA space for dead items */
@@ -553,12 +557,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
 	int			nindexes_parallel_bulkdel = 0;
 	int			nindexes_parallel_cleanup = 0;
 	int			parallel_workers;
+	int			max_workers;
+
+	max_workers = AmAutoVacuumWorkerProcess() ?
+		autovacuum_max_parallel_workers :
+		max_parallel_maintenance_workers;
 
 	/*
 	 * We don't allow performing parallel operation in standalone backend or
 	 * when parallelism is disabled.
 	 */
-	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+	if (!IsUnderPostmaster || max_workers == 0)
 		return 0;
 
 	/*
@@ -597,8 +606,8 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
 	parallel_workers = (nrequested > 0) ?
 		Min(nrequested, nindexes_parallel) : nindexes_parallel;
 
-	/* Cap by max_parallel_maintenance_workers */
-	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+	/* Cap by GUC variable */
+	parallel_workers = Min(parallel_workers, max_workers);
 
 	return parallel_workers;
 }
@@ -646,6 +655,13 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
 	 */
 	nworkers = Min(nworkers, pvs->pcxt->nworkers);
 
+	/*
+	 * Reserve workers in the autovacuum global state. Note that we may be
+	 * given fewer workers than we requested.
+	 */
+	if (AmAutoVacuumWorkerProcess() && nworkers > 0)
+		nworkers = AutoVacuumReserveParallelWorkers(nworkers);
+
 	/*
 	 * Set index vacuum status and mark whether parallel vacuum worker can
 	 * process it.
@@ -690,6 +706,16 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
 
 		LaunchParallelWorkers(pvs->pcxt);
 
+		if (AmAutoVacuumWorkerProcess() &&
+			pvs->pcxt->nworkers_launched < nworkers)
+		{
+			/*
+			 * Tell autovacuum that we could not launch all the previously
+			 * reserved workers.
+			 */
+			AutoVacuumReleaseParallelWorkers(nworkers - pvs->pcxt->nworkers_launched);
+		}
+
 		if (pvs->pcxt->nworkers_launched > 0)
 		{
 			/*
@@ -738,6 +764,10 @@ parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scan
 
 		for (int i = 0; i < pvs->pcxt->nworkers_launched; i++)
 			InstrAccumParallelQuery(&pvs->buffer_usage[i], &pvs->wal_usage[i]);
+
+		/* Also release all previously reserved parallel autovacuum workers */
+		if (AmAutoVacuumWorkerProcess() && pvs->pcxt->nworkers_launched > 0)
+			AutoVacuumReleaseParallelWorkers(pvs->pcxt->nworkers_launched);
 	}
 
 	/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 1c38488f2cb..e6a4aa99eae 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -151,6 +151,12 @@ int			Log_autoanalyze_min_duration = 600000;
 static double av_storage_param_cost_delay = -1;
 static int	av_storage_param_cost_limit = -1;
 
+/*
+ * Variable to keep the number of currently reserved parallel autovacuum
+ * workers. It is only relevant for the parallel autovacuum leader process.
+ */
+static int	av_nworkers_reserved = 0;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -285,6 +291,8 @@ typedef struct AutoVacuumWorkItem
  * av_workItems		work item array
  * av_nworkersForBalance the number of autovacuum workers to use when
  * 					calculating the per worker cost limit
+ * av_freeParallelWorkers the number of free parallel autovacuum workers
+ * av_maxParallelWorkers the maximum number of parallel autovacuum workers
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -299,6 +307,8 @@ typedef struct
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
 	pg_atomic_uint32 av_nworkersForBalance;
+	uint32		av_freeParallelWorkers;
+	uint32		av_maxParallelWorkers;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -364,6 +374,8 @@ static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
 static void avl_sigusr2_handler(SIGNAL_ARGS);
 static bool av_worker_available(void);
 static void check_av_worker_gucs(void);
+static void adjust_free_parallel_workers(int prev_max_parallel_workers);
+static void AutoVacuumReleaseAllParallelWorkers(void);
 
 
 
@@ -763,6 +775,8 @@ ProcessAutoVacLauncherInterrupts(void)
 	if (ConfigReloadPending)
 	{
 		int			autovacuum_max_workers_prev = autovacuum_max_workers;
+		int			autovacuum_max_parallel_workers_prev =
+			autovacuum_max_parallel_workers;
 
 		ConfigReloadPending = false;
 		ProcessConfigFile(PGC_SIGHUP);
@@ -779,6 +793,15 @@ ProcessAutoVacLauncherInterrupts(void)
 		if (autovacuum_max_workers_prev != autovacuum_max_workers)
 			check_av_worker_gucs();
 
+		/*
+		 * If autovacuum_max_parallel_workers changed, we must adjust the
+		 * number of available parallel autovacuum workers in shmem
+		 * accordingly.
+		 */
+		if (autovacuum_max_parallel_workers_prev !=
+			autovacuum_max_parallel_workers)
+			adjust_free_parallel_workers(autovacuum_max_parallel_workers_prev);
+
 		/* rebuild the list in case the naptime changed */
 		rebuild_database_list(InvalidOid);
 	}
@@ -1383,6 +1406,17 @@ avl_sigusr2_handler(SIGNAL_ARGS)
  *					  AUTOVACUUM WORKER CODE
  ********************************************************************/
 
+/*
+ * Make sure that all reserved workers are released if the parallel autovacuum
+ * leader is exiting due to a FATAL error. Otherwise the function has no effect.
+ */
+static void
+autovacuum_worker_before_shmem_exit(int code, Datum arg)
+{
+	if (code != 0)
+		AutoVacuumReleaseAllParallelWorkers();
+}
+
 /*
  * Main entry point for autovacuum worker processes.
  */
@@ -1429,6 +1463,8 @@ AutoVacWorkerMain(const void *startup_data, size_t startup_data_len)
 	pqsignal(SIGFPE, FloatExceptionHandler);
 	pqsignal(SIGCHLD, SIG_DFL);
 
+	before_shmem_exit(autovacuum_worker_before_shmem_exit, 0);
+
 	/*
 	 * Create a per-backend PGPROC struct in shared memory.  We must do this
 	 * before we can use LWLocks or access any shared memory.
@@ -2480,6 +2516,12 @@ do_autovacuum(void)
 		}
 		PG_CATCH();
 		{
+			/*
+			 * Parallel autovacuum can reserve parallel workers. Make sure
+			 * that all reserved workers are released.
+			 */
+			AutoVacuumReleaseAllParallelWorkers();
+
 			/*
 			 * Abort the transaction, start a new one, and proceed with the
 			 * next table in our list.
@@ -2880,8 +2922,12 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 */
 		tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
 		tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
-		/* As of now, we don't support parallel vacuum for autovacuum */
-		tab->at_params.nworkers = -1;
+
+		/* Decide whether we need to process indexes of the table in parallel. */
+		tab->at_params.nworkers = avopts
+			? avopts->autovacuum_parallel_workers
+			: -1;
+
 		tab->at_params.freeze_min_age = freeze_min_age;
 		tab->at_params.freeze_table_age = freeze_table_age;
 		tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
@@ -3358,6 +3404,85 @@ AutoVacuumRequestWork(AutoVacuumWorkItemType type, Oid relationId,
 	return result;
 }
 
+/*
+ * In order to meet the 'autovacuum_max_parallel_workers' limit, the leader
+ * autovacuum process must call this function. It returns the number of
+ * parallel workers that can actually be launched and reserves these workers
+ * (if any) in the global autovacuum state.
+ *
+ * NOTE: We will try to provide as many workers as requested, even if the
+ * caller would occupy all available workers.
+ */
+int
+AutoVacuumReserveParallelWorkers(int nworkers)
+{
+	int			nreserved;
+
+	/* Only leader worker can call this function. */
+	Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+	/*
+	 * We can only reserve workers at the beginning of parallel index
+	 * processing, so we must not have any reserved workers right now.
+	 */
+	Assert(av_nworkers_reserved == 0);
+
+	LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+	/* Provide as many workers as we can. */
+	nreserved = Min(AutoVacuumShmem->av_freeParallelWorkers, nworkers);
+	AutoVacuumShmem->av_freeParallelWorkers -= nreserved;
+
+	/* Remember how many workers we have reserved. */
+	av_nworkers_reserved += nreserved;
+
+	LWLockRelease(AutovacuumLock);
+	return nreserved;
+}
+
+/*
+ * The leader autovacuum process must call this function in order to update
+ * the global autovacuum state, so that other leaders will be able to use
+ * these parallel workers.
+ *
+ * 'nworkers' - how many workers the caller wants to release.
+ */
+void
+AutoVacuumReleaseParallelWorkers(int nworkers)
+{
+	/* Only leader worker can call this function. */
+	Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+	LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+	/*
+	 * If the maximum number of parallel workers was reduced during execution,
+	 * we must cap the number of available workers by its new value.
+	 */
+	AutoVacuumShmem->av_freeParallelWorkers =
+		Min(AutoVacuumShmem->av_freeParallelWorkers + nworkers,
+			AutoVacuumShmem->av_maxParallelWorkers);
+
+	/* Don't have to remember these workers anymore. */
+	av_nworkers_reserved -= nworkers;
+
+	LWLockRelease(AutovacuumLock);
+}
+
+/*
+ * Same as above, but release *all* parallel workers that were reserved by
+ * the current leader autovacuum process.
+ */
+static void
+AutoVacuumReleaseAllParallelWorkers(void)
+{
+	/* Only leader worker can call this function. */
+	Assert(AmAutoVacuumWorkerProcess() && !IsParallelWorker());
+
+	if (av_nworkers_reserved > 0)
+		AutoVacuumReleaseParallelWorkers(av_nworkers_reserved);
+}
+
 /*
  * autovac_init
  *		This is called at postmaster initialization.
@@ -3418,6 +3543,10 @@ AutoVacuumShmemInit(void)
 		Assert(!found);
 
 		AutoVacuumShmem->av_launcherpid = 0;
+		AutoVacuumShmem->av_maxParallelWorkers =
+			Min(autovacuum_max_parallel_workers, max_worker_processes);
+		AutoVacuumShmem->av_freeParallelWorkers =
+			AutoVacuumShmem->av_maxParallelWorkers;
 		dclist_init(&AutoVacuumShmem->av_freeWorkers);
 		dlist_init(&AutoVacuumShmem->av_runningWorkers);
 		AutoVacuumShmem->av_startingWorker = NULL;
@@ -3499,3 +3628,34 @@ check_av_worker_gucs(void)
 				 errdetail("The server will only start up to \"autovacuum_worker_slots\" (%d) autovacuum workers at a given time.",
 						   autovacuum_worker_slots)));
 }
+
+/*
+ * Make sure that the number of free parallel workers corresponds to the
+ * autovacuum_max_parallel_workers parameter (after it was changed).
+ */
+static void
+adjust_free_parallel_workers(int prev_max_parallel_workers)
+{
+	LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
+
+	/*
+	 * Cap the number of free workers by the new parameter's value, if needed.
+	 */
+	AutoVacuumShmem->av_freeParallelWorkers =
+		Min(AutoVacuumShmem->av_freeParallelWorkers,
+			autovacuum_max_parallel_workers);
+
+	if (autovacuum_max_parallel_workers > prev_max_parallel_workers)
+	{
+		/*
+		 * If the user wants to increase the number of parallel autovacuum
+		 * workers, we must increase the number of free workers.
+		 */
+		AutoVacuumShmem->av_freeParallelWorkers +=
+			(autovacuum_max_parallel_workers - prev_max_parallel_workers);
+	}
+
+	AutoVacuumShmem->av_maxParallelWorkers = autovacuum_max_parallel_workers;
+
+	LWLockRelease(AutovacuumLock);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index d31cb45a058..fd00d6f89dc 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -143,6 +143,7 @@ int			NBuffers = 16384;
 int			MaxConnections = 100;
 int			max_worker_processes = 8;
 int			max_parallel_workers = 8;
+int			autovacuum_max_parallel_workers = 0;
 int			MaxBackends = 0;
 
 /* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index c6484aea087..2a037485d5e 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3326,9 +3326,13 @@ set_config_with_handle(const char *name, config_handle *handle,
 	 *
 	 * Also allow normal setting if the GUC is marked GUC_ALLOW_IN_PARALLEL.
 	 *
-	 * Other changes might need to affect other workers, so forbid them.
+	 * Other changes might need to affect other workers, so forbid them. Note
+	 * that the parallel autovacuum leader is an exception, because only the
+	 * cost-based delay parameters need to be propagated to parallel vacuum
+	 * workers, and we will handle that elsewhere if appropriate.
 	 */
-	if (IsInParallelMode() && changeVal && action != GUC_ACTION_SAVE &&
+	if (IsInParallelMode() && !AmAutoVacuumWorkerProcess() && changeVal &&
+		action != GUC_ACTION_SAVE &&
 		(record->flags & GUC_ALLOW_IN_PARALLEL) == 0)
 	{
 		ereport(elevel,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 1128167c025..6c38275d30b 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -154,6 +154,15 @@
   max => '2000000000',
 },
 
+{ name => 'autovacuum_max_parallel_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
+  short_desc => 'Maximum number of parallel autovacuum workers that can be taken from the bgworkers pool.',
+  long_desc => 'This parameter is capped by "max_worker_processes" (not by "autovacuum_max_workers"!).',
+  variable => 'autovacuum_max_parallel_workers',
+  boot_val => '2',
+  min => '0',
+  max => 'MAX_BACKENDS',
+},
+
 { name => 'autovacuum_max_workers', type => 'int', context => 'PGC_SIGHUP', group => 'VACUUM_AUTOVACUUM',
   short_desc => 'Sets the maximum number of simultaneously running autovacuum worker processes.',
   variable => 'autovacuum_max_workers',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index dc9e2255f8a..86c67b790b0 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -691,6 +691,8 @@
 #autovacuum_worker_slots = 16           # autovacuum worker slots to allocate
                                         # (change requires restart)
 #autovacuum_max_workers = 3             # max number of autovacuum subprocesses
+#autovacuum_max_parallel_workers = 2    # limited by max_worker_processes
+                                        # (0 disables parallel autovacuum)
 #autovacuum_naptime = 1min              # time between autovacuum runs
 #autovacuum_vacuum_threshold = 50       # min number of row updates before
                                         # vacuum
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 51806597037..6170436b341 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -1423,6 +1423,7 @@ static const char *const table_storage_parameters[] = {
 	"autovacuum_multixact_freeze_max_age",
 	"autovacuum_multixact_freeze_min_age",
 	"autovacuum_multixact_freeze_table_age",
+	"autovacuum_parallel_workers",
 	"autovacuum_vacuum_cost_delay",
 	"autovacuum_vacuum_cost_limit",
 	"autovacuum_vacuum_insert_scale_factor",
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 9a7d733ddef..605d0829b03 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -178,6 +178,7 @@ extern PGDLLIMPORT int MaxBackends;
 extern PGDLLIMPORT int MaxConnections;
 extern PGDLLIMPORT int max_worker_processes;
 extern PGDLLIMPORT int max_parallel_workers;
+extern PGDLLIMPORT int autovacuum_max_parallel_workers;
 
 extern PGDLLIMPORT int commit_timestamp_buffers;
 extern PGDLLIMPORT int multixact_member_buffers;
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 023ac6d5fa8..23cb531c68c 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -65,6 +65,10 @@ pg_noreturn extern void AutoVacWorkerMain(const void *startup_data, size_t start
 extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type,
 								  Oid relationId, BlockNumber blkno);
 
+/* parallel autovacuum stuff */
+extern int	AutoVacuumReserveParallelWorkers(int nworkers);
+extern void AutoVacuumReleaseParallelWorkers(int nworkers);
+
 /* shared memory stuff */
 extern Size AutoVacuumShmemSize(void);
 extern void AutoVacuumShmemInit(void);
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 80286076a11..e879fdcfc69 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,13 @@ typedef struct ForeignKeyCacheInfo
 typedef struct AutoVacOpts
 {
 	bool		enabled;
+
+	/*
+	 * Max number of parallel autovacuum workers. If the value is 0, the
+	 * parallel degree is computed based on the number of indexes.
+	 */
+	int			autovacuum_parallel_workers;
+
 	int			vacuum_threshold;
 	int			vacuum_max_threshold;
 	int			vacuum_ins_threshold;
-- 
2.43.0
