Hi,

On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> I have some comments on v2-0001 patch
Thank you for reviewing this patch!

> +	{
> +		{"autovacuum_reserved_workers_num", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
> +			gettext_noop("Number of worker processes, reserved for participation in parallel index processing during autovacuum."),
> +			gettext_noop("This parameter is depending on \"max_worker_processes\" (not on \"autovacuum_max_workers\"). "
> +						 "*Only* autovacuum workers can use these additional processes. "
> +						 "Also, these processes are taken into account in \"max_parallel_workers\"."),
> +		},
> +		&av_reserved_workers_num,
> +		0, 0, MAX_BACKENDS,
> +		check_autovacuum_reserved_workers_num, NULL, NULL
> +	},
>
> I find that the name "autovacuum_reserved_workers_num" is generic. It
> would be better to have a more specific name for parallel vacuum such
> as autovacuum_max_parallel_workers. This parameter is related to
> neither autovacuum_worker_slots nor autovacuum_max_workers, which
> seems fine to me. Also, max_parallel_maintenance_workers doesn't
> affect this parameter.
> .......
> I've also considered some alternative names. If we were to use
> parallel_maintenance_workers, it sounds like it controls the parallel
> degree for all operations using max_parallel_maintenance_workers,
> including CREATE INDEX. Similarly, vacuum_parallel_workers could be
> interpreted as affecting both autovacuum and manual VACUUM commands,
> suggesting that when users run "VACUUM (PARALLEL) t", the system would
> use their specified value for the parallel degree. I prefer
> autovacuum_parallel_workers or vacuum_parallel_workers.

This was my headache when I was choosing names for these variables.
Autovacuum already implies parallelism, because we can have several a/v
workers running at once. So I think a parameter like
`autovacuum_max_parallel_workers` would confuse somebody. If we want a
more specific name, I would prefer `max_parallel_index_autovacuum_workers`.

> Which number does this parameter mean to specify: the maximum number
> of parallel vacuum workers that can be used during autovacuum or the
> maximum number of parallel vacuum workers that each autovacuum can
> use?

The first variant. I will make this explicit in the parameter's description.

> +	{
> +		{
> +			"parallel_idx_autovac_enabled",
> +			"Allows autovacuum to process indexes of this table in parallel mode",
> +			RELOPT_KIND_HEAP,
> +			ShareUpdateExclusiveLock
> +		},
> +		false
> +	},
>
> The proposed reloption name doesn't align with our naming conventions.
> Looking at our existing reloptions, we typically write out full words
> rather than using abbreviations like 'autovac' or 'idx'.
>
> The new reloption name seems not to follow the conventional naming
> style for existing reloption. For instance, we don't use abbreviations
> such as 'autovac' and 'idx'.

OK, I'll fix it.

> +	/*
> +	 * If we are running autovacuum - decide whether we need to process indexes
> +	 * of table with given oid in parallel.
> +	 */
> +	if (AmAutoVacuumWorkerProcess() &&
> +		params->index_cleanup != VACOPTVALUE_DISABLED &&
> +		RelationAllowsParallelIdxAutovac(rel))
>
> I think that this should be done in autovacuum code.

We need the params->index_cleanup value to decide whether to use
parallel index a/v. In autovacuum.c we have this code:

***
/*
 * index_cleanup and truncate are unspecified at first in autovacuum.
 * They will be filled in with usable values using their reloptions
 * (or reloption defaults) later.
 */
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
***

This variable is filled in inside the `vacuum_rel` function, so I think
we should keep the above logic in vacuum.c.

> +#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
>
> These fixed values really useful in common cases? I think we already
> have an optimization where we skip vacuum indexes if the table has
> fewer dead tuples (see BYPASS_THRESHOLD_PAGES).

When we allocate dead items (and optionally initialize parallel
autovacuum) we don't yet have a sane value for
`vacrel->lpdead_item_pages` (which is what gets compared with
BYPASS_THRESHOLD_PAGES). The only criterion we can rely on at that
point is the number of dead tuples reported in the PgStat_StatTabEntry.

----

> I guess we can implement this parameter as an integer parameter so
> that the user can specify the number of parallel vacuum workers for
> the table. For example, we can have a reloption
> autovacuum_parallel_workers. Setting 0 (by default) means to disable
> parallel vacuum during autovacuum, and setting special value -1 means
> to let PostgreSQL calculate the parallel degree for the table (same as
> the default VACUUM command behavior).
> ...........
> The patch includes the changes to bgworker.c so that we can reserve
> some slots for autovacuums. I guess that this change is not
> necessarily necessary because if the user sets the related GUC
> parameters correctly the autovacuum workers can use parallel vacuum as
> expected. Even if we need this change, I would suggest implementing
> it as a separate patch.
> ..........
> +#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
> +#define NUM_INDEXES_PER_PARALLEL_WORKER 30
>
> These fixed values really useful in common cases? Given that we rely on
> users' heuristics which table needs to use parallel vacuum during
> autovacuum, I think we don't need to apply these conditions.
> ..........

I grouped these comments together because they all relate to a single
question: how much freedom do we give to the user?

Your opinion (as far as I understand it) is that we let users specify
any number of parallel workers per table, and it is the user's
responsibility to configure the related GUC variables so that
autovacuum can always process indexes in parallel. And we don't need to
think about thresholds: even if the table has only a few indexes and
dead rows, as long as the user set the table option we must do parallel
index a/v with the requested number of parallel workers. Please correct
me if I got something wrong.

I think this logic is well suited for the `VACUUM (PARALLEL)` SQL
command, which the user calls manually. But autovacuum (in my view)
should work as stably as possible and stay "unnoticed" by other
processes. Thus, we must:
1) Compute resources (such as the number of parallel workers for a
   single table's index vacuuming) as efficiently as possible.
2) Guarantee that as many of the requested tables as possible will be
   processed in parallel.

(1) can be achieved by calculating the parameters on the fly.
NUM_INDEXES_PER_PARALLEL_WORKER is a rough mock; I can provide a more
accurate value in the near future.
(2) can be achieved by reserving workers - we know that N workers (from
the bgworkers pool) are *always* at our disposal. When we use such
workers we don't depend on other operations in the cluster, and we don't
interfere with them by taking resources away.
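To make (1) a bit more concrete: the attached 0001 patch currently
derives the parallel degree in vacuum_rel() roughly as follows (a
condensed sketch; AV_PARALLEL_DEADTUP_THRESHOLD and
NUM_INDEXES_PER_PARALLEL_WORKER are the rough heuristics mentioned
above and will likely be refined):

***
/* Condensed sketch of the decision made in vacuum_rel() (patch 0001) */
if (tabentry && tabentry->dead_tuples >= AV_PARALLEL_DEADTUP_THRESHOLD &&
	pia_reserved_workers > 0)
{
	int		num_indexes = list_length(RelationGetIndexList(rel));

	/*
	 * One worker, plus one more for each NUM_INDEXES_PER_PARALLEL_WORKER
	 * indexes, capped by the number of reserved bgworker slots.
	 */
	params->nworkers =
		Min((num_indexes / NUM_INDEXES_PER_PARALLEL_WORKER) + 1,
			pia_reserved_workers);
}
***

So the reserved-workers GUC acts both as the enabling switch and as the
upper bound on the parallel degree that autovacuum may request for a
single table.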
If we give the user too much freedom in tuning parallel index a/v, all
these requirements may be violated. This is only my opinion, and I could
also agree with yours. Maybe we need another person to judge between us?

Please see the attached v3 patches, which contain the changes related to
the GUC parameter and the table option (no changes to the overall logic
so far).

--
Best regards,
Daniil Davydov
From 2223da7a9b2ef8c8d71780ad72b24eaf6d6c1141 Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davy...@postgrespro.ru>
Date: Fri, 16 May 2025 11:58:40 +0700
Subject: [PATCH v3 1/2] Parallel index autovacuum with bgworkers

---
 src/backend/access/common/reloptions.c        | 11 ++++
 src/backend/commands/vacuum.c                 | 55 +++++++++++++++++++
 src/backend/commands/vacuumparallel.c         | 46 ++++++++++------
 src/backend/postmaster/autovacuum.c           | 14 ++++-
 src/backend/postmaster/bgworker.c             | 33 ++++++++++-
 src/backend/utils/init/globals.c              |  1 +
 src/backend/utils/misc/guc_tables.c           | 12 ++++
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/miscadmin.h                       |  1 +
 src/include/utils/guc_hooks.h                 |  2 +
 src/include/utils/rel.h                       | 10 ++++
 11 files changed, 166 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 46c1dce222d..730096002b1 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -166,6 +166,15 @@ static relopt_bool boolRelOpts[] =
 		},
 		true
 	},
+	{
+		{
+			"parallel_index_autovacuum_enabled",
+			"Allows autovacuum to process indexes of this table in parallel mode",
+			RELOPT_KIND_HEAP,
+			ShareUpdateExclusiveLock
+		},
+		false
+	},
 	/* list terminator */
 	{{NULL}}
 };
@@ -1863,6 +1872,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		{"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
 		{"autovacuum_enabled", RELOPT_TYPE_BOOL,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, enabled)},
+		{"parallel_index_autovacuum_enabled", RELOPT_TYPE_BOOL,
+		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, parallel_index_autovacuum_enabled)},
 		{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_threshold)},
 		{"autovacuum_vacuum_max_threshold", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 33a33bf6b1c..6c2f49f203f 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -57,9 +57,21 @@
 #include "utils/guc.h"
 #include "utils/guc_hooks.h"
 #include "utils/memutils.h"
+#include "utils/rel.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
 
+/*
+ * Minimum number of dead tuples required for the table's indexes to be
+ * processed in parallel during autovacuum.
+ */
+#define AV_PARALLEL_DEADTUP_THRESHOLD 1024
+
+/*
+ * How many indexes each parallel worker should process during autovacuum.
+ */
+#define NUM_INDEXES_PER_PARALLEL_WORKER 30
+
 /*
  * Minimum interval for cost-based vacuum delay reports from a parallel worker.
  * This aims to avoid sending too many messages and waking up the leader too
@@ -2234,6 +2246,49 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
 	else
 		toast_relid = InvalidOid;
 
+	/*
+	 * If we are running autovacuum - decide whether we need to process indexes
+	 * of table with given oid in parallel.
+	 */
+	if (AmAutoVacuumWorkerProcess() &&
+		params->index_cleanup != VACOPTVALUE_DISABLED &&
+		RelationAllowsParallelIdxAutovac(rel))
+	{
+		PgStat_StatTabEntry *tabentry;
+
+		/* fetch the pgstat table entry */
+		tabentry = pgstat_fetch_stat_tabentry_ext(rel->rd_rel->relisshared,
+												  rel->rd_id);
+		if (tabentry && tabentry->dead_tuples >= AV_PARALLEL_DEADTUP_THRESHOLD)
+		{
+			List	   *indexes = RelationGetIndexList(rel);
+			int			num_indexes = list_length(indexes);
+
+			list_free(indexes);
+
+			if (pia_reserved_workers > 0)
+			{
+				/*
+				 * We request at least one parallel worker if the user set the
+				 * 'parallel_index_autovacuum_enabled' option. The total number
+				 * of additional parallel workers depends on how many indexes
+				 * the table has. For now we assume that each parallel worker
+				 * should process NUM_INDEXES_PER_PARALLEL_WORKER indexes.
+				 */
+				params->nworkers =
+					Min((num_indexes / NUM_INDEXES_PER_PARALLEL_WORKER) + 1,
+						pia_reserved_workers);
+			}
+			else
+				ereport(WARNING,
+						(errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
+						 errmsg("Cannot launch any supportive workers for parallel index cleanup of rel %s",
+								RelationGetRelationName(rel)),
+						 errhint("You might need to set parameter \"parallel_index_autovacuum_reserved_workers\" to a value > 0")));
+
+		}
+	}
+
 	/*
 	 * Switch to the table owner's userid, so that any index functions are run
 	 * as that user.  Also lock down security-restricted operations and
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..5c48a1e740e 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,15 +1,15 @@
 /*-------------------------------------------------------------------------
  *
  * vacuumparallel.c
- *	  Support routines for parallel vacuum execution.
+ *	  Support routines for parallel [auto]vacuum execution.
  *
  * This file contains routines that are intended to support setting up, using,
  * and tearing down a ParallelVacuumState.
  *
- * In a parallel vacuum, we perform both index bulk deletion and index cleanup
- * with parallel worker processes.  Individual indexes are processed by one
- * vacuum process.  ParallelVacuumState contains shared information as well as
- * the memory space for storing dead items allocated in the DSA area.  We
+ * In a parallel [auto]vacuum, we perform both index bulk deletion and index
+ * cleanup with parallel worker processes.  Individual indexes are processed by
+ * one vacuum process.  ParallelVacuumState contains shared information as well
+ * as the memory space for storing dead items allocated in the DSA area.  We
  * launch parallel worker processes at the start of parallel index
  * bulk-deletion and index cleanup and once all indexes are processed, the
  * parallel worker processes exit.  Each time we process indexes in parallel,
@@ -34,6 +34,7 @@
 #include "executor/instrument.h"
 #include "optimizer/paths.h"
 #include "pgstat.h"
+#include "postmaster/autovacuum.h"
 #include "storage/bufmgr.h"
 #include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
@@ -157,7 +158,8 @@ typedef struct PVIndStats
 } PVIndStats;
 
 /*
- * Struct for maintaining a parallel vacuum state. typedef appears in vacuum.h.
+ * Struct for maintaining a parallel [auto]vacuum state. typedef appears in
+ * vacuum.h.
 */
 struct ParallelVacuumState
 {
@@ -371,10 +373,18 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
 	shared->relid = RelationGetRelid(rel);
 	shared->elevel = elevel;
 	shared->queryid = pgstat_get_my_query_id();
-	shared->maintenance_work_mem_worker =
-		(nindexes_mwm > 0) ?
-		maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
-		maintenance_work_mem;
+
+	if (AmAutoVacuumWorkerProcess())
+		shared->maintenance_work_mem_worker =
+			(nindexes_mwm > 0) ?
+			autovacuum_work_mem / Min(parallel_workers, nindexes_mwm) :
+			autovacuum_work_mem;
+	else
+		shared->maintenance_work_mem_worker =
+			(nindexes_mwm > 0) ?
+			maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
+			maintenance_work_mem;
+
 	shared->dead_items_info.max_bytes = vac_work_mem * (size_t) 1024;
 
 	/* Prepare DSA space for dead items */
@@ -558,7 +568,9 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
 	 * We don't allow performing parallel operation in standalone backend or
 	 * when parallelism is disabled.
 	 */
-	if (!IsUnderPostmaster || max_parallel_maintenance_workers == 0)
+	if (!IsUnderPostmaster ||
+		(pia_reserved_workers == 0 && AmAutoVacuumWorkerProcess()) ||
+		(max_parallel_maintenance_workers == 0 && !AmAutoVacuumWorkerProcess()))
 		return 0;
 
 	/*
@@ -597,15 +609,17 @@ parallel_vacuum_compute_workers(Relation *indrels, int nindexes, int nrequested,
 	parallel_workers = (nrequested > 0) ?
 		Min(nrequested, nindexes_parallel) : nindexes_parallel;
 
-	/* Cap by max_parallel_maintenance_workers */
-	parallel_workers = Min(parallel_workers, max_parallel_maintenance_workers);
+	/* Cap by GUC variable */
+	parallel_workers = AmAutoVacuumWorkerProcess() ?
+		Min(parallel_workers, pia_reserved_workers) :
+		Min(parallel_workers, max_parallel_maintenance_workers);
 
 	return parallel_workers;
 }
 
 /*
  * Perform index vacuum or index cleanup with parallel workers. This function
- * must be used by the parallel vacuum leader process.
+ * must be used by the parallel [auto]vacuum leader process.
  */
 static void
 parallel_vacuum_process_all_indexes(ParallelVacuumState *pvs, int num_index_scans,
@@ -982,8 +996,8 @@ parallel_vacuum_index_is_parallel_safe(Relation indrel, int num_index_scans,
 /*
  * Perform work within a launched parallel process.
  *
- * Since parallel vacuum workers perform only index vacuum or index cleanup,
- * we don't need to report progress information.
+ * Since parallel [auto]vacuum workers perform only index vacuum or index
+ * cleanup, we don't need to report progress information.
  */
 void
 parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 4d4a1a3197e..59fb52aa443 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2824,7 +2824,11 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 	 */
 	tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
 	tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
-	/* As of now, we don't support parallel vacuum for autovacuum */
+	/*
+	 * Don't request parallel mode for now.  nworkers may be set to a
+	 * positive value later if we come across a table that is suitable
+	 * for parallel index processing.
+	 */
 	tab->at_params.nworkers = -1;
 	tab->at_params.freeze_min_age = freeze_min_age;
 	tab->at_params.freeze_table_age = freeze_table_age;
@@ -3406,6 +3410,14 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source)
 	return true;
 }
 
+bool
+check_pia_reserved_workers(int *newval, void **extra, GucSource source)
+{
+	if (*newval > (max_worker_processes - 8))
+		return false;
+	return true;
+}
+
 /*
  * Returns whether there is a free autovacuum worker slot available.
 */
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 116ddf7b835..e62076939ec 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -1046,6 +1046,8 @@ RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
 								 BackgroundWorkerHandle **handle)
 {
 	int			slotno;
+	int			from;
+	int			upto;
 	bool		success = false;
 	bool		parallel;
 	uint64		generation = 0;
@@ -1088,10 +1090,23 @@ RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
 			return false;
 		}
 
+	/*
+	 * Determine the range of slots in the pool that we can use (the last
+	 * 'pia_reserved_workers' slots are reserved for autovacuum workers).
+	 */
+
+	from = AmAutoVacuumWorkerProcess() ?
+		BackgroundWorkerData->total_slots - pia_reserved_workers :
+		0;
+
+	upto = AmAutoVacuumWorkerProcess() ?
+		BackgroundWorkerData->total_slots :
+		BackgroundWorkerData->total_slots - pia_reserved_workers;
+
 	/*
 	 * Look for an unused slot.  If we find one, grab it.
 	 */
-	for (slotno = 0; slotno < BackgroundWorkerData->total_slots; ++slotno)
+	for (slotno = from; slotno < upto; ++slotno)
 	{
 		BackgroundWorkerSlot *slot = &BackgroundWorkerData->slot[slotno];
 
@@ -1159,7 +1174,13 @@ GetBackgroundWorkerPid(BackgroundWorkerHandle *handle, pid_t *pidp)
 	BackgroundWorkerSlot *slot;
 	pid_t		pid;
 
-	Assert(handle->slot < max_worker_processes);
+	/* Only autovacuum can use the last 'pia_reserved_workers' slots in the pool. */
+	if (!AmAutoVacuumWorkerProcess())
+		Assert(handle->slot < max_worker_processes - pia_reserved_workers);
+	else
+		Assert(handle->slot < max_worker_processes &&
+			   handle->slot >= max_worker_processes - pia_reserved_workers);
+
 	slot = &BackgroundWorkerData->slot[handle->slot];
 
 	/*
@@ -1298,7 +1319,13 @@ TerminateBackgroundWorker(BackgroundWorkerHandle *handle)
 	BackgroundWorkerSlot *slot;
 	bool		signal_postmaster = false;
 
-	Assert(handle->slot < max_worker_processes);
+	/* Only autovacuum can use the last 'pia_reserved_workers' slots in the pool. */
+	if (!AmAutoVacuumWorkerProcess())
+		Assert(handle->slot < max_worker_processes - pia_reserved_workers);
+	else
+		Assert(handle->slot < max_worker_processes &&
+			   handle->slot >= max_worker_processes - pia_reserved_workers);
+
 	slot = &BackgroundWorkerData->slot[handle->slot];
 
 	/* Set terminate flag in shared memory, unless slot has been reused. */
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 92b0446b80c..a6fdcd2de5b 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -144,6 +144,7 @@ int			NBuffers = 16384;
 int			MaxConnections = 100;
 int			max_worker_processes = 8;
 int			max_parallel_workers = 8;
+int			pia_reserved_workers = 0;
 int			MaxBackends = 0;
 
 /* GUC parameters for vacuum */
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2f8cbd86759..dfc18095d7b 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3604,6 +3604,18 @@ struct config_int ConfigureNamesInt[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"parallel_index_autovacuum_reserved_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
+			gettext_noop("Maximum number of worker processes (from the bgworkers pool) reserved for participation in parallel index autovacuum."),
+			gettext_noop("This parameter depends on \"max_worker_processes\" (not on \"autovacuum_max_workers\"). "
+						 "*Only* autovacuum workers can use these supportive processes. "
+						 "Also, these processes are taken into account in \"max_parallel_workers\"."),
+		},
+		&pia_reserved_workers,
+		0, 0, MAX_BACKENDS,
+		check_pia_reserved_workers, NULL, NULL
+	},
+
 	{
 		{"max_parallel_maintenance_workers", PGC_USERSET, RESOURCES_WORKER_PROCESSES,
 			gettext_noop("Sets the maximum number of parallel processes per maintenance operation."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34826d01380..3d96af1547f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -223,6 +223,7 @@
 #max_parallel_maintenance_workers = 2	# limited by max_parallel_workers
 #max_parallel_workers = 8		# number of max_worker_processes that
 					# can be used in parallel operations
+#parallel_index_autovacuum_reserved_workers = 0	# disabled by default and limited by max_parallel_workers
 #parallel_leader_participation = on
 
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 1e59a7f910f..465dfe25009 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -177,6 +177,7 @@ extern PGDLLIMPORT int NBuffers;
 extern PGDLLIMPORT int MaxBackends;
 extern PGDLLIMPORT int MaxConnections;
 extern PGDLLIMPORT int max_worker_processes;
+extern PGDLLIMPORT int pia_reserved_workers;
 extern PGDLLIMPORT int max_parallel_workers;
 
 extern PGDLLIMPORT int commit_timestamp_buffers;
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 799fa7ace68..8507f95b2ea 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -31,6 +31,8 @@ extern void assign_application_name(const char *newval, void *extra);
 extern const char *show_archive_command(void);
 extern bool check_autovacuum_work_mem(int *newval, void **extra,
 									  GucSource source);
+extern bool check_pia_reserved_workers(int *newval, void **extra,
+									   GucSource source);
 extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra,
 											GucSource source);
 extern bool check_backtrace_functions(char **newval, void **extra,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index b552359915f..980c3459469 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -311,6 +311,7 @@ typedef struct ForeignKeyCacheInfo
 typedef struct AutoVacOpts
 {
 	bool		enabled;
+	bool		parallel_index_autovacuum_enabled;
 	int			vacuum_threshold;
 	int			vacuum_max_threshold;
 	int			vacuum_ins_threshold;
@@ -409,6 +410,15 @@ typedef struct StdRdOptions
 	((relation)->rd_options ? \
 	 ((StdRdOptions *) (relation)->rd_options)->parallel_workers : (defaultpw))
 
+/*
+ * RelationAllowsParallelIdxAutovac
+ *		Returns whether the relation's indexes can be processed in parallel
+ *		during autovacuum.  Note multiple eval of argument!
+ */
+#define RelationAllowsParallelIdxAutovac(relation) \
+	((relation)->rd_options ? \
+	 ((StdRdOptions *) (relation)->rd_options)->autovacuum.parallel_index_autovacuum_enabled : false)
+
 /* ViewOptions->check_option values */
 typedef enum ViewOptCheckOption
 {
-- 
2.43.0
From d17a01ef2ace5fc6cfd1d22930454d90cfbe63dd Mon Sep 17 00:00:00 2001
From: Daniil Davidov <d.davy...@postgrespro.ru>
Date: Fri, 16 May 2025 11:59:03 +0700
Subject: [PATCH v3 2/2] Sandbox for parallel index autovacuum

---
 src/test/modules/Makefile                 |   1 +
 src/test/modules/autovacuum/.gitignore    |   1 +
 src/test/modules/autovacuum/Makefile      |  14 ++
 src/test/modules/autovacuum/meson.build   |  12 ++
 .../autovacuum/t/001_autovac_parallel.pl  | 129 ++++++++++++++++++
 src/test/modules/meson.build              |   1 +
 6 files changed, 158 insertions(+)
 create mode 100644 src/test/modules/autovacuum/.gitignore
 create mode 100644 src/test/modules/autovacuum/Makefile
 create mode 100644 src/test/modules/autovacuum/meson.build
 create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl

diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index aa1d27bbed3..b7f3e342e82 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -5,6 +5,7 @@ top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
 SUBDIRS = \
+		  autovacuum \
 		  brin \
 		  commit_ts \
 		  delay_execution \
diff --git a/src/test/modules/autovacuum/.gitignore b/src/test/modules/autovacuum/.gitignore
new file mode 100644
index 00000000000..0b54641bceb
--- /dev/null
+++ b/src/test/modules/autovacuum/.gitignore
@@ -0,0 +1 @@
+/tmp_check/
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/Makefile b/src/test/modules/autovacuum/Makefile
new file mode 100644
index 00000000000..90c00ff350b
--- /dev/null
+++ b/src/test/modules/autovacuum/Makefile
@@ -0,0 +1,14 @@
+# src/test/modules/autovacuum/Makefile
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/autovacuum
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
\ No newline at end of file
diff --git a/src/test/modules/autovacuum/meson.build b/src/test/modules/autovacuum/meson.build
new file mode 100644
index 00000000000..f91c1a14d2b
--- /dev/null
+++ b/src/test/modules/autovacuum/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'autovacuum',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_autovac_parallel.pl',
+    ],
+  },
+}
diff --git a/src/test/modules/autovacuum/t/001_autovac_parallel.pl b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
new file mode 100644
index 00000000000..5aea3f10e38
--- /dev/null
+++ b/src/test/modules/autovacuum/t/001_autovac_parallel.pl
@@ -0,0 +1,129 @@
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+my $psql_out;
+
+my $node = PostgreSQL::Test::Cluster->new('node1');
+$node->init;
+$node->append_conf('postgresql.conf', qq{
+	autovacuum = off
+	max_wal_size = 4096
+	max_worker_processes = 20
+	max_parallel_workers = 20
+	max_parallel_maintenance_workers = 20
+	parallel_index_autovacuum_reserved_workers = 1
+});
+$node->start;
+
+my $indexes_num = 80;
+my $initial_rows_num = 100_000;
+
+# Create big table and create specified number of b-tree indexes on it
+$node->safe_psql('postgres', qq{
+	CREATE TABLE test_autovac (
+		id SERIAL PRIMARY KEY,
+		col_1 INTEGER, col_2 INTEGER, col_3 INTEGER, col_4 INTEGER, col_5 INTEGER,
+		col_6 INTEGER, col_7 INTEGER, col_8 INTEGER, col_9 INTEGER, col_10 INTEGER,
+		col_11 INTEGER, col_12 INTEGER, col_13 INTEGER, col_14 INTEGER, col_15 INTEGER,
+		col_16 INTEGER, col_17 INTEGER, col_18 INTEGER, col_19 INTEGER, col_20 INTEGER,
+		col_21 INTEGER, col_22 INTEGER, col_23 INTEGER, col_24 INTEGER, col_25 INTEGER,
+		col_26 INTEGER, col_27 INTEGER, col_28 INTEGER, col_29 INTEGER, col_30 INTEGER,
+		col_31 INTEGER, col_32 INTEGER, col_33 INTEGER, col_34 INTEGER, col_35 INTEGER,
+		col_36 INTEGER, col_37 INTEGER, col_38 INTEGER, col_39 INTEGER, col_40 INTEGER,
+		col_41 INTEGER, col_42 INTEGER, col_43 INTEGER, col_44 INTEGER, col_45 INTEGER,
+		col_46 INTEGER, col_47 INTEGER, col_48 INTEGER, col_49 INTEGER, col_50 INTEGER,
+		col_51 INTEGER, col_52 INTEGER, col_53 INTEGER, col_54 INTEGER, col_55 INTEGER,
+		col_56 INTEGER, col_57 INTEGER, col_58 INTEGER, col_59 INTEGER, col_60 INTEGER,
+		col_61 INTEGER, col_62 INTEGER, col_63 INTEGER, col_64 INTEGER, col_65 INTEGER,
+		col_66 INTEGER, col_67 INTEGER, col_68 INTEGER, col_69 INTEGER, col_70 INTEGER,
+		col_71 INTEGER, col_72 INTEGER, col_73 INTEGER, col_74 INTEGER, col_75 INTEGER,
+		col_76 INTEGER, col_77 INTEGER, col_78 INTEGER, col_79 INTEGER, col_80 INTEGER,
+		col_81 INTEGER, col_82 INTEGER, col_83 INTEGER, col_84 INTEGER, col_85 INTEGER,
+		col_86 INTEGER, col_87 INTEGER, col_88 INTEGER, col_89 INTEGER, col_90 INTEGER,
+		col_91 INTEGER, col_92 INTEGER, col_93 INTEGER, col_94 INTEGER, col_95 INTEGER,
+		col_96 INTEGER, col_97 INTEGER, col_98 INTEGER, col_99 INTEGER, col_100 INTEGER
+	) WITH (parallel_index_autovacuum_enabled = true);
+
+	DO \$\$
+	DECLARE
+		i INTEGER;
+	BEGIN
+		FOR i IN 1..$indexes_num LOOP
+			EXECUTE format('CREATE INDEX idx_col_\%s ON test_autovac (col_\%s);', i, i);
+		END LOOP;
+	END \$\$;
+});
+
+$node->psql('postgres',
+	"SELECT COUNT(*) FROM pg_index i
+	JOIN pg_class c ON c.oid = i.indrelid
+	WHERE c.relname = 'test_autovac';",
+	stdout => \$psql_out
+);
+is($psql_out, $indexes_num + 1, "All indexes created successfully");
+
+$node->safe_psql('postgres', qq{
+	DO \$\$
+	DECLARE
+		i INTEGER;
+	BEGIN
+		FOR i IN 1..$initial_rows_num LOOP
+			INSERT INTO test_autovac (
+				col_1, col_2, col_3, col_4, col_5, col_6, col_7, col_8, col_9, col_10,
+				col_11, col_12, col_13, col_14, col_15, col_16, col_17, col_18, col_19, col_20,
+				col_21, col_22, col_23, col_24, col_25, col_26, col_27, col_28, col_29, col_30,
+				col_31, col_32, col_33, col_34, col_35, col_36, col_37, col_38, col_39, col_40,
+				col_41, col_42, col_43, col_44, col_45, col_46, col_47, col_48, col_49, col_50,
+				col_51, col_52, col_53, col_54, col_55, col_56, col_57, col_58, col_59, col_60,
+				col_61, col_62, col_63, col_64, col_65, col_66, col_67, col_68, col_69, col_70,
+				col_71, col_72, col_73, col_74, col_75, col_76, col_77, col_78, col_79, col_80,
+				col_81, col_82, col_83, col_84, col_85, col_86, col_87, col_88, col_89, col_90,
+				col_91, col_92, col_93, col_94, col_95, col_96, col_97, col_98, col_99, col_100
+			) VALUES (
+				i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6, i + 7, i + 8, i + 9,
+				i + 10, i + 11, i + 12, i + 13, i + 14, i + 15, i + 16, i + 17, i + 18, i + 19,
+				i + 20, i + 21, i + 22, i + 23, i + 24, i + 25, i + 26, i + 27, i + 28, i + 29,
+				i + 30, i + 31, i + 32, i + 33, i + 34, i + 35, i + 36, i + 37, i + 38, i + 39,
+				i + 40, i + 41, i + 42, i + 43, i + 44, i + 45, i + 46, i + 47, i + 48, i + 49,
+				i + 50, i + 51, i + 52, i + 53, i + 54, i + 55, i + 56, i + 57, i + 58, i + 59,
+				i + 60, i + 61, i + 62, i + 63, i + 64, i + 65, i + 66, i + 67, i + 68, i + 69,
+				i + 70, i + 71, i + 72, i + 73, i + 74, i + 75, i + 76, i + 77, i + 78, i + 79,
+				i + 80, i + 81, i + 82, i + 83, i + 84, i + 85, i + 86, i + 87, i + 88, i + 89,
+				i + 90, i + 91, i + 92, i + 93, i + 94, i + 95, i + 96, i + 97, i + 98, i + 99
+			);
+		END LOOP;
+	END \$\$;
+});
+
+$node->psql('postgres',
+	"SELECT COUNT(*) FROM test_autovac;",
+	stdout => \$psql_out
+);
+is($psql_out, $initial_rows_num, "All data inserted into table successfully");
+
+$node->safe_psql('postgres', qq{
+	UPDATE test_autovac SET col_1 = 0 WHERE (col_1 % 3) = 0;
+	ANALYZE test_autovac;
+});
+
+# Reduce autovacuum_work_mem, so the leader process will perform the parallel
+# index vacuum phase several times
+$node->append_conf('postgresql.conf', qq{
+	autovacuum_naptime = '1s'
+	autovacuum_vacuum_threshold = 1
+	autovacuum_analyze_threshold = 1
+	autovacuum_vacuum_scale_factor = 0.1
+	autovacuum_analyze_scale_factor = 0.1
+	autovacuum = on
+});
+
+$node->restart;
+
+# sleep(3600);
+
+ok(1, "There are no segfaults");
+
+$node->stop;
+done_testing();
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 9de0057bd1d..7f2ad810ca0 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -1,5 +1,6 @@
 # Copyright (c) 2022-2025, PostgreSQL Global Development Group
 
+subdir('autovacuum')
 subdir('brin')
 subdir('commit_ts')
 subdir('delay_execution')
-- 
2.43.0