Thanks for the detailed review!

On Tue, Mar 28, 2023 at 11:09 PM Kyotaro Horiguchi
<horikyota....@gmail.com> wrote:
>
> At Tue, 28 Mar 2023 20:35:28 -0400, Melanie Plageman 
> <melanieplage...@gmail.com> wrote in
> > On Tue, Mar 28, 2023 at 4:21 AM Kyotaro Horiguchi <horikyota....@gmail.com> 
> > wrote:
> > >
> > > At Mon, 27 Mar 2023 14:12:03 -0400, Melanie Plageman 
> > > <melanieplage...@gmail.com> wrote in
> > >
> > > 0002:
> > >
> > > I felt a bit uneasy on this. It seems somewhat complex (and makes the
> > > succeeding patches complex),
> >
> > Even if we introduced a second global variable to indicate that failsafe
> > mode has been engaged, we would still require the additional checks
> > of VacuumCostInactive.
> >
> > > has confusing names,
> >
> > I would be happy to rename the values of the enum to make them less
> > confusing. Are you thinking "force" instead of "locked"?
> > maybe:
> > VACUUM_COST_FORCE_INACTIVE and
> > VACUUM_COST_INACTIVE
> > ?
> >
> > > and doesn't seem like self-contained.
> >
> > By changing the variable from VacuumCostActive to VacuumCostInactive, I
> > have kept all non-vacuum code from having to distinguish between it
> > being inactive due to failsafe mode or due to user settings.
>
> My concern is that VacuumCostActive is logic-inverted and turned into
> a ternary variable in a subtle way. The expression
> "!VacuumCostInactive" is quite confusing. (I sometimes feel the same
> way about "!XLogRecPtrIsInvalid(lsn)", and I believe most people write
> it with another macro like "lsn != InvalidXLogrecPtr"). Additionally,
> the constraint in this patch will be implemented as open code.  So I
> wanted to suggest something like the attached. The main idea is to use
> a wrapper function to enforce the restriction, and by doing so, we
> eliminated the need to make the variable into a ternary without a good
> reason.

So, the rationale for making it a ternary is that the variable is the
combination of two pieces of information which has only has 3 valid
states:
failsafe inactive + cost active = cost active
failsafe inactive + cost inactive = cost inactive
failsafe active + cost inactive = cost inactive and locked
the fourth is invalid
failsafe active + cost active = invalid
That is harder to enforce with two variables.
Also, the two pieces of information are not meaningful individually.
So, I thought it made sense to make a single variable.

Your suggested patch introduces an additional variable which shadows
LVRelState->failsafe_active but doesn't actually get set/reset at all of
the correct places. If we did introduce a second global variable, I
don't think we should also keep LVRelState->failsafe_active, as keeping
them in sync will be difficult.

As for the double negative (!VacuumCostInactive), I agree that it is not
ideal, however, if we use a ternary and keep VacuumCostActive, there is
no way for non-vacuum code to treat it as a boolean.
With the ternary VacuumCostInactive, only vacuum code has to know about
the distinction between inactive+failsafe active and inactive+failsafe
inactive.

As for the setter function, I think that having a function to set
VacuumCostActive based on failsafe_active is actually doing more harm
than good. Only vacuum code has to know about the distinction as it is,
so we aren't really saving any trouble (there would really only be two
callers of the suggested function). And, since the function hides
whether or not VacuumCostActive was actually set to the passed-in value,
we can't easily do other necessary maintenance -- like zero out
VacuumCostBalance if we disabled vacuum cost.

> > > 0003:
> > >
> > > +        * Reload the configuration file if requested. This allows 
> > > changes to
> > > +        * vacuum_cost_limit and vacuum_cost_delay to take effect while a 
> > > table is
> > > +        * being vacuumed or analyzed. Analyze should not reload 
> > > configuration
> > > +        * file if it is in an outer transaction, as GUC values shouldn't 
> > > be
> > > +        * allowed to refer to some uncommitted state (e.g. database 
> > > objects
> > > +        * created in this transaction).
> > >
> > > I'm not sure GUC reload is or should be related to transactions. For
> > > instance, work_mem can be changed by a reload during a transaction
> > > unless it has been set in the current transaction. I don't think we
> > > need to deliberately suppress changes in variables caused by realods
> > > during transactions only for analzye. If analyze doesn't like changes
> > > to certain GUC variables, their values should be snapshotted before
> > > starting the process.
> >
> > Currently, we only reload the config file in top-level statements. We
> > don't reload the configuration file from within a nested transaction
> > command. BEGIN starts a transaction but not a transaction command. So
> > BEGIN; ANALYZE; probably wouldn't violate this rule. But it is simpler
> > to just forbid reloading when it is not a top-level transaction command.
> > I have updated the comment to reflect this.
>
> I feel it's a bit fragile. We may not be able to manage the reload
> timeing perfectly. I think we might accidentally add a reload
> timing. In that case, the assumption could break. In most cases, I
> think we use snapshotting in various ways to avoid unintended variable
> changes. (And I beilieve the analyze code also does that.)

I'm not sure I fully understand the problem you are thinking of. What do
you mean about managing the reload timing? Are you suggesting there is a
problem with excluding analzye in an outer transaction from doing the
reload or with doing the reload during vacuum and analyze when they are
top-level statements?

And, by snapshotting do you mean how vacuum_rel() and do_analyze_rel() do
NewGUCNestLevel() so that they can then do AtEOXact_GUC() and rollback
guc changes done during that operation?
How are you envisioning that being used here?

On Wed, Mar 29, 2023 at 2:00 AM Kyotaro Horiguchi
<horikyota....@gmail.com> wrote:
>
> At Wed, 29 Mar 2023 13:21:55 +0900 (JST), Kyotaro Horiguchi 
> <horikyota....@gmail.com> wrote in
> > autovacuum.c:2893
> >               /*
> >                * If any of the cost delay parameters has been set 
> > individually for
> >                * this table, disable the balancing algorithm.
> >                */
> >               tab->at_dobalance =
> >                       !(avopts && (avopts->vacuum_cost_limit > 0 ||
> >                                                avopts->vacuum_cost_delay > 
> > 0));
> >
> > So, sorry for the noise. I'll review it while this into cnosideration.
>
> Then I found that the code is quite confusing as it is.
>
> For the tables that don't have cost_delay and cost_limit specified
> indificually, at_vacuum_cost_limit and _delay store the system global
> values detemined by GUCs. wi_cost_delay, _limit and _limit_base stores
> the same values with them. As the result I concluded tha
> autovac_balance_cost() does exactly what Melanie's patch does, except
> that nworkers_for_balance is not stored in shared memory.
>
> I discovered that commit 1021bd6a89 brought in do_balance.
>
> >    Since the mechanism is already complicated, just disable it for those
> >    cases rather than trying to make it cope.  There are undesirable
>
> After reading this, I get why the code is so complex.  It is a remnant
> of when balancing was done with tables that had individually specified
> cost parameters. And I found the following description in the doc.
>
> https://www.postgresql.org/docs/devel/routine-vacuuming.html
> > When multiple workers are running, the autovacuum cost delay
> > parameters (see Section 20.4.4) are “balanced” among all the running
> > workers, so that the total I/O impact on the system is the same
> > regardless of the number of workers actually running. However, any
> > workers processing tables whose per-table
> > autovacuum_vacuum_cost_delay or autovacuum_vacuum_cost_limit storage
> > parameters have been set are not considered in the balancing
> > algorithm.
>
> The initial balancing mechanism was brought in by e2a186b03c back in
> 2007. The balancing code has had that unnecessarily complexity ever
> since.
>
> Since I can't think of a better idea than Melanie's proposal for
> handling this code, I'll keep reviewing it with that approach in mind.

Thanks for doing this archaeology. I didn't know the history of dobalance
and hadn't looked into 1021bd6a89.
I was a bit confused by why dobalance was false even if only table
option cost delay is set and not table option cost limit.

I think we can retain this behavior for now, but it may be worth
re-examining in the future.

On Wed, Mar 29, 2023 at 4:35 AM Kyotaro Horiguchi
<horikyota....@gmail.com> wrote:
>
> At Wed, 29 Mar 2023 13:21:55 +0900 (JST), Kyotaro Horiguchi 
> <horikyota....@gmail.com> wrote in
> > So, sorry for the noise. I'll review it while this into cnosideration.
>
> 0003:
>
> It's not this patche's fault, but I don't like the fact that the
> variables used for GUC, VacuumCostDelay and VacuumCostLimit, are
> updated outside the GUC mechanism. Also I don't like the incorrect
> sorting of variables, where some working variables are referred to as
> GUC parameters or vise versa.
>
> Although it's somewhat unrelated to the goal of this patch, I think we
> should clean up the code tidy before proceeding. Shouldn't we separate
> the actual parameters from the GUC base variables, and sort out the
> all related variaghble? (something like the attached, on top of your
> patch.)

So, I agree we should separate the parameters used in the code from the
GUC variables -- since there are multiple users with different needs
(autovac workers, parallel vac workers, and vacuum). However, I was
hesitant to tackle that here.

I'm not sure how these changes will impact extensions that rely on
these vacuum parameters and their direct relationship to the guc values.

In your patch, you didn't update the parameter with the guc value of
vacuum_cost_limit and vacuum_cost_delay, but were we to do so, we would
need to make sure it was updated every time after a config reload. This
isn't hard to do in the current code, but I'm not sure how we can ensure
that future callers of ProcessConfigFile() in vacuum code always update
these values afterward. Perhaps we could add some after_reload hook?
Which does seem like a larger project.

> I have some comments on 0003 as-is.
>
> +               tab->at_relopt_vac_cost_limit = avopts ?
> +                       avopts->vacuum_cost_limit : 0;
> +               tab->at_relopt_vac_cost_delay = avopts ?
> +                       avopts->vacuum_cost_delay : -1;
>
> The value is not used when do_balance is false, so I don't see a
> specific reason for these variables to be different when avopts is
> null.

Actually we need to set these to 0 and -1, because we set
av_relopt_cost_limit and av_relopt_cost_delay with them and those values
are checked regardless of wi_dobalance.

We need to do this because we want to use the correct value to override
VacuumCostLimit and VacuumCostDelay. wi_dobalance may be false because
we have a table option cost delay but we have no table option cost
limit. When we override VacuumCostDelay, we want to use the table option
value but when we override VacuumCostLimit, we want to use the regular
value. We need these initialized to values that will allow us to do
that.

> +autovac_recalculate_workers_for_balance(void)
> +{
> +       dlist_iter      iter;
> +       int                     orig_nworkers_for_balance;
> +       int                     nworkers_for_balance = 0;
> +
> +       if (autovacuum_vac_cost_delay == 0 ||
> +               (autovacuum_vac_cost_delay == -1 && VacuumCostDelay == 0))
>                 return;
> +       if (autovacuum_vac_cost_limit <= 0 && VacuumCostLimit <= 0)
> +               return;
> +
>
> I'm not quite sure how these conditions relate to the need to count
> workers that shares the global I/O cost.

Ah, this is a good point, we should still keep this number up-to-date
even if the costs are disabled at the time we are checking it in case
cost-based delays are re-enabled later before we recalculate this
number.  I had this code originally because autovac_balance_cost() would
exit early if cost-based delays were disabled -- but this only worked
because they couldn't be re-enabled during vacuuming a table and
autovac_balance_cost() was called always in between vacuuming tables.

I've removed these lines.

And perhaps there is an argument for calling
autovac_recalculate_workers_for_balance() in vacuum_delay_point() after
reloading the config file...
I have not done so in attached version.

> (Though I still believe this funtion might not be necessary.)

I don't see how we can do without this function. We need an up-to-date
count of the number of autovacuum workers vacuuming tables which do not
have vacuum cost-related table options.


> +       if (av_relopt_cost_limit > 0)
> +               VacuumCostLimit = av_relopt_cost_limit;
> +       else
> +       {
> +               av_base_cost_limit = autovacuum_vac_cost_limit > 0 ?
> +                       autovacuum_vac_cost_limit : VacuumCostLimit;
> +
> +               AutoVacuumBalanceLimit();
>
> I think each worker should use MyWorkerInfo->wi_dobalance to identyify
> whether the worker needs to use balanced cost values.

Ah, there is a bug here. I have fixed it by making wi_dobalance an
atomic flag so that we can check it before calling
AutoVacuumBalanceLimit() (without taking a lock).

I don't see other (non-test code) callers using atomic flags, so I can't
tell if we need to loop to ensure that pg_atomic_test_set_flag() returns
true.

> +void
> +AutoVacuumBalanceLimit(void)
>
> I'm not sure this function needs to be a separate function.

We need to call it more often than we can call AutoVacuumUpdateLimit(),
so the logic needs to be separate. Are you suggesting we inline the
logic in the two places it is needed?

v11 attached with updates mentioned above.

- Melanie
From a1db301c122641acd297a05d29bcb32bc7b769e2 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Sat, 25 Mar 2023 14:14:55 -0400
Subject: [PATCH v11 3/3] Autovacuum refreshes cost-based delay params more
 often

The previous commit allowed VACUUM to reload the config file more often
so that cost-based delay parameters could take effect while VACUUMing a
relation. Autovacuum, however did not benefit from this change.

In order for autovacuum workers to safely update their own cost delay
and cost limit parameters without impacting performance, we had to
rethink when and how these values were accessed.

Previously, an autovacuum worker's wi_cost_limit was set only at the
beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() is called, workers
vacuuming tables with no table options could still have different values
for their wi_cost_limit_base and wi_cost_delay.

Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of table
options). This removes the rationale for keeping cost limit and cost
delay in shared memory. Balancing the cost limit requires only the
number of active autovacuum workers vacuuming a table with no cost-based
table options.

Reviewed-by: Kyotaro Horiguchi <horikyota....@gmail.com>
---
 src/backend/commands/vacuum.c       |  22 ++-
 src/backend/postmaster/autovacuum.c | 286 +++++++++++++++-------------
 src/include/postmaster/autovacuum.h |   2 +
 3 files changed, 175 insertions(+), 135 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7e3a8e404e..e9b683805a 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2241,10 +2241,10 @@ vacuum_delay_point(void)
 
 	/*
 	 * Reload the configuration file if requested. This allows changes to
-	 * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
-	 * being vacuumed or analyzed. Analyze should not reload configuration file
-	 * if it is in an outer transaction, as we currently only allow
-	 * configuration reload when in top-level statements.
+	 * [autovacuum_]vacuum_cost_limit and [autovacuum_]vacuum_cost_delay to
+	 * take effect while a table is being vacuumed or analyzed. Analyze should
+	 * not reload configuration file if it is in an outer transaction, as we
+	 * currently only allow configuration reload when in top-level statements.
 	 */
 	if (ConfigReloadPending && !analyze_in_outer_xact)
 	{
@@ -2257,10 +2257,12 @@ vacuum_delay_point(void)
 		 * by reload.
 		 */
 		AutoVacuumUpdateDelay();
+		AutoVacuumUpdateLimit();
 
 		/*
 		 * If configuration changes are allowed to impact VacuumCostInactive,
-		 * make sure it is updated.
+		 * make sure it is updated. Autovacuum workers will have already done
+		 * this in AutoVacuumUpdateDelay()
 		 */
 		if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
 			return;
@@ -2311,8 +2313,14 @@ vacuum_delay_point(void)
 
 		VacuumCostBalance = 0;
 
-		/* update balance values for workers */
-		AutoVacuumUpdateDelay();
+		/*
+		 * Balance and update limit values for autovacuum workers. We must
+		 * always do this in case the autovacuum launcher or another
+		 * autovacuum worker has recalculated the number of workers across
+		 * which we must balance the limit. This is done by the launcher when
+		 * launching a new worker and by workers before vacuuming each table.
+		 */
+		AutoVacuumBalanceLimit();
 
 		/* Might have gotten an interrupt while sleeping */
 		CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 585d28148c..9775764fc4 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -139,6 +139,10 @@ int			Log_autovacuum_min_duration = 600000;
 static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
+static double av_relopt_cost_delay = -1;
+static int	av_relopt_cost_limit = 0;
+static int	av_base_cost_limit = 0;
+
 /* Flags set by signal handlers */
 static volatile sig_atomic_t got_SIGUSR2 = false;
 
@@ -189,8 +193,8 @@ typedef struct autovac_table
 {
 	Oid			at_relid;
 	VacuumParams at_params;
-	double		at_vacuum_cost_delay;
-	int			at_vacuum_cost_limit;
+	double		at_relopt_vac_cost_delay;
+	int			at_relopt_vac_cost_limit;
 	bool		at_dobalance;
 	bool		at_sharedrel;
 	char	   *at_relname;
@@ -209,7 +213,7 @@ typedef struct autovac_table
  * wi_sharedrel flag indicating whether table is marked relisshared
  * wi_proc		pointer to PGPROC of the running worker, NULL if not started
  * wi_launchtime Time at which this worker was launched
- * wi_cost_*	Vacuum cost-based delay parameters current in this worker
+ * wi_dobalance Whether this worker should be included in balance calculations
  *
  * All fields are protected by AutovacuumLock, except for wi_tableoid and
  * wi_sharedrel which are protected by AutovacuumScheduleLock (note these
@@ -223,11 +227,8 @@ typedef struct WorkerInfoData
 	Oid			wi_tableoid;
 	PGPROC	   *wi_proc;
 	TimestampTz wi_launchtime;
-	bool		wi_dobalance;
+	pg_atomic_flag wi_dobalance;
 	bool		wi_sharedrel;
-	double		wi_cost_delay;
-	int			wi_cost_limit;
-	int			wi_cost_limit_base;
 } WorkerInfoData;
 
 typedef struct WorkerInfoData *WorkerInfo;
@@ -273,6 +274,8 @@ typedef struct AutoVacuumWorkItem
  * av_startingWorker pointer to WorkerInfo currently being started (cleared by
  *					the worker itself as soon as it's up and running)
  * av_workItems		work item array
+ * av_nworkersForBalance the number of autovacuum workers to use when
+ * 					calculating the per worker cost limit
  *
  * This struct is protected by AutovacuumLock, except for av_signal and parts
  * of the worker list (see above).
@@ -286,6 +289,7 @@ typedef struct
 	dlist_head	av_runningWorkers;
 	WorkerInfo	av_startingWorker;
 	AutoVacuumWorkItem av_workItems[NUM_WORKITEMS];
+	pg_atomic_uint32 av_nworkersForBalance;
 } AutoVacuumShmemStruct;
 
 static AutoVacuumShmemStruct *AutoVacuumShmem;
@@ -319,7 +323,7 @@ static void launch_worker(TimestampTz now);
 static List *get_database_list(void);
 static void rebuild_database_list(Oid newdb);
 static int	db_comparator(const void *a, const void *b);
-static void autovac_balance_cost(void);
+static void autovac_recalculate_workers_for_balance(void);
 
 static void do_autovacuum(void);
 static void FreeWorkerInfo(int code, Datum arg);
@@ -670,7 +674,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 			{
 				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
 				AutoVacuumShmem->av_signal[AutoVacRebalance] = false;
-				autovac_balance_cost();
+				autovac_recalculate_workers_for_balance();
 				LWLockRelease(AutovacuumLock);
 			}
 
@@ -820,8 +824,8 @@ HandleAutoVacLauncherInterrupts(void)
 			AutoVacLauncherShutdown();
 
 		/* rebalance in case the default cost parameters changed */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-		autovac_balance_cost();
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
 		LWLockRelease(AutovacuumLock);
 
 		/* rebuild the list in case the naptime changed */
@@ -1755,10 +1759,7 @@ FreeWorkerInfo(int code, Datum arg)
 		MyWorkerInfo->wi_sharedrel = false;
 		MyWorkerInfo->wi_proc = NULL;
 		MyWorkerInfo->wi_launchtime = 0;
-		MyWorkerInfo->wi_dobalance = false;
-		MyWorkerInfo->wi_cost_delay = 0;
-		MyWorkerInfo->wi_cost_limit = 0;
-		MyWorkerInfo->wi_cost_limit_base = 0;
+		pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 		dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 						&MyWorkerInfo->wi_links);
 		/* not mine anymore */
@@ -1773,100 +1774,140 @@ FreeWorkerInfo(int code, Datum arg)
 	}
 }
 
+
 /*
- * Update the cost-based delay parameters, so that multiple workers consume
- * each a fraction of the total available I/O.
+ * Update VacuumCostDelay with the correct value for an autovacuum worker,
+ * given the value of other relevant cost-based delay parameters. Autovacuum
+ * workers should call this after every config reload, in case VacuumCostDelay
+ * was overwritten.
  */
 void
 AutoVacuumUpdateDelay(void)
 {
-	if (MyWorkerInfo)
+	if (!am_autovacuum_worker)
+		return;
+
+	if (av_relopt_cost_delay >= 0)
+		VacuumCostDelay = av_relopt_cost_delay;
+	else if (autovacuum_vac_cost_delay >= 0)
+		VacuumCostDelay = autovacuum_vac_cost_delay;
+
+	/*
+	 * If configuration changes are allowed to impact VacuumCostInactive, make
+	 * sure it is updated.
+	 */
+	if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
+		return;
+
+	if (VacuumCostDelay > 0)
+		VacuumCostInactive = VACUUM_COST_ACTIVE;
+	else
 	{
-		VacuumCostDelay = MyWorkerInfo->wi_cost_delay;
-		VacuumCostLimit = MyWorkerInfo->wi_cost_limit;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
+		VacuumCostBalance = 0;
 	}
 }
 
+
 /*
- * autovac_balance_cost
- *		Recalculate the cost limit setting for each active worker.
- *
- * Caller must hold the AutovacuumLock in exclusive mode.
+ * This must be called directly after a config reload before using the value of
+ * VacuumCostLimit and before calling AutoVacuumBalanceLimit(), as it uses the
+ * value of VacuumCostLimit to determine what the base av_base_cost_limit
+ * should be. AutoVacuumBalanceLimit() will override the value of
+ * VacuumCostLimit, so calling it multiple times after a config reload is
+ * incorrect.
  */
-static void
-autovac_balance_cost(void)
+void
+AutoVacuumUpdateLimit(void)
 {
+	if (!am_autovacuum_worker)
+		return;
+
 	/*
-	 * The idea here is that we ration out I/O equally.  The amount of I/O
-	 * that a worker can consume is determined by cost_limit/cost_delay, so we
-	 * try to equalize those ratios rather than the raw limit settings.
-	 *
 	 * note: in cost_limit, zero also means use value from elsewhere, because
 	 * zero is not a valid value.
 	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								  autovacuum_vac_cost_limit : VacuumCostLimit);
-	double		vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
-								  autovacuum_vac_cost_delay : VacuumCostDelay);
-	double		cost_total;
-	double		cost_avail;
-	dlist_iter	iter;
-
-	/* not set? nothing to do */
-	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
-		return;
-
-	/* calculate the total base cost limit of participating active workers */
-	cost_total = 0.0;
-	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
+	if (av_relopt_cost_limit > 0)
+		VacuumCostLimit = av_relopt_cost_limit;
+	else
 	{
-		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
+		av_base_cost_limit = autovacuum_vac_cost_limit > 0 ?
+			autovacuum_vac_cost_limit : VacuumCostLimit;
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-			cost_total +=
-				(double) worker->wi_cost_limit_base / worker->wi_cost_delay;
+		AutoVacuumBalanceLimit();
 	}
+}
 
-	/* there are no cost limits -- nothing to do */
-	if (cost_total <= 0)
+/*
+ * Update VacuumCostLimit with the correct value for an autovacuum worker,
+ * given the value of other relevant cost limit parameters and the number of
+ * workers across which the limit must be balanced. Autovacuum workers must
+ * call this regularly in case av_nworkers_for_balance has been updated by
+ * another worker or by the autovacuum launcher. After a config reload, they
+ * must call AutoVacuumUpdateLimit() which will call AutoVacuumBalanceLimit(),
+ * in case VacuumCostLimit was overwritten.
+ */
+void
+AutoVacuumBalanceLimit(void)
+{
+	int			nworkers_for_balance;
+	int			total_cost_limit;
+	int			balanced_cost_limit;
+
+	if (!MyWorkerInfo)
 		return;
 
-	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
-	 */
-	cost_avail = (double) vac_cost_limit / vac_cost_delay;
+	if (pg_atomic_unlocked_test_flag(&MyWorkerInfo->wi_dobalance))
+		return;
+
+	Assert(av_base_cost_limit > 0);
+
+	nworkers_for_balance = pg_atomic_read_u32(
+							&AutoVacuumShmem->av_nworkersForBalance);
+
+	/* There is at least 1 autovac worker (this worker). */
+	Assert(nworkers_for_balance > 0);
+
+	total_cost_limit = autovacuum_vac_cost_limit > 0 ?
+		autovacuum_vac_cost_limit : av_base_cost_limit;
+
+	balanced_cost_limit = total_cost_limit / nworkers_for_balance;
+
+	VacuumCostLimit = Max(Min(balanced_cost_limit, total_cost_limit), 1);
+}
+
+/*
+ * autovac_recalculate_workers_for_balance
+ *		Recalculate the number of workers to consider, given table options and
+ *		the current number of active workers.
+ *
+ * Caller must hold the AutovacuumLock in at least shared mode to access
+ * worker->wi_proc.
+ */
+static void
+autovac_recalculate_workers_for_balance(void)
+{
+	dlist_iter	iter;
+	int			orig_nworkers_for_balance;
+	int			nworkers_for_balance = 0;
+
+	orig_nworkers_for_balance =
+		pg_atomic_read_u32(&AutoVacuumShmem->av_nworkersForBalance);
+
 	dlist_foreach(iter, &AutoVacuumShmem->av_runningWorkers)
 	{
 		WorkerInfo	worker = dlist_container(WorkerInfoData, wi_links, iter.cur);
 
-		if (worker->wi_proc != NULL &&
-			worker->wi_dobalance &&
-			worker->wi_cost_limit_base > 0 && worker->wi_cost_delay > 0)
-		{
-			int			limit = (int)
-			(cost_avail * worker->wi_cost_limit_base / cost_total);
-
-			/*
-			 * We put a lower bound of 1 on the cost_limit, to avoid division-
-			 * by-zero in the vacuum code.  Also, in case of roundoff trouble
-			 * in these calculations, let's be sure we don't ever set
-			 * cost_limit to more than the base value.
-			 */
-			worker->wi_cost_limit = Max(Min(limit,
-											worker->wi_cost_limit_base),
-										1);
-		}
+		if (worker->wi_proc == NULL ||
+			pg_atomic_unlocked_test_flag(&worker->wi_dobalance))
+			continue;
 
-		if (worker->wi_proc != NULL)
-			elog(DEBUG2, "autovac_balance_cost(pid=%d db=%u, rel=%u, dobalance=%s cost_limit=%d, cost_limit_base=%d, cost_delay=%g)",
-				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_dobalance ? "yes" : "no",
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
-				 worker->wi_cost_delay);
+		nworkers_for_balance++;
 	}
+
+	if (nworkers_for_balance != orig_nworkers_for_balance)
+		pg_atomic_write_u32(&AutoVacuumShmem->av_nworkersForBalance,
+							nworkers_for_balance);
 }
 
 /*
@@ -2312,8 +2353,6 @@ do_autovacuum(void)
 		autovac_table *tab;
 		bool		isshared;
 		bool		skipit;
-		double		stdVacuumCostDelay;
-		int			stdVacuumCostLimit;
 		dlist_iter	iter;
 
 		CHECK_FOR_INTERRUPTS();
@@ -2417,30 +2456,29 @@ do_autovacuum(void)
 		}
 
 		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.  We have to
-		 * restore these at the bottom of the loop, else we'll compute wrong
-		 * values in the next iteration of autovac_balance_cost().
+		 * Save the cost-related table options in global variables for
+		 * reference when updating VacuumCostLimit and VacuumCostDelay during
+		 * vacuuming this table.
 		 */
-		stdVacuumCostDelay = VacuumCostDelay;
-		stdVacuumCostLimit = VacuumCostLimit;
+		av_relopt_cost_limit = tab->at_relopt_vac_cost_limit;
+		av_relopt_cost_delay = tab->at_relopt_vac_cost_delay;
 
-		/* Must hold AutovacuumLock while mucking with cost balance info */
-		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
-
-		/* advertise my cost delay parameters for the balancing algorithm */
-		MyWorkerInfo->wi_dobalance = tab->at_dobalance;
-		MyWorkerInfo->wi_cost_delay = tab->at_vacuum_cost_delay;
-		MyWorkerInfo->wi_cost_limit = tab->at_vacuum_cost_limit;
-		MyWorkerInfo->wi_cost_limit_base = tab->at_vacuum_cost_limit;
+		if (tab->at_dobalance)
+			pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
+		else
+			pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance);
 
-		/* do a balance */
-		autovac_balance_cost();
+		LWLockAcquire(AutovacuumLock, LW_SHARED);
+		autovac_recalculate_workers_for_balance();
+		LWLockRelease(AutovacuumLock);
 
-		/* set the active cost parameters from the result of that */
+		/*
+		 * We wait until this point to update cost delay and cost limit
+		 * values, even though we reloaded the configuration file above, so
+		 * that we can take into account the cost-related table options.
+		 */
 		AutoVacuumUpdateDelay();
-
-		/* done */
-		LWLockRelease(AutovacuumLock);
+		AutoVacuumUpdateLimit();
 
 		/* clean up memory before each iteration */
 		MemoryContextResetAndDeleteChildren(PortalContext);
@@ -2525,19 +2563,15 @@ deleted:
 
 		/*
 		 * Remove my info from shared memory.  We could, but intentionally
-		 * don't, clear wi_cost_limit and friends --- this is on the
-		 * assumption that we probably have more to do with similar cost
-		 * settings, so we don't want to give up our share of I/O for a very
-		 * short interval and thereby thrash the global balance.
+		 * don't, unset wi_dobalance on the assumption that we are more likely
+		 * than not to vacuum a table with no table options next, so we don't
+		 * want to give up our share of I/O for a very short interval and
+		 * thereby thrash the global balance.
 		 */
 		LWLockAcquire(AutovacuumScheduleLock, LW_EXCLUSIVE);
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		MyWorkerInfo->wi_sharedrel = false;
 		LWLockRelease(AutovacuumScheduleLock);
-
-		/* restore vacuum cost GUCs for the next iteration */
-		VacuumCostDelay = stdVacuumCostDelay;
-		VacuumCostLimit = stdVacuumCostLimit;
 	}
 
 	/*
@@ -2569,6 +2603,8 @@ deleted:
 		{
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			AutoVacuumUpdateDelay();
+			AutoVacuumUpdateLimit();
 		}
 
 		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
@@ -2804,8 +2840,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		int			freeze_table_age;
 		int			multixact_freeze_min_age;
 		int			multixact_freeze_table_age;
-		int			vac_cost_limit;
-		double		vac_cost_delay;
 		int			log_min_duration;
 
 		/*
@@ -2815,20 +2849,6 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		 * defaults, autovacuum's own first and plain vacuum second.
 		 */
 
-		/* -1 in autovac setting means use plain vacuum_cost_delay */
-		vac_cost_delay = (avopts && avopts->vacuum_cost_delay >= 0)
-			? avopts->vacuum_cost_delay
-			: (autovacuum_vac_cost_delay >= 0)
-			? autovacuum_vac_cost_delay
-			: VacuumCostDelay;
-
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
-
 		/* -1 in autovac setting means use log_autovacuum_min_duration */
 		log_min_duration = (avopts && avopts->log_min_duration >= 0)
 			? avopts->log_min_duration
@@ -2884,8 +2904,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 		tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
 		tab->at_params.is_wraparound = wraparound;
 		tab->at_params.log_min_duration = log_min_duration;
-		tab->at_vacuum_cost_limit = vac_cost_limit;
-		tab->at_vacuum_cost_delay = vac_cost_delay;
+		tab->at_relopt_vac_cost_limit = avopts ?
+			avopts->vacuum_cost_limit : 0;
+		tab->at_relopt_vac_cost_delay = avopts ?
+			avopts->vacuum_cost_delay : -1;
 		tab->at_relname = NULL;
 		tab->at_nspname = NULL;
 		tab->at_datname = NULL;
@@ -3377,10 +3399,18 @@ AutoVacuumShmemInit(void)
 		worker = (WorkerInfo) ((char *) AutoVacuumShmem +
 							   MAXALIGN(sizeof(AutoVacuumShmemStruct)));
 
+
 		/* initialize the WorkerInfo free list */
 		for (i = 0; i < autovacuum_max_workers; i++)
+		{
 			dlist_push_head(&AutoVacuumShmem->av_freeWorkers,
 							&worker[i].wi_links);
+
+			pg_atomic_init_flag(&worker[i].wi_dobalance);
+		}
+
+		pg_atomic_init_u32(&AutoVacuumShmem->av_nworkersForBalance, 0);
+
 	}
 	else
 		Assert(found);
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index c140371b51..80bdfb2cc0 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -64,7 +64,9 @@ extern int	StartAutoVacWorker(void);
 extern void AutoVacWorkerFailed(void);
 
 /* autovacuum cost-delay balancer */
+extern void AutoVacuumBalanceLimit(void);
 extern void AutoVacuumUpdateDelay(void);
+extern void AutoVacuumUpdateLimit(void);
 
 #ifdef EXEC_BACKEND
 extern void AutoVacLauncherMain(int argc, char *argv[]) pg_attribute_noreturn();
-- 
2.37.2

From a46f13517d165479ec518b46fe98591aa75f35d6 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Sat, 25 Mar 2023 12:05:18 -0400
Subject: [PATCH v11 1/3] Make VacuumCostActive failsafe-aware

While vacuuming a table in failsafe mode, VacuumCostActive should not be
re-enabled. This currently isn't a problem because vacuum cost
parameters are only refreshed in between vacuuming tables and failsafe
status is reset for every table. In preparation for allowing vacuum cost
parameters to be updated more frequently, make vacuum cost status more
expressive.

VacuumCostActive is now VacuumCostInactive, as it can only be active in
one way but it can be inactive in two ways. If performing a failsafe
vacuum, the vacuum cost status cannot be enabled and is effectively
"locked". If performing a non-failsafe vacuum, the vacuum cost status
may be active or inactive. To express this, VacuumCostInactive can be
one of three statuses: VACUUM_COST_INACTIVE_AND_LOCKED,
VACUUM_COST_ACTIVE_AND_LOCKED, and VACUUM_COST_ACTIVE.

VacuumCostInactive is defined as an integer because we do not want
non-vacuum code concerning itself with the distinction between the three
statuses -- only with whether or not VacuumCostInactive == 0 or not.
---
 src/backend/access/heap/vacuumlazy.c  |  2 +-
 src/backend/commands/vacuum.c         | 24 +++++++++++++++++++++---
 src/backend/commands/vacuumparallel.c |  8 ++++++--
 src/backend/storage/buffer/bufmgr.c   |  8 ++++----
 src/backend/utils/init/globals.c      |  2 +-
 src/include/commands/vacuum.h         |  8 ++++++++
 src/include/miscadmin.h               |  3 +--
 7 files changed, 42 insertions(+), 13 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8f14cf85f3..040a4e931b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2637,7 +2637,7 @@ lazy_check_wraparound_failsafe(LVRelState *vacrel)
 						 "You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.")));
 
 		/* Stop applying cost limits from this point on */
-		VacuumCostActive = false;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_LOCKED;
 		VacuumCostBalance = 0;
 
 		return true;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c54360a6a0..eb126f2247 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -491,7 +491,6 @@ vacuum(List *relations, VacuumParams *params,
 		ListCell   *cur;
 
 		in_vacuum = true;
-		VacuumCostActive = (VacuumCostDelay > 0);
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
@@ -507,6 +506,24 @@ vacuum(List *relations, VacuumParams *params,
 		{
 			VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
 
+			/*
+			 * failsafe_active is reset per relation, so we must be sure that
+			 * VacuumCostInactive is set to either VACUUM_COST_INACTIVE or
+			 * VACUUM_COST_INACTIVE_AND_UNLOCKED in between vacuuming
+			 * relations.
+			 */
+			VacuumCostInactive = VacuumCostDelay > 0 ? VACUUM_COST_ACTIVE :
+				VACUUM_COST_INACTIVE_AND_UNLOCKED;
+
+			/*
+			 * We should not have transitioned VacuumCostInactive from
+			 * VACUUM_COST_ACTIVE to VACUUM_COST_INACTIVE_AND_UNLOCKED above,
+			 * as that should have happened when we changed the value of
+			 * VacuumCostDelay.
+			 */
+			Assert(VacuumCostInactive == VACUUM_COST_ACTIVE ||
+				   VacuumCostBalance == 0);
+
 			if (params->options & VACOPT_VACUUM)
 			{
 				if (!vacuum_rel(vrel->oid, vrel->relation, params, false))
@@ -549,7 +566,8 @@ vacuum(List *relations, VacuumParams *params,
 	PG_FINALLY();
 	{
 		in_vacuum = false;
-		VacuumCostActive = false;
+		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
+		VacuumCostBalance = 0;
 	}
 	PG_END_TRY();
 
@@ -2215,7 +2233,7 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (!VacuumCostActive || InterruptPending)
+	if (VacuumCostInactive || InterruptPending)
 		return;
 
 	/*
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index bcd40c80a1..266bf6bb4c 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -989,8 +989,12 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 												 PARALLEL_VACUUM_KEY_DEAD_ITEMS,
 												 false);
 
-	/* Set cost-based vacuum delay */
-	VacuumCostActive = (VacuumCostDelay > 0);
+	/*
+	 * Set cost-based vacuum delay Parallel vacuum workers will not execute
+	 * failsafe VACUUM.
+	 */
+	VacuumCostInactive = VacuumCostDelay > 0 ? VACUUM_COST_ACTIVE :
+		VACUUM_COST_INACTIVE_AND_UNLOCKED;
 	VacuumCostBalance = 0;
 	VacuumPageHit = 0;
 	VacuumPageMiss = 0;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index fe029d2ea6..495ee8f815 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -893,7 +893,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 			*hit = true;
 			VacuumPageHit++;
 
-			if (VacuumCostActive)
+			if (!VacuumCostInactive)
 				VacuumCostBalance += VacuumCostPageHit;
 
 			TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
@@ -1098,7 +1098,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 	}
 
 	VacuumPageMiss++;
-	if (VacuumCostActive)
+	if (!VacuumCostInactive)
 		VacuumCostBalance += VacuumCostPageMiss;
 
 	TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
@@ -1672,7 +1672,7 @@ MarkBufferDirty(Buffer buffer)
 	{
 		VacuumPageDirty++;
 		pgBufferUsage.shared_blks_dirtied++;
-		if (VacuumCostActive)
+		if (!VacuumCostInactive)
 			VacuumCostBalance += VacuumCostPageDirty;
 	}
 }
@@ -4189,7 +4189,7 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
 		{
 			VacuumPageDirty++;
 			pgBufferUsage.shared_blks_dirtied++;
-			if (VacuumCostActive)
+			if (!VacuumCostInactive)
 				VacuumCostBalance += VacuumCostPageDirty;
 		}
 	}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1b1d814254..608ebb9182 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -150,4 +150,4 @@ int64		VacuumPageMiss = 0;
 int64		VacuumPageDirty = 0;
 
 int			VacuumCostBalance = 0;	/* working state for vacuum */
-bool		VacuumCostActive = false;
+int			VacuumCostInactive = 1;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index bdfd96cfec..5c3e250b06 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -302,6 +302,14 @@ extern PGDLLIMPORT int vacuum_failsafe_age;
 extern PGDLLIMPORT int vacuum_multixact_failsafe_age;
 
 /* Variables for cost-based parallel vacuum */
+
+typedef enum VacuumCostStatus
+{
+	VACUUM_COST_INACTIVE_AND_LOCKED = -1,
+	VACUUM_COST_ACTIVE = 0,
+	VACUUM_COST_INACTIVE_AND_UNLOCKED = 1,
+}			VacuumCostStatus;
+
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 06a86f9ac1..33e22733ae 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -274,8 +274,7 @@ extern PGDLLIMPORT int64 VacuumPageMiss;
 extern PGDLLIMPORT int64 VacuumPageDirty;
 
 extern PGDLLIMPORT int VacuumCostBalance;
-extern PGDLLIMPORT bool VacuumCostActive;
-
+extern PGDLLIMPORT int VacuumCostInactive;
 
 /* in tcop/postgres.c */
 
-- 
2.37.2

From 3f4a39ff366f491ecdf2361ae11704bc98175a30 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Mon, 27 Mar 2023 13:33:19 -0400
Subject: [PATCH v11 2/3] VACUUM reloads config file more often

Previously, VACUUM would not reload the configuration file. So, changes
to cost-based delay parameters could only take effect on the next
invocation of VACUUM.

Check if a reload is pending roughly once per block now, when checking
if we need to delay.

Note that autovacuum is unaffected by this change. Autovacuum workers
overwrite the value of VacuumCostLimit and VacuumCostDelay with their
own WorkerInfo->wi_cost_limit and wi_cost_delay. Writing to their
wi_cost_delay more often makes reading wi_cost_delay without a lock to
update VacuumCostDelay an even worse idea.

Reviewed-by: Masahiko Sawada <sawada.m...@gmail.com>
Reviewed-by: Daniel Gustafsson <dan...@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_buP5wzsho3qNw5o9_R0pF69FRM5hgCmr-mvXmGXwdA7A%40mail.gmail.com#5e6771d4cdca4db6efc2acec2dce0bc7
---
 src/backend/commands/vacuum.c | 61 ++++++++++++++++++++++++++++++-----
 1 file changed, 53 insertions(+), 8 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index eb126f2247..7e3a8e404e 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "pgstat.h"
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
+#include "postmaster/interrupt.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -76,6 +77,7 @@ int			vacuum_multixact_failsafe_age;
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
+static bool analyze_in_outer_xact = false;
 
 
 /*
@@ -314,8 +316,7 @@ vacuum(List *relations, VacuumParams *params,
 	static bool in_vacuum = false;
 
 	const char *stmttype;
-	volatile bool in_outer_xact,
-				use_own_xacts;
+	volatile bool use_own_xacts;
 
 	Assert(params != NULL);
 
@@ -332,10 +333,10 @@ vacuum(List *relations, VacuumParams *params,
 	if (params->options & VACOPT_VACUUM)
 	{
 		PreventInTransactionBlock(isTopLevel, stmttype);
-		in_outer_xact = false;
+		analyze_in_outer_xact = false;
 	}
 	else
-		in_outer_xact = IsInTransactionBlock(isTopLevel);
+		analyze_in_outer_xact = IsInTransactionBlock(isTopLevel);
 
 	/*
 	 * Due to static variables vac_context, anl_context and vac_strategy,
@@ -457,7 +458,7 @@ vacuum(List *relations, VacuumParams *params,
 		Assert(params->options & VACOPT_ANALYZE);
 		if (IsAutoVacuumWorkerProcess())
 			use_own_xacts = true;
-		else if (in_outer_xact)
+		else if (analyze_in_outer_xact)
 			use_own_xacts = false;
 		else if (list_length(relations) > 1)
 			use_own_xacts = true;
@@ -475,7 +476,7 @@ vacuum(List *relations, VacuumParams *params,
 	 */
 	if (use_own_xacts)
 	{
-		Assert(!in_outer_xact);
+		Assert(!analyze_in_outer_xact);
 
 		/* ActiveSnapshot is not set by autovacuum */
 		if (ActiveSnapshotSet())
@@ -544,7 +545,7 @@ vacuum(List *relations, VacuumParams *params,
 				}
 
 				analyze_rel(vrel->oid, vrel->relation, params,
-							vrel->va_cols, in_outer_xact, vac_strategy);
+							vrel->va_cols, analyze_in_outer_xact, vac_strategy);
 
 				if (use_own_xacts)
 				{
@@ -568,6 +569,7 @@ vacuum(List *relations, VacuumParams *params,
 		in_vacuum = false;
 		VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
 		VacuumCostBalance = 0;
+		analyze_in_outer_xact = false;
 	}
 	PG_END_TRY();
 
@@ -2233,7 +2235,50 @@ vacuum_delay_point(void)
 	/* Always check for interrupts */
 	CHECK_FOR_INTERRUPTS();
 
-	if (VacuumCostInactive || InterruptPending)
+	if (InterruptPending ||
+		(VacuumCostInactive && !ConfigReloadPending))
+		return;
+
+	/*
+	 * Reload the configuration file if requested. This allows changes to
+	 * vacuum_cost_limit and vacuum_cost_delay to take effect while a table is
+	 * being vacuumed or analyzed. Analyze should not reload configuration file
+	 * if it is in an outer transaction, as we currently only allow
+	 * configuration reload when in top-level statements.
+	 */
+	if (ConfigReloadPending && !analyze_in_outer_xact)
+	{
+		ConfigReloadPending = false;
+		ProcessConfigFile(PGC_SIGHUP);
+
+		/*
+		 * Autovacuum workers must restore the correct values of
+		 * VacuumCostLimit and VacuumCostDelay in case they were overwritten
+		 * by reload.
+		 */
+		AutoVacuumUpdateDelay();
+
+		/*
+		 * If configuration changes are allowed to impact VacuumCostInactive,
+		 * make sure it is updated.
+		 */
+		if (VacuumCostInactive == VACUUM_COST_INACTIVE_AND_LOCKED)
+			return;
+
+		if (VacuumCostDelay > 0)
+			VacuumCostInactive = VACUUM_COST_ACTIVE;
+		else
+		{
+			VacuumCostInactive = VACUUM_COST_INACTIVE_AND_UNLOCKED;
+			VacuumCostBalance = 0;
+		}
+	}
+
+	/*
+	 * If we disabled cost-based delays after reloading the config file,
+	 * return.
+	 */
+	if (VacuumCostInactive)
 		return;
 
 	/*
-- 
2.37.2

Reply via email to