I attach a patch to fix some issues in the autovac process handling
code.  Mainly, instead of rechecking periodically in the launcher
whether a worker was able to start or not, what we do is signal it from
the postmaster when it fails.

In order to do this I had to make the postmaster set a flag in shared
memory, much like the pmsignal.c code does.  According to previous
discussions this is an acceptable thing for postmaster to do.

I must admit I did not try to fill shared memory with garbage and see if
it caused a postmaster crash.  What I did test was creating a database,
making sure that the launcher was picking it up, and then delete the
database's directory.  This correctly causes a worker to log errors when
trying to connect, and it is handled well -- i.e. other databases are
not starved.  (One remaining problem I see is that a database in this
state, or similar, that fails to be vacuumed, *will* cause starvation if
in danger of Xid wraparound).

Another thing I noticed is that I can change the "autovacuum" parameter
in postgresql.conf and send SIGHUP to postmaster, which correctly
starts the launcher if it was previously disabled.  To make the shutdown
case act accordingly, I made the launcher check the "start daemon" flag
after rereading the config file, and quit if it's off.  I tried multiple
combinations of having it set and unset, changing
stats_collect_tuple_level and it works fine.

One problem with the patch is this (new code):

    bn = (Backend *) malloc(sizeof(Backend));
!   if (bn)
    {
!       bn->pid = StartAutoVacWorker();
!       bn->is_autovacuum = true;
!       /* we don't need a cancel key */
  
!       if (bn->pid > 0)
!       {
!           /* FIXME -- unchecked memory allocation here */
!           DLAddHead(BackendList, DLNewElem(bn));


If the palloc() inside DLNewElem fails, we will fail to report a "fork
failure" to the launcher.  I am not sure how serious this is.  One idea
that came to mind was using a PG_TRY block, sending the signal in the
CATCH block, and then rethrowing the exception.  Is this acceptable?


All in all, this patch increases the reliability of autovacuum, so I
intend to apply it shortly unless there are objections.  Further
improvements might come later as I become aware of possible fixes.

-- 
Alvaro Herrera                 http://www.amazon.com/gp/registry/DXLWNGRJD34J
"Hay quien adquiere la mala costumbre de ser infeliz" (M. A. Evans)
Index: src/backend/postmaster/autovacuum.c
===================================================================
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/postmaster/autovacuum.c,v
retrieving revision 1.50
diff -c -p -r1.50 autovacuum.c
*** src/backend/postmaster/autovacuum.c	13 Jun 2007 21:24:55 -0000	1.50
--- src/backend/postmaster/autovacuum.c	19 Jun 2007 17:30:00 -0000
***************
*** 4,16 ****
   *
   * PostgreSQL Integrated Autovacuum Daemon
   *
   *
   * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
   *
   * IDENTIFICATION
!  *	  $PostgreSQL: pgsql/src/backend/postmaster/autovacuum.c,v 1.50 2007-06-13 21:24:55 alvherre Exp $
   *
   *-------------------------------------------------------------------------
   */
--- 4,52 ----
   *
   * PostgreSQL Integrated Autovacuum Daemon
   *
+  * The autovacuum system is structured in two different kinds of processes: the
+  * autovacuum launcher and the autovacuum worker.  The launcher is an
+  * always-running process, started by the postmaster when the autovacuum GUC
+  * parameter is set.  The launcher schedules autovacuum workers to be started
+  * when appropriate.  The workers are the processes which execute the actual
+  * vacuuming; they connect to a database as determined in the launcher, and
+  * once connected they examine the catalogs to select the tables to vacuum.
+  *
+  * The autovacuum launcher cannot start the worker processes by itself,
+  * because doing so would cause robustness issues (namely, failure to shut
+  * them down on exceptional conditions; and apart from that, since the launcher
+  * is connected to shared memory and is thus subject to corruption there, it is
+  * not as robust as the postmaster which is not).  So it leaves that task to
+  * the postmaster.
+  *
+  * There is an autovacuum shared memory area, where the launcher stores
+  * information about the database it wants vacuumed.  When it wants a new
+  * worker to start, it sets a flag in shared memory and sends a special signal
+  * to the postmaster.  Then postmaster knows nothing more than it must start a
+  * worker; so it forks a new child, which turns into a worker and examines
+  * shared memory to know what database it must connect to.
+  *
+  * If the fork() call fails, the postmaster sets a flag in the shared memory
+  * area, and it sends a signal to the launcher.  The launcher, upon noticing
+  * the flag, can try starting the worker again.  Note that the failure can only
+  * be transient (fork failure due to high load, memory pressure, too many
+  * processes, etc); more permanent problems, like failure to connect to a
+  * database, are detected later in the worker and dealt with in a different
+  * way.
+  *
+  * When the worker is done vacuuming it sends SIGUSR1 to the launcher.  The
+  * launcher then wakes up and is able to launch another worker, if the schedule
+  * is too tight.  Otherwise it will go back to sleeping and awake normally
+  * according to schedule.  At this time the launcher can also balance the
+  * settings for the various remaining workers' cost-based vacuum delay feature.
+  *
   *
   * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
   *
   * IDENTIFICATION
!  * $PostgreSQL: pgsql/src/backend/postmaster/autovacuum.c,v 1.50 2007-06-13 21:24:55 alvherre Exp $
   *
   *-------------------------------------------------------------------------
   */
*************** typedef struct WorkerInfoData
*** 158,185 ****
  
  typedef struct WorkerInfoData *WorkerInfo;
  
  /*-------------
   * The main autovacuum shmem struct.  On shared memory we store this main
   * struct and the array of WorkerInfo structs.  This struct keeps:
   *
   * av_launcherpid	the PID of the autovacuum launcher
   * av_freeWorkers	the WorkerInfo freelist
   * av_runningWorkers the WorkerInfo non-free queue
   * av_startingWorker pointer to WorkerInfo currently being started (cleared by
   *					the worker itself as soon as it's up and running)
-  * av_rebalance		true when a worker determines that cost limits must be
-  * 					rebalanced
   *
!  * This struct is protected by AutovacuumLock.
   *-------------
   */
  typedef struct
  {
  	pid_t			av_launcherpid;
  	SHMEM_OFFSET	av_freeWorkers;
  	SHM_QUEUE		av_runningWorkers;
  	SHMEM_OFFSET	av_startingWorker;
- 	bool			av_rebalance;
  } AutoVacuumShmemStruct;
  
  static AutoVacuumShmemStruct *AutoVacuumShmem;
--- 194,233 ----
  
  typedef struct WorkerInfoData *WorkerInfo;
  
+ /*
+  * Possible signals received by the launcher from remote processes.  These are
+  * stored atomically in shared memory so that other processes can set them
+  * without locking.
+  */
+ typedef enum 
+ {
+ 	AutoVacForkFailed,	/* failed trying to start a worker */
+ 	AutoVacRebalance,	/* rebalance the cost limits */
+ 	AutoVacNumSignals = AutoVacRebalance	/* must be last */
+ } AutoVacuumSignal;
+ 
  /*-------------
   * The main autovacuum shmem struct.  On shared memory we store this main
   * struct and the array of WorkerInfo structs.  This struct keeps:
   *
+  * av_signal		set by other processes to indicate various conditions
   * av_launcherpid	the PID of the autovacuum launcher
   * av_freeWorkers	the WorkerInfo freelist
   * av_runningWorkers the WorkerInfo non-free queue
   * av_startingWorker pointer to WorkerInfo currently being started (cleared by
   *					the worker itself as soon as it's up and running)
   *
!  * This struct is protected by AutovacuumLock, except for av_signal and parts
!  * of the worker list (see above).
   *-------------
   */
  typedef struct
  {
+ 	sig_atomic_t	av_signal[AutoVacNumSignals];
  	pid_t			av_launcherpid;
  	SHMEM_OFFSET	av_freeWorkers;
  	SHM_QUEUE		av_runningWorkers;
  	SHMEM_OFFSET	av_startingWorker;
  } AutoVacuumShmemStruct;
  
  static AutoVacuumShmemStruct *AutoVacuumShmem;
*************** StartAutoVacLauncher(void)
*** 316,344 ****
  
  /*
   * Main loop for the autovacuum launcher process.
-  *
-  * The signalling between launcher and worker is as follows:
-  *
-  * When the worker has finished starting up, it stores its PID in wi_workerpid
-  * and sends a SIGUSR1 signal to the launcher.  The launcher then knows that
-  * the postmaster is ready to start a new worker.  We do it this way because
-  * otherwise we risk calling SendPostmasterSignal() when the postmaster hasn't
-  * yet processed the last one, in which case the second signal would be lost.
-  * This is only useful when two workers need to be started close to one
-  * another, which should be rare but it's possible.
-  *
-  * When a worker exits, it resets the WorkerInfo struct and puts it back into
-  * the free list.  If there is no free worker slot, it will also signal the
-  * launcher, which then wakes up and can launch a new worker if it needs to.
-  * Note that we only need to do it when there's no free worker slot, because
-  * otherwise there is no need -- the launcher would be awakened normally per
-  * schedule.
-  *
-  * There is a potential problem if, for some reason, a worker starts and is not
-  * able to bootstrap itself correctly.  To prevent this situation from starving
-  * the whole system, the launcher checks the launch time of the "starting
-  * worker".  If it's too old (older than autovacuum_naptime seconds), it resets
-  * the worker entry and puts it back into the free list.
   */
  NON_EXEC_STATIC void
  AutoVacLauncherMain(int argc, char *argv[])
--- 364,369 ----
*************** AutoVacLauncherMain(int argc, char *argv
*** 547,552 ****
--- 572,581 ----
  			got_SIGHUP = false;
  			ProcessConfigFile(PGC_SIGHUP);
  
+ 			/* shutdown requested in config file */
+ 			if (!autovacuum_start_daemon)
+ 				break;
+ 
  			/* rebalance in case the default cost parameters changed */
  			LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
  			autovac_balance_cost();
*************** AutoVacLauncherMain(int argc, char *argv
*** 556,640 ****
  			rebuild_database_list(InvalidOid);
  		}
  
! 		/* a worker started up or finished */
  		if (got_SIGUSR1)
  		{
  			got_SIGUSR1 = false;
  
  			/* rebalance cost limits, if needed */
! 			if (AutoVacuumShmem->av_rebalance)
  			{
  				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
! 				AutoVacuumShmem->av_rebalance = false;
  				autovac_balance_cost();
  				LWLockRelease(AutovacuumLock);
  			}
- 		}
- 
- 		/*
- 		 * There are some conditions that we need to check before trying to
- 		 * start a launcher.  First, we need to make sure that there is a
- 		 * launcher slot available.  Second, we need to make sure that no other
- 		 * worker is still starting up.
- 		 */
- 
- 		LWLockAcquire(AutovacuumLock, LW_SHARED);
- 
- 		can_launch = (AutoVacuumShmem->av_freeWorkers != INVALID_OFFSET);
- 
- 		if (can_launch && AutoVacuumShmem->av_startingWorker != INVALID_OFFSET)
- 		{
- 			WorkerInfo worker = (WorkerInfo) MAKE_PTR(AutoVacuumShmem->av_startingWorker);
  
! 			if (current_time == 0)
! 				current_time = GetCurrentTimestamp();
! 
! 			/*
! 			 * We can't launch another worker when another one is still
! 			 * starting up, so just sleep for a bit more; that worker will wake
! 			 * us up again as soon as it's ready.  We will only wait
! 			 * autovacuum_naptime seconds for this to happen however.  Note
! 			 * that failure to connect to a particular database is not a
! 			 * problem here, because the worker removes itself from the
! 			 * startingWorker pointer before trying to connect; only low-level
! 			 * problems, like fork() failure, can get us here.
! 			 */
! 			if (TimestampDifferenceExceeds(worker->wi_launchtime, current_time,
! 										   autovacuum_naptime * 1000))
! 			{
! 				LWLockRelease(AutovacuumLock);
! 				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
! 				/*
! 				 * No other process can put a worker in starting mode, so if
! 				 * startingWorker is still INVALID after exchanging our lock,
! 				 * we assume it's the same one we saw above (so we don't
! 				 * recheck the launch time).
! 				 */
! 				if (AutoVacuumShmem->av_startingWorker != INVALID_OFFSET)
! 				{
! 					worker = (WorkerInfo) MAKE_PTR(AutoVacuumShmem->av_startingWorker);
! 					worker->wi_dboid = InvalidOid;
! 					worker->wi_tableoid = InvalidOid;
! 					worker->wi_workerpid = 0;
! 					worker->wi_launchtime = 0;
! 					worker->wi_links.next = AutoVacuumShmem->av_freeWorkers;
! 					AutoVacuumShmem->av_freeWorkers = MAKE_OFFSET(worker);
! 					AutoVacuumShmem->av_startingWorker = INVALID_OFFSET;
! 				}
! 			}
! 			else
  			{
  				/*
! 				 * maybe the postmaster neglected this start signal --
! 				 * resend it.  Note: the constraints in
! 				 * launcher_determine_sleep keep us from delivering signals too
! 				 * quickly (at most once every 100ms).
  				 */
  				SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_WORKER);
! 				can_launch = false;
  			}
  		}
! 		LWLockRelease(AutovacuumLock);		/* either shared or exclusive */
  
  		if (can_launch)
  		{
--- 585,633 ----
  			rebuild_database_list(InvalidOid);
  		}
  
! 		/* a worker finished */
  		if (got_SIGUSR1)
  		{
  			got_SIGUSR1 = false;
  
  			/* rebalance cost limits, if needed */
! 			if (AutoVacuumShmem->av_signal[AutoVacRebalance])
  			{
  				LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
! 				AutoVacuumShmem->av_signal[AutoVacRebalance] = false;
  				autovac_balance_cost();
  				LWLockRelease(AutovacuumLock);
  			}
  
! 			if (AutoVacuumShmem->av_signal[AutoVacForkFailed])
  			{
  				/*
! 				 * If the postmaster failed to start a new worker, we sleep
! 				 * for a little while and resend the signal.  The new worker's
! 				 * state is still in memory, so this is sufficient.  After
! 				 * that, we restart the main loop.
! 				 *
! 				 * XXX should we put a limit to the number of times we retry?
! 				 * I don't think it makes much sense, because a future start
! 				 * of a worker will continue to fail in the same way.
  				 */
+ 				AutoVacuumShmem->av_signal[AutoVacForkFailed] = false;
+ 				pg_usleep(100000L);	/* 100ms */
  				SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_WORKER);
! 				continue;
  			}
  		}
! 
! 		/*
! 		 * We cannot launch a worker if there are not free worker slots, or if
! 		 * there is already a worker starting up.
! 		 */
! 		can_launch = true;
! 		LWLockAcquire(AutovacuumLock, LW_SHARED);
! 		if (AutoVacuumShmem->av_freeWorkers == INVALID_OFFSET ||
! 			AutoVacuumShmem->av_startingWorker != INVALID_OFFSET)
! 			can_launch = false;
! 		LWLockRelease(AutovacuumLock);
  
  		if (can_launch)
  		{
*************** launch_worker(TimestampTz now)
*** 1197,1202 ****
--- 1190,1206 ----
  	}
  }
  
+ /*
+  * Called from postmaster to signal a failure to fork a process to become
+  * worker.  The postmaster should kill(SIGUSR1) the launcher shortly
+  * after calling this function.
+  */
+ void
+ AutoVacWorkerFailed(void)
+ {
+ 	AutoVacuumShmem->av_signal[AutoVacForkFailed] = true;
+ }
+ 
  /* SIGHUP: set flag to re-read config file at next convenient time */
  static void
  avl_sighup_handler(SIGNAL_ARGS)
*************** AutoVacWorkerMain(int argc, char *argv[]
*** 1435,1442 ****
  	LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
  
  	/*
! 	 * beware of startingWorker being INVALID; this could happen if the
! 	 * launcher thinks we've taking too long to start.
  	 */
  	if (AutoVacuumShmem->av_startingWorker != INVALID_OFFSET)
  	{
--- 1439,1446 ----
  	LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
  
  	/*
! 	 * beware of startingWorker being INVALID; this should not happen but keep
! 	 * the check for safety
  	 */
  	if (AutoVacuumShmem->av_startingWorker != INVALID_OFFSET)
  	{
*************** AutoVacWorkerMain(int argc, char *argv[]
*** 1463,1468 ****
--- 1467,1473 ----
  	else
  	{
  		/* no worker entry for me, go away */
+ 		elog(WARNING, "autovacuum worker started without a worker entry");
  		dbid = InvalidOid;
  		LWLockRelease(AutovacuumLock);
  	}
*************** FreeWorkerInfo(int code, Datum arg)
*** 1525,1543 ****
  		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
  
  		/*
! 		 * If this worker shuts down when there is no free worker slot, wake
! 		 * the launcher up so that he can launch a new worker immediately if
! 		 * required.  We only save the launcher's PID in local memory here --
! 		 * the actual signal will be sent when the PGPROC is recycled, because
! 		 * that is when the new worker can actually be launched.
  		 *
  		 * We somewhat ignore the risk that the launcher changes its PID
  		 * between we reading it and the actual kill; we expect ProcKill to be
  		 * called shortly after us, and we assume that PIDs are not reused too
  		 * quickly after a process exits.
  		 */
! 		if (AutoVacuumShmem->av_freeWorkers == INVALID_OFFSET)
! 			AutovacuumLauncherPid = AutoVacuumShmem->av_launcherpid;
  
  		SHMQueueDelete(&MyWorkerInfo->wi_links);
  		MyWorkerInfo->wi_links.next = AutoVacuumShmem->av_freeWorkers;
--- 1530,1547 ----
  		LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
  
  		/*
! 		 * Wake the launcher up so that he can launch a new worker immediately
! 		 * if required.  We only save the launcher's PID in local memory here;
! 		 * the actual signal will be sent when the PGPROC is recycled.  Note
! 		 * that we always to this, so that the launcher can rebalance the cost
! 		 * limit setting of the remaining workers.
  		 *
  		 * We somewhat ignore the risk that the launcher changes its PID
  		 * between we reading it and the actual kill; we expect ProcKill to be
  		 * called shortly after us, and we assume that PIDs are not reused too
  		 * quickly after a process exits.
  		 */
! 		AutovacuumLauncherPid = AutoVacuumShmem->av_launcherpid;
  
  		SHMQueueDelete(&MyWorkerInfo->wi_links);
  		MyWorkerInfo->wi_links.next = AutoVacuumShmem->av_freeWorkers;
*************** FreeWorkerInfo(int code, Datum arg)
*** 1556,1562 ****
  		 * now that we're inactive, cause a rebalancing of the surviving
  		 * workers
  		 */
! 		AutoVacuumShmem->av_rebalance = true;
  		LWLockRelease(AutovacuumLock);
  	}
  }
--- 1560,1566 ----
  		 * now that we're inactive, cause a rebalancing of the surviving
  		 * workers
  		 */
! 		AutoVacuumShmem->av_signal[AutoVacRebalance] = true;
  		LWLockRelease(AutovacuumLock);
  	}
  }
Index: src/backend/postmaster/postmaster.c
===================================================================
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/postmaster/postmaster.c,v
retrieving revision 1.527
diff -c -p -r1.527 postmaster.c
*** src/backend/postmaster/postmaster.c	22 Mar 2007 19:53:30 -0000	1.527
--- src/backend/postmaster/postmaster.c	18 Jun 2007 21:00:56 -0000
*************** StartAutovacuumWorker(void)
*** 3830,3861 ****
  		return;
  
  	bn = (Backend *) malloc(sizeof(Backend));
! 	if (!bn)
  	{
! 		ereport(LOG,
! 				(errcode(ERRCODE_OUT_OF_MEMORY),
! 				 errmsg("out of memory")));
! 		return;
! 	}
! 
! 	bn->pid = StartAutoVacWorker();
! 	bn->is_autovacuum = true;
! 	/* we don't need a cancel key */
  
! 	if (bn->pid > 0)
! 	{
! 		DLAddHead(BackendList, DLNewElem(bn));
  #ifdef EXEC_BACKEND
! 		ShmemBackendArrayAdd(bn);
  #endif
! 	}
! 	else
! 	{
! 		/* not much we can do */
! 		ereport(LOG,
! 				(errmsg("could not fork new process for autovacuum: %m")));
  		free(bn);
  	}
  }
  
  /*
--- 3830,3863 ----
  		return;
  
  	bn = (Backend *) malloc(sizeof(Backend));
! 	if (bn)
  	{
! 		bn->pid = StartAutoVacWorker();
! 		bn->is_autovacuum = true;
! 		/* we don't need a cancel key */
  
! 		if (bn->pid > 0)
! 		{
! 			/* FIXME -- unchecked memory allocation here */
! 			DLAddHead(BackendList, DLNewElem(bn));
  #ifdef EXEC_BACKEND
! 			ShmemBackendArrayAdd(bn);
  #endif
! 			/* all OK */
! 			return;
! 		}
! 
! 		/* failed, fall through to report */
! 		/* XXX the reason of the failure is not reported??? */
  		free(bn);
  	}
+ 	else
+ 		elog(LOG, "out of memory");
+ 
+ 	/* report the failure to the launcher */
+ 	AutoVacWorkerFailed();
+ 	if (AutoVacPID != 0)
+ 		kill(AutoVacPID, SIGUSR1);
  }
  
  /*
Index: src/include/postmaster/autovacuum.h
===================================================================
RCS file: /home/alvherre/Code/cvs/pgsql/src/include/postmaster/autovacuum.h,v
retrieving revision 1.10
diff -c -p -r1.10 autovacuum.h
*** src/include/postmaster/autovacuum.h	18 Apr 2007 16:44:18 -0000	1.10
--- src/include/postmaster/autovacuum.h	18 Jun 2007 18:37:30 -0000
*************** extern bool IsAutoVacuumWorkerProcess(vo
*** 42,47 ****
--- 42,49 ----
  extern void autovac_init(void);
  extern int	StartAutoVacLauncher(void);
  extern int	StartAutoVacWorker(void);
+ /* called from postmaster when a worker could not be forked */
+ extern void AutoVacWorkerFailed(void);
  
  /* autovacuum cost-delay balancer */
  extern void AutoVacuumUpdateDelay(void);
---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

                http://www.postgresql.org/about/donate

Reply via email to