Re: [HACKERS] [PATCHES] Autovacuum launcher doesn't notice death of postmaster immediately

2007-06-13 Thread Alvaro Herrera
Alvaro Herrera wrote:

> > Ah; yes, what I was proposing (or thought about proposing, not sure if I
> > posted it or not) was putting a upper limit of 10 seconds in the sleep
> > (bgwriter sleeps 10 seconds if configured to not do anything).  Though
> > 10 seconds may seem like an eternity for systems like the ones Peter was
> > talking about, where there is a script trying to restart the server as
> > soon as the postmaster dies.

Peter, is 10 seconds good enough for you?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] Autovacuum launcher doesn't notice death of postmaster immediately

2007-06-13 Thread Alvaro Herrera
ITAGAKI Takahiro wrote:
> 
> Alvaro Herrera <[EMAIL PROTECTED]> wrote:
> 
> > > No, I meant a "while (sleep 1(or 10) and counter < longtime) check for
> > > exit" instead of "sleep longtime".
> > 
> > Ah; yes, what I was proposing (or thought about proposing, not sure if I
> > posted it or not) was putting a upper limit of 10 seconds in the sleep
> > (bgwriter sleeps 10 seconds if configured to not do anything).  Though
> > 10 seconds may seem like an eternity for systems like the ones Peter was
> > talking about, where there is a script trying to restart the server as
> > soon as the postmaster dies.
> 
> Here is a patch for split-sleep of autovacuum_naptime.
> 
> There are some other issues in CVS HEAD; We use the calculation
> {autovacuum_naptime * 100} in launcher_determine_sleep().
> The result will be corrupted if we set autovacuum_naptime to >2147.

Ugh.  How about this patch; this avoids the overflow issue altogether.
I am not sure that this works on Win32 but it seems we are already using
struct timeval elsewhere, so I don't see why it wouldn't work.


> In another place, we use {autovacuum_naptime * 1000}, so we should
> set the upper bound to INT_MAX/1000 instead of INT_MAX.
> Incidentally, we've already had the same protections for 
> log_min_duration_statement and log_autovacuum.

Hmm, yes, the naptime should have an upper bound of INT_MAX/1000.  It
doesn't seem worth the trouble of changing those places, when we know
that such a high value of naptime is uselessly high.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Index: src/backend/postmaster/autovacuum.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/postmaster/autovacuum.c,v
retrieving revision 1.49
diff -c -p -r1.49 autovacuum.c
*** src/backend/postmaster/autovacuum.c	8 Jun 2007 21:21:28 -	1.49
--- src/backend/postmaster/autovacuum.c	13 Jun 2007 17:27:39 -
***
*** 18,23 
--- 18,24 
  
  #include 
  #include 
+ #include 
  #include 
  #include 
  
*** int			autovacuum_vac_cost_limit;
*** 73,78 
--- 74,83 
  
  int			Log_autovacuum = -1;
  
+ 
+ /* maximum sleep duration in the launcher */
+ #define AV_SLEEP_QUANTUM 10
+ 
  /* Flags to tell if we are in an autovacuum process */
  static bool am_autovacuum_launcher = false;
  static bool am_autovacuum_worker = false;
*** NON_EXEC_STATIC void AutoVacWorkerMain(i
*** 197,203 
  NON_EXEC_STATIC void AutoVacLauncherMain(int argc, char *argv[]);
  
  static Oid do_start_worker(void);
! static uint64 launcher_determine_sleep(bool canlaunch, bool recursing);
  static void launch_worker(TimestampTz now);
  static List *get_database_list(void);
  static void rebuild_database_list(Oid newdb);
--- 202,209 
  NON_EXEC_STATIC void AutoVacLauncherMain(int argc, char *argv[]);
  
  static Oid do_start_worker(void);
! static void launcher_determine_sleep(bool canlaunch, bool recursing,
! 		 struct timeval *nap);
  static void launch_worker(TimestampTz now);
  static List *get_database_list(void);
  static void rebuild_database_list(Oid newdb);
*** AutoVacLauncherMain(int argc, char *argv
*** 487,493 
  
  	for (;;)
  	{
! 		uint64		micros;
  		bool	can_launch;
  		TimestampTz current_time = 0;
  
--- 493,499 
  
  	for (;;)
  	{
! 		struct timeval nap;
  		bool	can_launch;
  		TimestampTz current_time = 0;
  
*** AutoVacLauncherMain(int argc, char *argv
*** 498,508 
  		if (!PostmasterIsAlive(true))
  			exit(1);
  
! 		micros = launcher_determine_sleep(AutoVacuumShmem->av_freeWorkers !=
! 		  INVALID_OFFSET, false);
  
! 		/* Sleep for a while according to schedule */
! 		pg_usleep(micros);
  
  		/* the normal shutdown case */
  		if (avlauncher_shutdown_request)
--- 504,542 
  		if (!PostmasterIsAlive(true))
  			exit(1);
  
! 		launcher_determine_sleep(AutoVacuumShmem->av_freeWorkers !=
!    INVALID_OFFSET, false, &nap);
! 
! 		/*
! 		 * Sleep for a while according to schedule.  We only sleep in
! 		 * AV_SLEEP_QUANTUM sec intervals, in order to promptly notice
! 		 * postmaster death.
! 		 */
! 		while (nap.tv_sec > 0 || nap.tv_usec > 0)
! 		{
! 			uint32	sleeptime;
! 
! 			sleeptime = nap.tv_usec;
! 			nap.tv_usec = 0;
  
! 			if (nap.tv_sec > 0)
! 			{
! sleeptime += Min(nap.tv_sec, AV_SLEEP_QUANTUM) * 100;
! nap.tv_sec -= Min(nap.tv_sec, AV_SLEEP_QUANTUM);
! 			}
! 			
! 			pg_usleep(sleeptime);
! 
! 			/*
! 			 * Emergency bailout if postmaster has died.  This is to avoid the
! 			 * necessity for manual cleanup of all postmaster children.
! 			 */
! 			if (!PostmasterIsAlive(true))
! exit(1);
! 
! 			if (avlauncher_shutdown_request || got_SIGHUP || got_SIGUSR1)
! break;
! 		}
  
  		/* the normal shutdown case */
  		if (avlauncher_shutdown_request)
*** AutoVacLauncherMain(int argc, ch