Re: [HACKERS] autovacuum launcher eating too much CPU

2007-09-24 Thread Alvaro Herrera
Darcy Buskermolen wrote:
> On September 23, 2007 09:12 pm, Alvaro Herrera wrote:
> > Darcy Buskermolen wrote:
> > > On September 14, 2007 06:36 am, Alvaro Herrera wrote:
> > > > Darcy, please also apply the following patch and see if it reduces the
> > > > CPU consumption to a reasonable level.
> > >
> > > This is looking much better now too, it's brought the AVL down to near 0%
> > > CPU usage.
> >
> > Thanks, applied.  I still feel CPU usage is somewhat excessive but I
> > don't think there's much to be done about it.  Maybe I'm just testing
> > with too many databases.
> 
> My findings were against  83 DB's

I was testing with 300 and naptime=10s.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] autovacuum launcher eating too much CPU

2007-09-24 Thread Darcy Buskermolen
On September 23, 2007 09:12 pm, Alvaro Herrera wrote:
> Darcy Buskermolen wrote:
> > On September 14, 2007 06:36 am, Alvaro Herrera wrote:
> > > Darcy, please also apply the following patch and see if it reduces the
> > > CPU consumption to a reasonable level.
> >
> > This is looking much better now too, it's brought the AVL down to near 0%
> > CPU usage.
>
> Thanks, applied.  I still feel CPU usage is somewhat excessive but I
> don't think there's much to be done about it.  Maybe I'm just testing
> with too many databases.

My findings were against  83 DB's


-- 
Darcy Buskermolen
Command Prompt, Inc.
+1.503.667.4564 X 102
http://www.commandprompt.com/
PostgreSQL solutions since 1997

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] autovacuum launcher eating too much CPU

2007-09-23 Thread Alvaro Herrera
Darcy Buskermolen wrote:
> On September 14, 2007 06:36 am, Alvaro Herrera wrote:

> > Darcy, please also apply the following patch and see if it reduces the
> > CPU consumption to a reasonable level.
> 
> This is looking much better now too, it's brought the AVL down to near 0% CPU 
> usage.  

Thanks, applied.  I still feel CPU usage is somewhat excessive but I
don't think there's much to be done about it.  Maybe I'm just testing
with too many databases.

-- 
Alvaro Herrera   http://www.PlanetPostgreSQL.org/
"Saca el libro que tu religión considere como el indicado para encontrar la
oración que traiga paz a tu alma. Luego rebootea el computador
y ve si funciona" (Carlos Duclós)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] autovacuum launcher eating too much CPU

2007-09-14 Thread Darcy Buskermolen
On September 14, 2007 06:36 am, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
> > Darcy Buskermolen wrote:
> > > On Thursday 13 September 2007 09:16:52 Alvaro Herrera wrote:
> > > > Hi,
> > > >
> > > > Darcy Buskermolen noticed that when one has many databases, the
> > > > autovac launcher starts eating too much CPU.
> > >
> > > Don't forget the memory leak as well.  after 3 or 4 days of running I
> > > end up with a 2GB+ AVL..
> >
> > Huh, sorry for not letting you know, I already fixed that :-)  (Please
> > grab the latest CVS HEAD and confirm.)
Ok that looks much better, after running it for 8ish hours I'm not seeing any 
of the previous footprint growth.

>
> Darcy, please also apply the following patch and see if it reduces the
> CPU consumption to a reasonable level.

This is looking much better now too, it's brought the AVL down to near 0% CPU 
usage.  


>
> What this patch does is keep the pgstats data for 1 second in the
> autovac launcher.  The idea is to avoid reading the data too frequently.
> I coded it so that it doesn't affect the worker, because it would make
> the table recheck code less effective.

-- 
Darcy Buskermolen
Command Prompt, Inc.
+1.503.667.4564 X 102
http://www.commandprompt.com/
PostgreSQL solutions since 1997

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] autovacuum launcher eating too much CPU

2007-09-14 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Darcy Buskermolen wrote:
> > On Thursday 13 September 2007 09:16:52 Alvaro Herrera wrote:
> > > Hi,
> > >
> > > Darcy Buskermolen noticed that when one has many databases, the autovac
> > > launcher starts eating too much CPU.
> > 
> > Don't forget the memory leak as well.  after 3 or 4 days of running I end 
> > up 
> > with a 2GB+ AVL..
> 
> Huh, sorry for not letting you know, I already fixed that :-)  (Please
> grab the latest CVS HEAD and confirm.)

Darcy, please also apply the following patch and see if it reduces the
CPU consumption to a reasonable level.

What this patch does is keep the pgstats data for 1 second in the
autovac launcher.  The idea is to avoid reading the data too frequently.
I coded it so that it doesn't affect the worker, because it would make
the table recheck code less effective.

-- 
Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4
"MySQL is a toy compared to PostgreSQL." (Randal L. Schwartz)
  (http://archives.postgresql.org/pgsql-general/2005-07/msg00517.php)
Index: src/backend/postmaster/autovacuum.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/postmaster/autovacuum.c,v
retrieving revision 1.58
diff -c -p -r1.58 autovacuum.c
*** src/backend/postmaster/autovacuum.c	12 Sep 2007 22:14:59 -	1.58
--- src/backend/postmaster/autovacuum.c	13 Sep 2007 22:13:29 -
*** int			autovacuum_vac_cost_limit;
*** 116,121 
--- 116,124 
  
  int			Log_autovacuum = -1;
  
+ /* how long to keep pgstat data in the launcher, in milliseconds */
+ #define AUTOVAC_STATS_CACHE 1000
+ 
  
  /* Flags to tell if we are in an autovacuum process */
  static bool am_autovacuum_launcher = false;
*** static void avl_sighup_handler(SIGNAL_AR
*** 291,296 
--- 294,300 
  static void avl_sigusr1_handler(SIGNAL_ARGS);
  static void avl_sigterm_handler(SIGNAL_ARGS);
  static void avl_quickdie(SIGNAL_ARGS);
+ static void autovac_refresh_stats(void);
  
  
  
*** AutoVacLauncherMain(int argc, char *argv
*** 488,494 
  		DatabaseListCxt = NULL;
  		DatabaseList = NULL;
  
! 		/* Make sure pgstat also considers our stat data as gone */
  		pgstat_clear_snapshot();
  
  		/* Now we can allow interrupts again */
--- 492,501 
  		DatabaseListCxt = NULL;
  		DatabaseList = NULL;
  
! 		/*
! 		 * Make sure pgstat also considers our stat data as gone.  Note: we
! 		 * mustn't use autovac_refresh_stats here.
! 		 */
  		pgstat_clear_snapshot();
  
  		/* Now we can allow interrupts again */
*** rebuild_database_list(Oid newdb)
*** 836,842 
  	HTAB	   *dbhash;
  
  	/* use fresh stats */
! 	pgstat_clear_snapshot();
  
  	newcxt = AllocSetContextCreate(AutovacMemCxt,
     "AV dblist",
--- 843,849 
  	HTAB	   *dbhash;
  
  	/* use fresh stats */
! 	autovac_refresh_stats();
  
  	newcxt = AllocSetContextCreate(AutovacMemCxt,
     "AV dblist",
*** do_start_worker(void)
*** 1063,1069 
  	oldcxt = MemoryContextSwitchTo(tmpcxt);
  
  	/* use fresh stats */
! 	pgstat_clear_snapshot();
  
  	/* Get a list of databases */
  	dblist = get_database_list();
--- 1070,1076 
  	oldcxt = MemoryContextSwitchTo(tmpcxt);
  
  	/* use fresh stats */
! 	autovac_refresh_stats();
  
  	/* Get a list of databases */
  	dblist = get_database_list();
*** do_start_worker(void)
*** 1106,1114 
  		avw_dbase  *tmp = lfirst(cell);
  		Dlelem	   *elem;
  
- 		/* Find pgstat entry if any */
- 		tmp->adw_entry = pgstat_fetch_stat_dbentry(tmp->adw_datid);
- 
  		/* Check to see if this one is at risk of wraparound */
  		if (TransactionIdPrecedes(tmp->adw_frozenxid, xidForceLimit))
  		{
--- 1113,1118 
*** do_start_worker(void)
*** 1121,1129 
  		else if (for_xid_wrap)
  			continue;			/* ignore not-at-risk DBs */
  
  		/*
! 		 * Otherwise, skip a database with no pgstat entry; it means it
! 		 * hasn't seen any activity.
  		 */
  		if (!tmp->adw_entry)
  			continue;
--- 1125,1136 
  		else if (for_xid_wrap)
  			continue;			/* ignore not-at-risk DBs */
  
+ 		/* Find pgstat entry if any */
+ 		tmp->adw_entry = pgstat_fetch_stat_dbentry(tmp->adw_datid);
+ 
  		/*
! 		 * Skip a database with no pgstat entry; it means it hasn't seen any
! 		 * activity.
  		 */
  		if (!tmp->adw_entry)
  			continue;
*** table_recheck_autovac(Oid relid)
*** 2258,2264 
  	PgStat_StatDBEntry *dbentry;
  
  	/* use fresh stats */
! 	pgstat_clear_snapshot();
  
  	shared = pgstat_fetch_stat_dbentry(InvalidOid);
  	dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
--- 2265,2271 
  	PgStat_StatDBEntry *dbentry;
  
  	/* use fresh stats */
! 	autovac_refresh_stats();
  
  	shared = pgstat_fetch_stat_dbentry(InvalidOid);
  	dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
*** AutoVacuumShmemInit(void)
*** 2734,2736 
--- 2741,2769 
  	else
  		A

Re: [HACKERS] autovacuum launcher eating too much CPU

2007-09-13 Thread Alvaro Herrera
Darcy Buskermolen wrote:
> On Thursday 13 September 2007 09:16:52 Alvaro Herrera wrote:
> > Hi,
> >
> > Darcy Buskermolen noticed that when one has many databases, the autovac
> > launcher starts eating too much CPU.
> 
> Don't forget the memory leak as well.  after 3 or 4 days of running I end up 
> with a 2GB+ AVL..

Huh, sorry for not letting you know, I already fixed that :-)  (Please
grab the latest CVS HEAD and confirm.)

-- 
Alvaro Herrerahttp://www.advogato.org/person/alvherre
"Entristecido, Wutra (canción de Las Barreras)
echa a Freyr a rodar
y a nosotros al mar"

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] autovacuum launcher eating too much CPU

2007-09-13 Thread Darcy Buskermolen
On Thursday 13 September 2007 09:16:52 Alvaro Herrera wrote:
> Hi,
>
> Darcy Buskermolen noticed that when one has many databases, the autovac
> launcher starts eating too much CPU.

Don't forget the memory leak as well.  after 3 or 4 days of running I end up 
with a 2GB+ AVL..

>
> I tried it here with 200 databases and indeed it does seem to eat its
> share.  Even with the default naptime, which I wouldn't have thought
> that was too high (it does make the launcher wake up about three times a
> second though).
>
> I'm looking at a profile and I can't seem to make much sense out of it.
> It seems to me like the problem is not autovac itself, but rather the
> pgstat code that reads the stat file from disk.  Of course, autovac does
> need to read the file fairly regularly.
>
> Here is the top lines of gprof output.
>
> Comments?  Is there something here that needs fixing?
-- 


Darcy Buskermolen
The PostgreSQL company, Command Prompt Inc.
http://www.commandprompt.com/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] autovacuum launcher eating too much CPU

2007-09-13 Thread Alvaro Herrera
Alvaro Herrera wrote:

> Darcy Buskermolen noticed that when one has many databases, the autovac
> launcher starts eating too much CPU.
> 
> I tried it here with 200 databases and indeed it does seem to eat its
> share.  Even with the default naptime, which I wouldn't have thought
> that was too high (it does make the launcher wake up about three times a
> second though).

This patch does not solve the whole problem but it alleviates it a bit
by throttling pgstat reads.  One problem with it is that the interval
for this increases:

/*
 * Check whether pgstat data still says we need to vacuum this table.
 * It could have changed if something else processed the table while we
 * weren't looking.
 *
 * FIXME we ignore the possibility that the table was finished being
 * vacuumed in the last 500ms (PGSTAT_STAT_INTERVAL).  This is a bug.
 */
MemoryContextSwitchTo(AutovacMemCxt);
tab = table_recheck_autovac(relid);

which could be a problem in itself, by causing unnecessary vacuums.

Opinions?

-- 
Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J
"In Europe they call me Niklaus Wirth; in the US they call me Nickel's worth.
 That's because in Europe they call me by name, and in the US by value!"
Index: src/backend/postmaster/autovacuum.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/postmaster/autovacuum.c,v
retrieving revision 1.58
diff -c -p -r1.58 autovacuum.c
*** src/backend/postmaster/autovacuum.c	12 Sep 2007 22:14:59 -	1.58
--- src/backend/postmaster/autovacuum.c	13 Sep 2007 16:57:59 -
*** static void avl_sighup_handler(SIGNAL_AR
*** 291,296 
--- 291,297 
  static void avl_sigusr1_handler(SIGNAL_ARGS);
  static void avl_sigterm_handler(SIGNAL_ARGS);
  static void avl_quickdie(SIGNAL_ARGS);
+ static void autovac_refresh_stats(void);
  
  
  
*** AutoVacLauncherMain(int argc, char *argv
*** 488,494 
  		DatabaseListCxt = NULL;
  		DatabaseList = NULL;
  
! 		/* Make sure pgstat also considers our stat data as gone */
  		pgstat_clear_snapshot();
  
  		/* Now we can allow interrupts again */
--- 489,498 
  		DatabaseListCxt = NULL;
  		DatabaseList = NULL;
  
! 		/*
! 		 * Make sure pgstat also considers our stat data as gone.  Note: we
! 		 * musn't use autovac_refresh_stats here.
! 		 */
  		pgstat_clear_snapshot();
  
  		/* Now we can allow interrupts again */
*** rebuild_database_list(Oid newdb)
*** 836,842 
  	HTAB	   *dbhash;
  
  	/* use fresh stats */
! 	pgstat_clear_snapshot();
  
  	newcxt = AllocSetContextCreate(AutovacMemCxt,
     "AV dblist",
--- 840,846 
  	HTAB	   *dbhash;
  
  	/* use fresh stats */
! 	autovac_refresh_stats();
  
  	newcxt = AllocSetContextCreate(AutovacMemCxt,
     "AV dblist",
*** do_start_worker(void)
*** 1063,1069 
  	oldcxt = MemoryContextSwitchTo(tmpcxt);
  
  	/* use fresh stats */
! 	pgstat_clear_snapshot();
  
  	/* Get a list of databases */
  	dblist = get_database_list();
--- 1067,1073 
  	oldcxt = MemoryContextSwitchTo(tmpcxt);
  
  	/* use fresh stats */
! 	autovac_refresh_stats();
  
  	/* Get a list of databases */
  	dblist = get_database_list();
*** table_recheck_autovac(Oid relid)
*** 2258,2264 
  	PgStat_StatDBEntry *dbentry;
  
  	/* use fresh stats */
! 	pgstat_clear_snapshot();
  
  	shared = pgstat_fetch_stat_dbentry(InvalidOid);
  	dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
--- 2262,2268 
  	PgStat_StatDBEntry *dbentry;
  
  	/* use fresh stats */
! 	autovac_refresh_stats();
  
  	shared = pgstat_fetch_stat_dbentry(InvalidOid);
  	dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
*** AutoVacuumShmemInit(void)
*** 2734,2736 
--- 2738,2759 
  	else
  		Assert(found);
  }
+ 
+ /*
+  * Refresh pgstats data in an autovacuum process, at most every 500 ms.  This
+  * is to avoid rereading the pgstats files too many times in quick succession.
+  */
+ static void
+ autovac_refresh_stats(void)
+ {
+ 	static TimestampTz last_read = 0;
+ 	TimestampTz current_time;
+ 
+ 	current_time = GetCurrentTimestamp();
+ 
+ 	if (!TimestampDifferenceExceeds(last_read, current_time, 500))
+ 		return;
+ 
+ 	pgstat_clear_snapshot();
+ 	last_read = current_time;
+ }

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] autovacuum launcher eating too much CPU

2007-09-13 Thread Alvaro Herrera
Hi,

Darcy Buskermolen noticed that when one has many databases, the autovac
launcher starts eating too much CPU.

I tried it here with 200 databases and indeed it does seem to eat its
share.  Even with the default naptime, which I wouldn't have thought
that was too high (it does make the launcher wake up about three times a
second though).

I'm looking at a profile and I can't seem to make much sense out of it.
It seems to me like the problem is not autovac itself, but rather the
pgstat code that reads the stat file from disk.  Of course, autovac does
need to read the file fairly regularly.

Here is the top lines of gprof output.

Comments?  Is there something here that needs fixing?

-- 
Alvaro Herrerahttp://www.advogato.org/person/alvherre
"Investigación es lo que hago cuando no sé lo que estoy haciendo"
(Wernher von Braun)
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self  self total   
 time   seconds   secondscalls  ms/call  ms/call  name
 40.00  0.02 0.02   113360 0.00 0.00  
hash_search_with_hash_value
 20.00  0.03 0.0172816 0.00 0.00  AllocSetFreeIndex
 20.00  0.04 0.0155517 0.00 0.00  AllocSetAlloc
 20.00  0.05 0.01 4247 0.00 0.00  choose_nelem_alloc
  0.00  0.05 0.00   396779 0.00 0.00  pg_isblank
  0.00  0.05 0.00   119276 0.00 0.00  calc_bucket
  0.00  0.05 0.00   113360 0.00 0.00  hash_search
  0.00  0.05 0.00   113330 0.00 0.00  hash_uint32
  0.00  0.05 0.00   113330 0.00 0.00  oid_hash
  0.00  0.05 0.00   105035 0.00 0.00  get_hash_entry
  0.00  0.05 0.0055514 0.00 0.00  MemoryContextAlloc
  0.00  0.05 0.0041798 0.00 0.00  next_token
  0.00  0.05 0.0025512 0.00 0.00  DynaHashAlloc
  0.00  0.05 0.0016187 0.00 0.00  TimestampDifferenceExceeds
  0.00  0.05 0.00 8742 0.00 0.00  MemoryContextStrdup
  0.00  0.05 0.00 8511 0.00 0.00  seg_alloc
  0.00  0.05 0.00 8507 0.00 0.00  my_log2
  0.00  0.05 0.00 8379 0.00 0.00  read_pg_database_line
  0.00  0.05 0.00 8364 0.00 0.00  check_list_invariants
  0.00  0.05 0.00 8353 0.00 0.00  lappend
  0.00  0.05 0.00 8322 0.00 0.00  backend_read_statsfile
  0.00  0.05 0.00 8322 0.00 0.00  pgstat_fetch_stat_dbentry



index % timeself  childrencalled name
 
[1] 99.60.000.05 reaper [1]
0.000.05   1/1   StartAutoVacLauncher [3]
0.000.00   1/1   load_role [51]
0.000.00   2/3   StartChildProcess [238]
0.000.00   1/60  errstart [179]
0.000.00   1/1   AutoVacuumingActive [289]
---
0.000.05   1/1   StartAutoVacLauncher [3]
[2] 99.50.000.05   1 AutoVacLauncherMain [2]
0.000.05  56/56  launch_worker [5]
0.000.00   1/1   rebuild_database_list [27]
0.000.00   2/2   errmsg [78]
0.000.00   2/2   errfinish [101]
0.000.00   1/1   BaseInit [115]
0.000.00   1/4330AllocSetContextCreate [24]
0.000.00 118/118 PostmasterIsAlive [165]
0.000.00  59/59  launcher_determine_sleep [181]
0.000.00  59/59  pg_usleep [182]
0.000.00  59/254 LWLockAcquire [157]
0.000.00  59/254 LWLockRelease [158]
0.000.00  58/288 GetCurrentTimestamp [156]
0.000.00  58/16187   TimestampDifferenceExceeds 
[135]
0.000.00  10/22  pqsignal [198]
0.000.00   2/60  errstart [179]
0.000.00   1/1   init_ps_display [383]
0.000.00   1/1   InitAuxiliaryProcess [308]
0.000.00   1/116 MemoryContextSwitchTo [166]
0.000.00   1/1   autovac_balance_cost [368]
0.000.00   1/1   proc_exit [403]
---
0.000.05   1/1   reaper [1]
[3] 99.50.000.05   1 StartAutoVacLauncher [3]
0.000.05   1/1   AutoVacLauncherMain [2]
0.000.00   1/4