Re: [HACKERS] autovacuum launcher eating too much CPU
Darcy Buskermolen wrote: > On September 23, 2007 09:12 pm, Alvaro Herrera wrote: > > Darcy Buskermolen wrote: > > > On September 14, 2007 06:36 am, Alvaro Herrera wrote: > > > > Darcy, please also apply the following patch and see if it reduces the > > > > CPU consumption to a reasonable level. > > > > > > This is looking much better now too, it's brought the AVL down to near 0% > > > CPU usage. > > > > Thanks, applied. I still feel CPU usage is somewhat excessive but I > > don't think there's much to be done about it. Maybe I'm just testing > > with too many databases. > > My findings were against 83 DB's I was testing with 300 and naptime=10s. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] autovacuum launcher eating too much CPU
On September 23, 2007 09:12 pm, Alvaro Herrera wrote: > Darcy Buskermolen wrote: > > On September 14, 2007 06:36 am, Alvaro Herrera wrote: > > > Darcy, please also apply the following patch and see if it reduces the > > > CPU consumption to a reasonable level. > > > > This is looking much better now too, it's brought the AVL down to near 0% > > CPU usage. > > Thanks, applied. I still feel CPU usage is somewhat excessive but I > don't think there's much to be done about it. Maybe I'm just testing > with too many databases. My findings were against 83 DB's -- Darcy Buskermolen Command Prompt, Inc. +1.503.667.4564 X 102 http://www.commandprompt.com/ PostgreSQL solutions since 1997 ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] autovacuum launcher eating too much CPU
Darcy Buskermolen wrote: > On September 14, 2007 06:36 am, Alvaro Herrera wrote: > > Darcy, please also apply the following patch and see if it reduces the > > CPU consumption to a reasonable level. > > This is looking much better now too, it's brought the AVL down to near 0% CPU > usage. Thanks, applied. I still feel CPU usage is somewhat excessive but I don't think there's much to be done about it. Maybe I'm just testing with too many databases. -- Alvaro Herrera http://www.PlanetPostgreSQL.org/ "Saca el libro que tu religión considere como el indicado para encontrar la oración que traiga paz a tu alma. Luego rebootea el computador y ve si funciona" (Carlos Duclós) ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] autovacuum launcher eating too much CPU
On September 14, 2007 06:36 am, Alvaro Herrera wrote: > Alvaro Herrera wrote: > > Darcy Buskermolen wrote: > > > On Thursday 13 September 2007 09:16:52 Alvaro Herrera wrote: > > > > Hi, > > > > > > > > Darcy Buskermolen noticed that when one has many databases, the > > > > autovac launcher starts eating too much CPU. > > > > > > Don't forget the memory leak as well. after 3 or 4 days of running I > > > end up with a 2GB+ AVL.. > > > > Huh, sorry for not letting you know, I already fixed that :-) (Please > > grab the latest CVS HEAD and confirm.) Ok that looks much better, after running it for 8ish hours I'm not seeing any of the previous footprint growth. > > Darcy, please also apply the following patch and see if it reduces the > CPU consumption to a reasonable level. This is looking much better now too, it's brought the AVL down to near 0% CPU usage. > > What this patch does is keep the pgstats data for 1 second in the > autovac launcher. The idea is to avoid reading the data too frequently. > I coded it so that it doesn't affect the worker, because it would make > the table recheck code less effective. -- Darcy Buskermolen Command Prompt, Inc. +1.503.667.4564 X 102 http://www.commandprompt.com/ PostgreSQL solutions since 1997 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] autovacuum launcher eating too much CPU
Alvaro Herrera wrote: > Darcy Buskermolen wrote: > > On Thursday 13 September 2007 09:16:52 Alvaro Herrera wrote: > > > Hi, > > > > > > Darcy Buskermolen noticed that when one has many databases, the autovac > > > launcher starts eating too much CPU. > > > > Don't forget the memory leak as well. after 3 or 4 days of running I end > > up > > with a 2GB+ AVL.. > > Huh, sorry for not letting you know, I already fixed that :-) (Please > grab the latest CVS HEAD and confirm.) Darcy, please also apply the following patch and see if it reduces the CPU consumption to a reasonable level. What this patch does is keep the pgstats data for 1 second in the autovac launcher. The idea is to avoid reading the data too frequently. I coded it so that it doesn't affect the worker, because it would make the table recheck code less effective. -- Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4 "MySQL is a toy compared to PostgreSQL." (Randal L. Schwartz) (http://archives.postgresql.org/pgsql-general/2005-07/msg00517.php) Index: src/backend/postmaster/autovacuum.c === RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/postmaster/autovacuum.c,v retrieving revision 1.58 diff -c -p -r1.58 autovacuum.c *** src/backend/postmaster/autovacuum.c 12 Sep 2007 22:14:59 - 1.58 --- src/backend/postmaster/autovacuum.c 13 Sep 2007 22:13:29 - *** int autovacuum_vac_cost_limit; *** 116,121 --- 116,124 int Log_autovacuum = -1; + /* how long to keep pgstat data in the launcher, in milliseconds */ + #define AUTOVAC_STATS_CACHE 1000 + /* Flags to tell if we are in an autovacuum process */ static bool am_autovacuum_launcher = false; *** static void avl_sighup_handler(SIGNAL_AR *** 291,296 --- 294,300 static void avl_sigusr1_handler(SIGNAL_ARGS); static void avl_sigterm_handler(SIGNAL_ARGS); static void avl_quickdie(SIGNAL_ARGS); + static void autovac_refresh_stats(void); *** AutoVacLauncherMain(int argc, char *argv *** 488,494 DatabaseListCxt = NULL; DatabaseList = NULL; ! /* Make sure pgstat also considers our stat data as gone */ pgstat_clear_snapshot(); /* Now we can allow interrupts again */ --- 492,501 DatabaseListCxt = NULL; DatabaseList = NULL; ! /* ! * Make sure pgstat also considers our stat data as gone. Note: we ! * mustn't use autovac_refresh_stats here. ! */ pgstat_clear_snapshot(); /* Now we can allow interrupts again */ *** rebuild_database_list(Oid newdb) *** 836,842 HTAB *dbhash; /* use fresh stats */ ! pgstat_clear_snapshot(); newcxt = AllocSetContextCreate(AutovacMemCxt, "AV dblist", --- 843,849 HTAB *dbhash; /* use fresh stats */ ! autovac_refresh_stats(); newcxt = AllocSetContextCreate(AutovacMemCxt, "AV dblist", *** do_start_worker(void) *** 1063,1069 oldcxt = MemoryContextSwitchTo(tmpcxt); /* use fresh stats */ ! pgstat_clear_snapshot(); /* Get a list of databases */ dblist = get_database_list(); --- 1070,1076 oldcxt = MemoryContextSwitchTo(tmpcxt); /* use fresh stats */ ! autovac_refresh_stats(); /* Get a list of databases */ dblist = get_database_list(); *** do_start_worker(void) *** 1106,1114 avw_dbase *tmp = lfirst(cell); Dlelem *elem; - /* Find pgstat entry if any */ - tmp->adw_entry = pgstat_fetch_stat_dbentry(tmp->adw_datid); - /* Check to see if this one is at risk of wraparound */ if (TransactionIdPrecedes(tmp->adw_frozenxid, xidForceLimit)) { --- 1113,1118 *** do_start_worker(void) *** 1121,1129 else if (for_xid_wrap) continue; /* ignore not-at-risk DBs */ /* ! * Otherwise, skip a database with no pgstat entry; it means it ! * hasn't seen any activity. */ if (!tmp->adw_entry) continue; --- 1125,1136 else if (for_xid_wrap) continue; /* ignore not-at-risk DBs */ + /* Find pgstat entry if any */ + tmp->adw_entry = pgstat_fetch_stat_dbentry(tmp->adw_datid); + /* ! * Skip a database with no pgstat entry; it means it hasn't seen any ! * activity. */ if (!tmp->adw_entry) continue; *** table_recheck_autovac(Oid relid) *** 2258,2264 PgStat_StatDBEntry *dbentry; /* use fresh stats */ ! pgstat_clear_snapshot(); shared = pgstat_fetch_stat_dbentry(InvalidOid); dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId); --- 2265,2271 PgStat_StatDBEntry *dbentry; /* use fresh stats */ ! autovac_refresh_stats(); shared = pgstat_fetch_stat_dbentry(InvalidOid); dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId); *** AutoVacuumShmemInit(void) *** 2734,2736 --- 2741,2769 else A
Re: [HACKERS] autovacuum launcher eating too much CPU
Darcy Buskermolen wrote: > On Thursday 13 September 2007 09:16:52 Alvaro Herrera wrote: > > Hi, > > > > Darcy Buskermolen noticed that when one has many databases, the autovac > > launcher starts eating too much CPU. > > Don't forget the memory leak as well. after 3 or 4 days of running I end up > with a 2GB+ AVL.. Huh, sorry for not letting you know, I already fixed that :-) (Please grab the latest CVS HEAD and confirm.) -- Alvaro Herrerahttp://www.advogato.org/person/alvherre "Entristecido, Wutra (canción de Las Barreras) echa a Freyr a rodar y a nosotros al mar" ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] autovacuum launcher eating too much CPU
On Thursday 13 September 2007 09:16:52 Alvaro Herrera wrote: > Hi, > > Darcy Buskermolen noticed that when one has many databases, the autovac > launcher starts eating too much CPU. Don't forget the memory leak as well. after 3 or 4 days of running I end up with a 2GB+ AVL.. > > I tried it here with 200 databases and indeed it does seem to eat its > share. Even with the default naptime, which I wouldn't have thought > that was too high (it does make the launcher wake up about three times a > second though). > > I'm looking at a profile and I can't seem to make much sense out of it. > It seems to me like the problem is not autovac itself, but rather the > pgstat code that reads the stat file from disk. Of course, autovac does > need to read the file fairly regularly. > > Here is the top lines of gprof output. > > Comments? Is there something here that needs fixing? -- Darcy Buskermolen The PostgreSQL company, Command Prompt Inc. http://www.commandprompt.com/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] autovacuum launcher eating too much CPU
Alvaro Herrera wrote: > Darcy Buskermolen noticed that when one has many databases, the autovac > launcher starts eating too much CPU. > > I tried it here with 200 databases and indeed it does seem to eat its > share. Even with the default naptime, which I wouldn't have thought > that was too high (it does make the launcher wake up about three times a > second though). This patch does not solve the whole problem but it alleviates it a bit by throttling pgstat reads. One problem with it is that the interval for this increases: /* * Check whether pgstat data still says we need to vacuum this table. * It could have changed if something else processed the table while we * weren't looking. * * FIXME we ignore the possibility that the table was finished being * vacuumed in the last 500ms (PGSTAT_STAT_INTERVAL). This is a bug. */ MemoryContextSwitchTo(AutovacMemCxt); tab = table_recheck_autovac(relid); which could be a problem in itself, by causing unnecessary vacuums. Opinions? -- Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J "In Europe they call me Niklaus Wirth; in the US they call me Nickel's worth. That's because in Europe they call me by name, and in the US by value!" Index: src/backend/postmaster/autovacuum.c === RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/postmaster/autovacuum.c,v retrieving revision 1.58 diff -c -p -r1.58 autovacuum.c *** src/backend/postmaster/autovacuum.c 12 Sep 2007 22:14:59 - 1.58 --- src/backend/postmaster/autovacuum.c 13 Sep 2007 16:57:59 - *** static void avl_sighup_handler(SIGNAL_AR *** 291,296 --- 291,297 static void avl_sigusr1_handler(SIGNAL_ARGS); static void avl_sigterm_handler(SIGNAL_ARGS); static void avl_quickdie(SIGNAL_ARGS); + static void autovac_refresh_stats(void); *** AutoVacLauncherMain(int argc, char *argv *** 488,494 DatabaseListCxt = NULL; DatabaseList = NULL; ! /* Make sure pgstat also considers our stat data as gone */ pgstat_clear_snapshot(); /* Now we can allow interrupts again */ --- 489,498 DatabaseListCxt = NULL; DatabaseList = NULL; ! /* ! * Make sure pgstat also considers our stat data as gone. Note: we ! * musn't use autovac_refresh_stats here. ! */ pgstat_clear_snapshot(); /* Now we can allow interrupts again */ *** rebuild_database_list(Oid newdb) *** 836,842 HTAB *dbhash; /* use fresh stats */ ! pgstat_clear_snapshot(); newcxt = AllocSetContextCreate(AutovacMemCxt, "AV dblist", --- 840,846 HTAB *dbhash; /* use fresh stats */ ! autovac_refresh_stats(); newcxt = AllocSetContextCreate(AutovacMemCxt, "AV dblist", *** do_start_worker(void) *** 1063,1069 oldcxt = MemoryContextSwitchTo(tmpcxt); /* use fresh stats */ ! pgstat_clear_snapshot(); /* Get a list of databases */ dblist = get_database_list(); --- 1067,1073 oldcxt = MemoryContextSwitchTo(tmpcxt); /* use fresh stats */ ! autovac_refresh_stats(); /* Get a list of databases */ dblist = get_database_list(); *** table_recheck_autovac(Oid relid) *** 2258,2264 PgStat_StatDBEntry *dbentry; /* use fresh stats */ ! pgstat_clear_snapshot(); shared = pgstat_fetch_stat_dbentry(InvalidOid); dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId); --- 2262,2268 PgStat_StatDBEntry *dbentry; /* use fresh stats */ ! autovac_refresh_stats(); shared = pgstat_fetch_stat_dbentry(InvalidOid); dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId); *** AutoVacuumShmemInit(void) *** 2734,2736 --- 2738,2759 else Assert(found); } + + /* + * Refresh pgstats data in an autovacuum process, at most every 500 ms. This + * is to avoid rereading the pgstats files too many times in quick succession. + */ + static void + autovac_refresh_stats(void) + { + static TimestampTz last_read = 0; + TimestampTz current_time; + + current_time = GetCurrentTimestamp(); + + if (!TimestampDifferenceExceeds(last_read, current_time, 500)) + return; + + pgstat_clear_snapshot(); + last_read = current_time; + } ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] autovacuum launcher eating too much CPU
Hi, Darcy Buskermolen noticed that when one has many databases, the autovac launcher starts eating too much CPU. I tried it here with 200 databases and indeed it does seem to eat its share. Even with the default naptime, which I wouldn't have thought that was too high (it does make the launcher wake up about three times a second though). I'm looking at a profile and I can't seem to make much sense out of it. It seems to me like the problem is not autovac itself, but rather the pgstat code that reads the stat file from disk. Of course, autovac does need to read the file fairly regularly. Here is the top lines of gprof output. Comments? Is there something here that needs fixing? -- Alvaro Herrerahttp://www.advogato.org/person/alvherre "Investigación es lo que hago cuando no sé lo que estoy haciendo" (Wernher von Braun) Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds secondscalls ms/call ms/call name 40.00 0.02 0.02 113360 0.00 0.00 hash_search_with_hash_value 20.00 0.03 0.0172816 0.00 0.00 AllocSetFreeIndex 20.00 0.04 0.0155517 0.00 0.00 AllocSetAlloc 20.00 0.05 0.01 4247 0.00 0.00 choose_nelem_alloc 0.00 0.05 0.00 396779 0.00 0.00 pg_isblank 0.00 0.05 0.00 119276 0.00 0.00 calc_bucket 0.00 0.05 0.00 113360 0.00 0.00 hash_search 0.00 0.05 0.00 113330 0.00 0.00 hash_uint32 0.00 0.05 0.00 113330 0.00 0.00 oid_hash 0.00 0.05 0.00 105035 0.00 0.00 get_hash_entry 0.00 0.05 0.0055514 0.00 0.00 MemoryContextAlloc 0.00 0.05 0.0041798 0.00 0.00 next_token 0.00 0.05 0.0025512 0.00 0.00 DynaHashAlloc 0.00 0.05 0.0016187 0.00 0.00 TimestampDifferenceExceeds 0.00 0.05 0.00 8742 0.00 0.00 MemoryContextStrdup 0.00 0.05 0.00 8511 0.00 0.00 seg_alloc 0.00 0.05 0.00 8507 0.00 0.00 my_log2 0.00 0.05 0.00 8379 0.00 0.00 read_pg_database_line 0.00 0.05 0.00 8364 0.00 0.00 check_list_invariants 0.00 0.05 0.00 8353 0.00 0.00 lappend 0.00 0.05 0.00 8322 0.00 0.00 backend_read_statsfile 0.00 0.05 0.00 8322 0.00 0.00 pgstat_fetch_stat_dbentry index % timeself childrencalled name [1] 99.60.000.05 reaper [1] 0.000.05 1/1 StartAutoVacLauncher [3] 0.000.00 1/1 load_role [51] 0.000.00 2/3 StartChildProcess [238] 0.000.00 1/60 errstart [179] 0.000.00 1/1 AutoVacuumingActive [289] --- 0.000.05 1/1 StartAutoVacLauncher [3] [2] 99.50.000.05 1 AutoVacLauncherMain [2] 0.000.05 56/56 launch_worker [5] 0.000.00 1/1 rebuild_database_list [27] 0.000.00 2/2 errmsg [78] 0.000.00 2/2 errfinish [101] 0.000.00 1/1 BaseInit [115] 0.000.00 1/4330AllocSetContextCreate [24] 0.000.00 118/118 PostmasterIsAlive [165] 0.000.00 59/59 launcher_determine_sleep [181] 0.000.00 59/59 pg_usleep [182] 0.000.00 59/254 LWLockAcquire [157] 0.000.00 59/254 LWLockRelease [158] 0.000.00 58/288 GetCurrentTimestamp [156] 0.000.00 58/16187 TimestampDifferenceExceeds [135] 0.000.00 10/22 pqsignal [198] 0.000.00 2/60 errstart [179] 0.000.00 1/1 init_ps_display [383] 0.000.00 1/1 InitAuxiliaryProcess [308] 0.000.00 1/116 MemoryContextSwitchTo [166] 0.000.00 1/1 autovac_balance_cost [368] 0.000.00 1/1 proc_exit [403] --- 0.000.05 1/1 reaper [1] [3] 99.50.000.05 1 StartAutoVacLauncher [3] 0.000.05 1/1 AutoVacLauncherMain [2] 0.000.00 1/4