I've been using the ARC debug options to analyse memory usage on the PostgreSQL 8.0 server. This is a precursor to more complex performance analysis work on the OSDL test suite.
I've simplified some of the ARC reporting into a single log line, which is enclosed here as a patch on freelist.c. This includes reporting of:

- the total memory in use (the number of buffers on T1 plus T2), which wasn't previously reported
- the cache hit ratio, which was slightly incorrectly calculated (the old total also counted hits on B1 and B2)
- a useful-ish value for looking at the "B" lists in ARC

(This is a patch against cvstip, but I'm not sure whether this has potential for inclusion in 8.0...)

The total memory in use is useful because it allows you to tell whether shared_buffers is set too high. If it is set too high, then memory usage will continue to grow slowly up to the max, without any corresponding increase in cache hit ratio. If shared_buffers is too small, then memory usage will climb quickly and linearly to its maximum.

The last one I've called "turbulence" in an attempt to ascribe some useful meaning to B1/B2 hits - I've tried a few other measures, though without much success. Turbulence is the sum of the hit ratios of the B1 and B2 lists. By observation, this is zero when ARC gives smooth operation, and goes above zero otherwise. Typically, turbulence occurs when shared_buffers is too small for the working set of the database/workload combination and ARC repeatedly re-balances the lengths of T1/T2 as a result of "near-misses" on the B1/B2 lists. Turbulence doesn't usually cut in until the cache is fully utilized, so there is usually some delay after startup.

We also recently discussed that I would add some further memory analysis features for 8.1, so I've been trying to figure out how.

The idea that B1, B2 represent something really useful doesn't seem to have been borne out - though I'm open to persuasion there.

I originally envisaged a "shadow list" operating as an extension of the main ARC list. This will require some re-coding, since the variables and macros are all hard-coded to a single set of lists. No complaints, just that it will take a little longer than we all thought (for me, that is...)
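To make the three summary figures above concrete, here is a minimal standalone sketch of the arithmetic. It is not the patch itself: ArcStats, ArcSummary and arc_summarize() are hypothetical names standing in for the counters kept in StrategyControl, and it follows the patch in computing each per-list percentage with integer division before adding them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the per-list counters in StrategyControl. */
enum { LIST_B1, LIST_T1, LIST_T2, LIST_B2, NUM_LISTS };

typedef struct ArcStats
{
    long num_lookup;            /* total buffer lookups */
    long num_hit[NUM_LISTS];    /* hits per ARC list */
    long t1_len;                /* current length of T1 */
    long t2_len;                /* current length of T2 */
} ArcStats;

typedef struct ArcSummary
{
    long buf_used;              /* buffers on T1 + T2 */
    long cache_hit;             /* percent of lookups found in T1 or T2 */
    long turbulence;            /* percent of lookups found in B1 or B2 */
} ArcSummary;

/* Per-list hit percentage, truncated like the integer maths in the patch. */
static long
hit_pct(const ArcStats *s, int list)
{
    return s->num_hit[list] * 100 / s->num_lookup;
}

static ArcSummary
arc_summarize(const ArcStats *s)
{
    ArcSummary r = {0, 0, 0};

    assert(s != NULL);

    /* As in the patch, everything is reported as zero before any lookups. */
    if (s->num_lookup == 0)
        return r;

    /* Only T1/T2 hold real pages, so only they count as cache hits. */
    r.cache_hit = hit_pct(s, LIST_T1) + hit_pct(s, LIST_T2);

    /* B1/B2 hits are "near-misses": the page was recently evicted. */
    r.turbulence = hit_pct(s, LIST_B1) + hit_pct(s, LIST_B2);

    r.buf_used = s->t1_len + s->t2_len;
    return r;
}
```

For example, with 200 lookups, 60 hits on T1, 100 on T2, 10 on B1 and 6 on B2, this gives a cache hit ratio of 80% and a turbulence of 8%.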
My proposal is to alter the code to allow an array of memory linked lists. The actual list would be [0] - other additional lists would be created dynamically as required, i.e. not using #ifdefs, since I want this to be controlled by a SIGHUP GUC to allow on-site tuning, not just lab work. This will then allow reporting against the additional lists, so that cache hit ratios can be seen with various other "prototype" shared_buffers settings.

Any thoughts?

--
Best Regards, Simon Riggs
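As a rough illustration of the "array of lists" idea, the sketch below keeps several shadow lists of different sizes and feeds every page access to all of them, so a hit ratio can be read off for each prototype size. It is only a sketch under loud assumptions: each shadow list is a plain LRU rather than ARC, it stores bare page numbers rather than buffer tags, nothing here touches a GUC or shared memory, and all names (ShadowList, shadow_init, shadow_access_all, shadow_hit_pct) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One shadow list: remembers page numbers only, never page contents. */
typedef struct ShadowList
{
    int   capacity;     /* prototype cache size, in pages */
    int   len;          /* entries currently remembered */
    long *pages;        /* pages[0] is the most recently used */
    long  lookups;
    long  hits;
} ShadowList;

/* Returns 1 on success, 0 on a bad capacity or allocation failure. */
static int
shadow_init(ShadowList *l, int capacity)
{
    if (capacity < 1)
        return 0;
    l->capacity = capacity;
    l->len = 0;
    l->lookups = 0;
    l->hits = 0;
    l->pages = malloc(sizeof(long) * (size_t) capacity);
    return l->pages != NULL;
}

static void
shadow_free(ShadowList *l)
{
    free(l->pages);
    l->pages = NULL;
}

/* Record one access in a single list using simple LRU replacement. */
static void
shadow_access(ShadowList *l, long page)
{
    int i;
    int pos = -1;

    assert(l->capacity > 0);
    l->lookups++;

    for (i = 0; i < l->len; i++)
    {
        if (l->pages[i] == page)
        {
            pos = i;
            break;
        }
    }

    if (pos >= 0)
        l->hits++;              /* would have been a cache hit */
    else if (l->len < l->capacity)
        pos = l->len++;         /* list not full yet: grow by one */
    else
        pos = l->len - 1;       /* full: overwrite the LRU entry */

    /* Shift entries [0, pos) down one slot and put the page at the front. */
    memmove(&l->pages[1], &l->pages[0], sizeof(long) * (size_t) pos);
    l->pages[0] = page;
}

/* Feed the same access to every list; lists[0] would be the real one. */
static void
shadow_access_all(ShadowList *lists, int nlists, long page)
{
    int i;

    for (i = 0; i < nlists; i++)
        shadow_access(&lists[i], page);
}

static long
shadow_hit_pct(const ShadowList *l)
{
    return l->lookups ? l->hits * 100 / l->lookups : 0;
}
```

Cycling repeatedly over three pages shows the kind of comparison intended: a shadow list of 2 pages never hits, while one of 4 pages hits on everything after the first pass. The linear search is deliberate for brevity; a real version would want a hash lookup.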
Index: freelist.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/storage/buffer/freelist.c,v
retrieving revision 1.48
diff -d -c -r1.48 freelist.c
*** freelist.c 16 Sep 2004 16:58:31 -0000 1.48
--- freelist.c 22 Oct 2004 18:15:38 -0000
***************
*** 126,131 ****
--- 126,133 ----
          if (StrategyControl->stat_report + DebugSharedBuffers < now)
          {
              long        all_hit,
+                         buf_used,
+                         b_hit,
                          b1_hit,
                          t1_hit,
                          t2_hit,
***************
*** 155,161 ****
              }
  
              if (StrategyControl->num_lookup == 0)
!                 all_hit = b1_hit = t1_hit = t2_hit = b2_hit = 0;
              else
              {
                  b1_hit = (StrategyControl->num_hit[STRAT_LIST_B1] * 100 /
--- 157,163 ----
              }
  
              if (StrategyControl->num_lookup == 0)
!                 all_hit = buf_used = b_hit = b1_hit = t1_hit = t2_hit = b2_hit = 0;
              else
              {
                  b1_hit = (StrategyControl->num_hit[STRAT_LIST_B1] * 100 /
***************
*** 166,181 ****
                            StrategyControl->num_lookup);
                  b2_hit = (StrategyControl->num_hit[STRAT_LIST_B2] * 100 /
                            StrategyControl->num_lookup);
!                 all_hit = b1_hit + t1_hit + t2_hit + b2_hit;
              }
  
              errcxtold = error_context_stack;
              error_context_stack = NULL;
!             elog(DEBUG1, "ARC T1target=%5d B1len=%5d T1len=%5d T2len=%5d B2len=%5d",
                   T1_TARGET, B1_LENGTH, T1_LENGTH, T2_LENGTH, B2_LENGTH);
!             elog(DEBUG1, "ARC total =%4ld%% B1hit=%4ld%% T1hit=%4ld%% T2hit=%4ld%% B2hit=%4ld%%",
                   all_hit, b1_hit, t1_hit, t2_hit, b2_hit);
!             elog(DEBUG1, "ARC clean buffers at LRU T1= %5d T2= %5d",
                   t1_clean, t2_clean);
              error_context_stack = errcxtold;
  
--- 168,187 ----
                            StrategyControl->num_lookup);
                  b2_hit = (StrategyControl->num_hit[STRAT_LIST_B2] * 100 /
                            StrategyControl->num_lookup);
!                 all_hit = t1_hit + t2_hit;
!                 b_hit = b1_hit + b2_hit;
!                 buf_used = T1_LENGTH + T2_LENGTH;
              }
  
              errcxtold = error_context_stack;
              error_context_stack = NULL;
!             elog(DEBUG1, "shared_buffers used=%8ld cache hits=%4ld%% turbulence=%4ld%%",
!                  buf_used, all_hit, b_hit);
!             elog(DEBUG2, "ARC T1target=%5d B1len=%5d T1len=%5d T2len=%5d B2len=%5d",
                   T1_TARGET, B1_LENGTH, T1_LENGTH, T2_LENGTH, B2_LENGTH);
!             elog(DEBUG2, "ARC total =%4ld%% B1hit=%4ld%% T1hit=%4ld%% T2hit=%4ld%% B2hit=%4ld%%",
                   all_hit, b1_hit, t1_hit, t2_hit, b2_hit);
!             elog(DEBUG2, "ARC clean buffers at LRU T1= %5d T2= %5d",
                   t1_clean, t2_clean);
              error_context_stack = errcxtold;
  