ddaniel says:
< I can see how this is easy to write, but which cache parameters are
< you talking about adjusting, and how? My understanding is that you
< want the cache to be the size of the typical 'working set' of data the
< users on that machine use over some period like a day. Theoretically,
< this working set is a window of data that moves slowly enough through
< time that cache hits are very frequent.
yes, i assume that model as well.
< Are you talking about
< measuring the working set? Would it be used to set -dcache, -volumes,
< -chunksize, cachesize, -stat...? How do you determine the cache hit
< rate?
essentially, i measure the average size of the cache files, the
average number of non-empty cache files, and the average age of
the files. knowing how many users are on the machine, and an
average number of files per user also helps. finally, i also
measure how many files are chunk-sized rather than smaller than
a chunk.
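fwiw, the measurements i describe above can be sketched in a few lines
of python. the cache directory path and the "V" + number naming of the
chunk files are just assumptions about a typical disk-cache client, not
a supported interface, so adjust for your site:

```python
# sketch: summarize an AFS disk cache the way described above --
# average size and age of the non-empty cache files, and what
# fraction of them are a full chunk. assumes cache files are
# named V0, V1, ... under the cache directory (e.g. /usr/vice/cache).
import os
import time

def cache_stats(cachedir, chunk_bytes):
    """Return (count, avg_size, avg_age_secs, chunk_sized_fraction)
    over the non-empty V<n> cache files under cachedir."""
    sizes, ages = [], []
    now = time.time()
    for name in os.listdir(cachedir):
        # only the V<number> chunk files; skip CacheItems, VolumeItems, etc.
        if not (name.startswith("V") and name[1:].isdigit()):
            continue
        st = os.stat(os.path.join(cachedir, name))
        if st.st_size == 0:
            continue
        sizes.append(st.st_size)
        ages.append(now - st.st_mtime)
    if not sizes:
        return (0, 0, 0, 0.0)
    full = sum(1 for s in sizes if s >= chunk_bytes)
    return (len(sizes), sum(sizes) / len(sizes),
            sum(ages) / len(ages), full / len(sizes))
```

dividing users into the file count gives the files-per-user figure i
mention; none of this needs anything fancier than stat(2).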
i've also used the SPEC-SDET benchmark to determine a set of cache
options that prevent performance from degrading over time. this
has to do with corruption of the hash tables that the cache manager
uses to keep track of AFS vnodes and cache files, as well as what
i'm assuming is fragmentation of the CacheItems file. i studied the
hash function and found that keeping -dcache at no more than 500 and
-stat at no more than 8000 will prevent hash function instability.
also, keeping the number of chunk files down will reduce the hit
taken for doing un-hashed UFS directory lookups in the local cache.
using this information, and knowing what to avoid from experience,
i set the options you mention above. for example, on an ethernet
attached Sparc 20 with 128M of RAM and a hyperSPARC CPU running
about 100 pine users, i use these cache parameters:
-blocks 32768 -chunksize 18 -daemons 3 -dcache 500 -files 1800 -stat 1800
-volumes 512 -nosettime
on an Ultra 1 170 workstation with a couple of statistical users, i
use these parameters:
-blocks 49152 -chunksize 19 -daemons 2 -dcache 500 -files 1800 -stat 1800
-volumes 192 -nosettime
on david's web servers, we found these cache parameters work well:
-blocks 155000 -chunksize 15 -files 5000 -stat 2000 -dcache 500 -daemons 8
-volumes 1024 -nosettime
on a FDDI attached 4 CPU Sparc 1000 serving a small community of
computationally intensive users, i use:
-blocks 393216 -chunksize 21 -daemons 7 -dcache 500 -files 7700 -stat 1800
-volumes 768 -nosettime
< > AFS home directories) can get by great with a 32M cache and 1800 files.
< > using the same tools i used on these machines, i looked at dave's
<
< Really? I thought the working set of that many users would be much
< larger.
the average file size is small, and the average number of files per
user is small. now, if our user community were MH users, i'm sure
this would be different -- the average file size would still be
small, but each user would have many more files.
< > the biggest problem with AFS caches is the number of disk writes.
< > these are generally slow and synchronous (even though they are
< > buffered), so if you are hitting the cache hard, disk write request
< > queuing will slow you down. one of the performance problems with
<
< We have a web server which is almost exclusively read-only with
< respect to AFS. Does this change the situation enough that larger
< cache sizes (>150MB) will indeed be more helpful?
depends. if you are serving lots of large files, then increasing
-chunksize and -blocks will probably increase the amount of data
served directly from the cache. if you are serving lots of small
files, increasing -files will hit the non-hashed UFS directory lookup
problem.
i remember that on dave's server the average cache file size was
smaller than we had expected, given the preponderance of graphical
data on web pages. that's why his chunksize is 15 (32K chunks).
decreasing the chunksize wasn't really about raw performance; if i
remember right, the smaller chunks keep the cache from overflowing
given the number of blocks and cache files we specified.
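to make the overflow arithmetic concrete, here is a rough sketch.
i'm assuming -blocks counts 1K blocks and -chunksize is a log2 byte
count (which matches the 32K figure above); the worst case of every
cache file holding a full chunk should stay near or below -blocks:

```python
# rough check of cache sizing: each cache file can grow to one full
# chunk, so -files full chunks is the worst-case cache footprint.
def chunk_bytes(chunksize):
    """-chunksize is an exponent: chunk size in bytes is 2**chunksize."""
    return 2 ** chunksize

def worst_case_kb(files, chunksize):
    """Worst-case footprint of `files` full chunks, in 1K blocks."""
    return files * chunk_bytes(chunksize) // 1024

# dave's web server: -blocks 155000 -chunksize 15 -files 5000
print(chunk_bytes(15))            # 32768 bytes, i.e. 32K chunks
print(worst_case_kb(5000, 15))    # 160000 1K-blocks vs. -blocks 155000
```

with chunksize 19 instead, 5000 full chunks would want 2560000 blocks,
so dropping the chunksize is what keeps the worst case close to the
blocks we actually gave the cache.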
< > a) using memory cache (still see heavy write request load;
< > haven't figured out why)
<
< I remember reading a paper about this. I wish I could remember which
< one. Basically, I think the code path to look up this information is
< very long and unoptimized (no hints).
well, i'm very interested in investigating the performance of memory-
only or tmpfs caches. i think those would be the best performing
caches for everything but supercomputers with huge datasets.
Chuck Lever - [EMAIL PROTECTED]
U-M ITD Login service team