i promised to post this a month ago, but i've been sick. the recent
posting from Brian Buhrow reminded me about this.
--
Chuck Lever - [EMAIL PROTECTED]
U-M ITD Login service team
------- Forwarded Message
Date: Fri, 05 Dec 1997 11:17:44 -0500
From: Chuck Lever <cel>
To: [EMAIL PROTECTED]
Subject: Re: BIG afs caches, chunk size, number of files, etc.
bruewer says:
< Yes, I am interested in this topic, too. Don't you think that a
< large server with, e.g., 50 concurrent users with their home
< directories in AFS could utilize lots of cache?
yes, such users could use lots of cache, but do they have to?
it depends on several things. first, let me describe some of
the client environments i'm responsible for.
"interactive login" clients. these are Sun Ultra 170s
and support up to 200 concurrent users each. all home
directories are in AFS. mostly e-mail users.
web servers. all web pages are in AFS. millions of hits
each per month on several servers.
computation servers. users log in and run intensive jobs.
usually about 20 users logged in to an Ultra 2300.
my login clients have a 64M cache with about 2000 chunks. because
there are 16 of these machines in a DNS pool, there is no guarantee
that a user will get the same client every time; sessions tend to
be short, and hit about 10 files (dot files) during login.
the high turnover rate is a killer -- chunk files are reused
constantly, and there is no benefit to keeping the home directory
files in the cache, since the user may not use the same client
for days.
the computation server has a 300M cache, but only a small number
of files (3000?). the chunk size is large, so we can fill the cache
on the machine; the cache turnover rate is low, so we aren't hit
very often with the overhead of truncating a large chunk file when
we need to re-use it.
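for a sense of scale, here is the per-chunk arithmetic implied by the
two configurations above. the cache sizes and chunk counts come from
the text; the division is mine, and it gives an average figure, not the
configured chunk size:

```python
# Average cache space per chunk file, from the two configurations
# described above (64 MB / ~2000 chunks vs. 300 MB / ~3000 files).

def avg_chunk_kb(cache_mb, nchunks):
    """Average KB of cache available per chunk file."""
    return cache_mb * 1024 / nchunks

login = avg_chunk_kb(64, 2000)      # login clients: ~33 KB per chunk
compute = avg_chunk_kb(300, 3000)   # computation server: ~102 KB per chunk

print(f"login client:   {login:.1f} KB per chunk")
print(f"compute server: {compute:.1f} KB per chunk")
```

so the compute server has roughly three times as much room per chunk,
which is consistent with its larger chunk size and lower turnover.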
so, the cache should be large enough to hold constantly used files
and the "working set" of files needed by users on the machine. in
other words, all 200 users on a login machine should be able to
access their files from the AFS cache while they're logged in --
that's as low an access rate for server files as you can get.
anything larger is unnecessary overhead.
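the working-set rule above can be sketched as a quick calculation.
the 200-user figure is from the text; the per-user footprint (256 KB,
a handful of dot files and other small files) and the 25% slack for
constantly used shared files are invented for illustration, not
measured values:

```python
# Sketch of the working-set sizing rule: cache big enough for every
# active user's files, plus slack for shared, constantly used files.
# The per-user footprint and slack factor are illustrative guesses.

def cache_mb_for(users, per_user_kb, slack=1.25):
    """Cache size (MB) to hold all active users' working sets."""
    return users * per_user_kb * slack / 1024

needed = cache_mb_for(200, 256)   # 200 users, assumed 256 KB each
print(f"{needed:.1f} MB")
```

with these made-up numbers the answer happens to land near the 64M
cache described above, but the point is the method, not the figure.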
using a series of SPEC benchmarks, i've verified that smaller caches
perform measurably better. there are really three areas that
slow down cache access:
1. directory lookups in the cache directory,
2. truncation of chunk files when they are re-used, and
3. inefficient utilization of kernel data structures.
the idea when sizing the cache is to minimize the effects of all
three of these factors. simply put, the best approach is to keep the
cache as small as you can without hitting the servers too hard,
although you can tune large caches with these three factors in mind.
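the trade-off between the first two factors can be illustrated with a
toy model: for a fixed cache size, many small chunks inflate the
directory-lookup cost (factor 1), while a few large chunks inflate the
truncation cost on re-use (factor 2). the cost constants here are
invented purely for illustration; real numbers depend on the
filesystem and hardware:

```python
# Toy model of the chunk-count trade-off for a fixed cache size.
# More chunks -> more files to look up in the cache directory;
# fewer chunks -> bigger files to truncate on re-use.
# Both cost constants are made up for illustration.

LOOKUP_COST_PER_FILE = 0.001   # hypothetical per-file lookup cost
TRUNC_COST_PER_KB = 0.01       # hypothetical truncation cost per KB

def relative_cost(cache_kb, nchunks):
    """Combined lookup + truncation cost for one cache layout."""
    chunk_kb = cache_kb / nchunks
    return nchunks * LOOKUP_COST_PER_FILE + chunk_kb * TRUNC_COST_PER_KB

cache_kb = 64 * 1024
for n in (500, 2000, 8000):
    print(f"{n:5d} chunks -> cost {relative_cost(cache_kb, n):.2f}")
```

the point of the model is only that the two costs pull in opposite
directions, so neither "as many chunks as possible" nor "as few as
possible" is automatically right.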
on a large server, your best performance bet is a very fast
cache disk. on my computation server, the cache partition is a
3-way striped fiber-attached metadevice. the web servers keep their
cache on a single 7200 RPM drive, with the other partitions on the
same physical device used only for low-utilization data.
--
Chuck Lever - [EMAIL PROTECTED]
U-M ITD Login service team
------- End of Forwarded Message