i promised to post this a month ago, but i've been sick. the recent
posting from Brian Buhrow reminded me about this.
--
Chuck Lever - [EMAIL PROTECTED]
U-M ITD Login service team
------- Forwarded Message
Date: Fri, 05 Dec 1997 11:17:44 -0500
From: Chuck Lever <cel>
To: [EMAIL PROTECTED]
Subject: Re: BIG afs caches, chunk size, number of files, etc.
bruewer says:
< Yes, I am interested in this topic, too. Don't you think that a
< large server with, e.g., 50 concurrent users with their home
< directories in AFS could utilize lots of cache?
yes, such users could use lots of cache, but do they have to?
it depends on several things. first, let me describe some of
the client environments i'm responsible for.
"interactive login" clients. these are Sun Ultra 170s
and support up to 200 concurrent users each. all home
directories are in AFS. mostly e-mail users.
web servers. all web pages are in AFS. millions of hits
each per month on several servers.
computation servers. users log in and run intensive jobs.
usually about 20 users logged in to an Ultra 2300.
my login clients have a 64M cache with about 2000 chunks. because
there are 16 of these machines in a DNS pool, there is no guarantee
that a user will get the same client every time; sessions tend to
be short, and hit about 10 files (dot files) during login.
the high turnover rate is a killer -- chunk files are reused
constantly, and there is no benefit to keeping the home directory
files in the cache, since the user may not use the same client
for days.
the computation server has a 300M cache, but only a small number
of files (3000?). the chunk size is large, so we can fill the cache
on the machine; the cache turnover rate is low, so we aren't hit
very often with the overhead of truncating a large chunk file when
we need to re-use it.
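for a sense of scale, here is the per-chunk arithmetic implied by the
two configurations above. the cache sizes and chunk counts come from
the text; the division is mine, and it gives an average figure, not the
configured chunk size:

```python
# Average cache space per chunk file, from the two configurations
# described above (64 MB / ~2000 chunks vs. 300 MB / ~3000 files).

def avg_chunk_kb(cache_mb, nchunks):
    """Average KB of cache available per chunk file."""
    return cache_mb * 1024 / nchunks

login = avg_chunk_kb(64, 2000)      # login clients: ~33 KB per chunk
compute = avg_chunk_kb(300, 3000)   # computation server: ~102 KB per chunk

print(f"login client:   {login:.1f} KB per chunk")
print(f"compute server: {compute:.1f} KB per chunk")
```

so the compute server has roughly three times as much room per chunk,
which is consistent with its larger chunk size and lower turnover.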
so, the cache should be large enough to hold constantly used files
and the "working set" of files needed by users on the machine. in
other words, all 200 users on a login machine should be able to
access their files from the AFS cache while they're logged in --
that's as low an access rate for server files as you can get.
anything larger is unnecessary overhead.
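the working-set rule above can be sketched as a quick calculation.
the 200-user figure is from the text; the per-user footprint (256 KB,
a handful of dot files and other small files) and the 25% slack for
constantly used shared files are invented for illustration, not
measured values:

```python
# Sketch of the working-set sizing rule: cache big enough for every
# active user's files, plus slack for shared, constantly used files.
# The per-user footprint and slack factor are illustrative guesses.

def cache_mb_for(users, per_user_kb, slack=1.25):
    """Cache size (MB) to hold all active users' working sets."""
    return users * per_user_kb * slack / 1024

needed = cache_mb_for(200, 256)   # 200 users, assumed 256 KB each
print(f"{needed:.1f} MB")
```

with these made-up numbers the answer happens to land near the 64M
cache described above, but the point is the method, not the figure.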
using a series of SPEC benchmarks, i've verified that smaller caches
perform measurably better. there are really three areas that
slow down cache access:
1. directory lookups in the cache directory,
2. truncation of chunk files when they are re-used, and
3. inefficient utilization of kernel data structures.
the idea when sizing the cache is to minimize the effects of all
three of these factors. simply put, the best approach is to keep the
cache as small as you can without hitting the servers too hard,
although you can tune large caches with these three factors in mind.
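the trade-off between the first two factors can be illustrated with a
toy model: for a fixed cache size, many small chunks inflate the
directory-lookup cost (factor 1), while a few large chunks inflate the
truncation cost on re-use (factor 2). the cost constants here are
invented purely for illustration; real numbers depend on the
filesystem and hardware:

```python
# Toy model of the chunk-count trade-off for a fixed cache size.
# More chunks -> more files to look up in the cache directory;
# fewer chunks -> bigger files to truncate on re-use.
# Both cost constants are made up for illustration.

LOOKUP_COST_PER_FILE = 0.001   # hypothetical per-file lookup cost
TRUNC_COST_PER_KB = 0.01       # hypothetical truncation cost per KB

def relative_cost(cache_kb, nchunks):
    """Combined lookup + truncation cost for one cache layout."""
    chunk_kb = cache_kb / nchunks
    return nchunks * LOOKUP_COST_PER_FILE + chunk_kb * TRUNC_COST_PER_KB

cache_kb = 64 * 1024
for n in (500, 2000, 8000):
    print(f"{n:5d} chunks -> cost {relative_cost(cache_kb, n):.2f}")
```

the point of the model is only that the two costs pull in opposite
directions, so neither "as many chunks as possible" nor "as few as
possible" is automatically right.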
on a large server, your best performance bet is a very fast
cache disk. on my computation server, the cache partition is a
3-way striped fiber-attached metadevice. the web servers keep their
cache on a single 7200 RPM drive, with the other partitions on the
same physical device used only for low-utilization data.
--
Chuck Lever - [EMAIL PROTECTED]
U-M ITD Login service team
------- End of Forwarded Message