- It seems that there is an implicit assumption that a mail spool area
should consist of a single directory containing one file per user,
which holds that user's mail. This is a holdover from antiquity, and
clearly does not scale well to systems that support thousands of users.
Furthermore, it's ridiculously simple to change to a hierarchical
model, in which case the issue of file-level access control becomes
rather moot.
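Just to make "hierarchical" concrete, here's one way a delivery agent
could lay the spool out; the root path and the two-level split below
are purely my own illustrative assumptions, not anything AMS or AFS
actually mandates:

    # Sketch only: hash the username into a couple of directory levels so
    # that no single directory ever holds thousands of entries.  The paths
    # here are made up for illustration.
    import os

    SPOOL_ROOT = "/afs/example.edu/mail"        # hypothetical spool root

    def spool_dir(user):
        """e.g. spool_dir("jdoe") -> /afs/example.edu/mail/j/jd/jdoe"""
        return os.path.join(SPOOL_ROOT, user[0], user[:2], user)

    def ensure_spool(user):
        d = spool_dir(user)
        os.makedirs(d, exist_ok=True)           # one directory per user,
        return d                                # with its own ACL

Since each user then owns a directory rather than a single file in one
shared directory, access control falls out of the directory ACLs AFS
already provides.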
Excerpts from transarc.external.info-afs: 6-Jul-94 Re: e-mail over AFS !
Steve [EMAIL PROTECTED] (1643)
> If people are truncating their mail files (rather than deleting them),
> then the directory itself doesn't change all that often. There are
> callbacks on the individual file -- but presumably only you and the
> delivery agent write to it, and no one else is trying to read it (:-)
bingo.
Excerpts from transarc.external.info-afs: 6-Jul-94 Re: e-mail over AFS !
Steve [EMAIL PROTECTED] (1643)
> DFS will behave differently in some respects, not least because of the
> new token management scheme for obtaining read or write permission on
> a file or directory. More details would have to come from someone more
> familiar with Episode or token manager internals.
I'm not sure what you're hinting at here, Steve. I don't see what
difference token management makes. Ugh, maybe I do. It's _possible_
that you might implement a mail system on top of DFS which uses
/usr/spool/mail files, one per user, with multiple delivery agents, each
of which can append mail to the same user's file. Ugh. Let's pitch
this idea of a single spool file, shall we? It's got nasty performance
implications, and you can only keep 2 GB of old mail around without
creating a new mail file.
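To make the alternative concrete: deliver each message as its own file
in the recipient's spool directory. There is then no single file to
append to, nothing that creeps toward a 2 GB limit, and no lock for
competing delivery agents to fight over. A rough sketch, with a made-up
naming scheme (time + pid + sequence + hostname, chosen so independent
delivery agents, even on different hosts, never pick the same name):

    # Sketch: one file per message.  Write under a temporary name, then
    # rename, so a reader never sees a half-delivered message.  The naming
    # scheme is an assumption, not how AMS actually delivers mail.
    import os, socket, time

    _seq = 0                                    # per-process tiebreaker

    def deliver(maildir, message_bytes):
        global _seq
        _seq += 1
        name = "%d.%d.%d.%s" % (int(time.time()), os.getpid(), _seq,
                                socket.gethostname())
        tmp = os.path.join(maildir, "." + name + ".tmp")
        with open(tmp, "wb") as f:
            f.write(message_bytes)
            f.flush()
            os.fsync(f.fileno())                # data on disk before rename
        final = os.path.join(maildir, name)
        os.rename(tmp, final)                   # atomic within one directory
        return final

Expiring old mail then means unlinking files instead of rewriting a
giant spool file, and two delivery agents never touch the same file.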
Excerpts from transarc.external.info-afs: 6-Jul-94 Re: e-mail over AFS !
Bob [EMAIL PROTECTED] (1604)
> Additionally, I think where AMS took its biggest performance hit wasn't
> in the mail service per se, but in integrating BBSs and netnews into a
> common messaging system. When hundreds of individuals browsed thousands
> of files from common volumes, you had tremendous strain on the servers
> -- especially since the BBSs were dynamic and writable, enforcing
> callbacks from all active clients whenever a news article changed.
-- news articles never change --
I agree that this was a problem at CMU, but it was greatly aggravated by
a few AFS bugs, mostly fixed now (but not in the andrew servers, more's
the pity). The issue wasn't so much thousands of files from common
volumes as thousands of files from common directories... The files
themselves never change, so the server doesn't have to break callbacks
on them. The problem is that the *directories* change frequently, at a
rate directly proportional to the *popularity* of the particular
bboard. Incidentally, the popularity of a bboard is directly
proportional to the *size* of the directory (yep, a single directory
per bboard), which in turn is directly proportional to the cost of
re-fetching the directory in order to look up subsequent files. For
some of the most popular bboards (you know which ones), at peak times
the interval between updates to the directory was shorter than the time
to fetch the directory from the server. In other words, every lookup
required every cache manager to fetch a large directory from the
server. Incidentally, the DFS cache manager does not fetch entire
directories to do lookups; it uses a lookup RPC to the file server.
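To put rough numbers on "shorter than the time to fetch the directory"
(every figure below is invented for illustration; I don't have real
measurements handy):

    # Back-of-the-envelope: when does a cache manager wind up refetching
    # the directory on essentially every lookup?  All numbers are
    # assumptions, not measurements.
    entries         = 20000          # articles in one busy bboard directory
    bytes_per_entry = 32             # rough size of one directory entry
    dir_size        = entries * bytes_per_entry      # 640,000 bytes
    throughput      = 100 * 1024     # bytes/sec the server can push to a client
    fetch_time      = dir_size / float(throughput)   # about 6 seconds
    post_interval   = 4.0            # seconds between postings at peak

    if post_interval < fetch_time:
        print("stale before the fetch even finishes:")
        print("every lookup turns into another full directory fetch")

There are two obvious ways to deal with this entire constellation of
problems.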
One is to turn some or all of the bboard volumes into read-only clones.
Thus, no callbacks. What's more, you can replicate the popular bboards
on multiple servers. Apparently, the andrew folks had some problems
with making bboards read-only:
1. Users expect to see their postings appear soon after making them,
so the release interval must be reasonably short (like, 15 minutes).
It seems to me that this expectation could be changed.
2. The process of creating a read-only clone of a busy volume
apparently overloaded the file server. I believe this was partly due to
a combination of two factors -- (a) the fact that the read-write parent
volume was inaccessible for the duration of a clone prior to AFS 3.3
(3.2? 3.2a? I can never remember which releases contain what...), which
would cause a backlog of requests for the volume to queue up pending
completion of the clone, followed by a load spike, and (b) the
"meltdown" bug fixed in AFS 3.3 (3.2b?) which could be triggered by such
a load spike, if it was large enough.
Since these problems have been fixed, it would be interesting to see the
results of a repeated experiment, unlikely as that may be.
The other solution, which was chosen by the andrew admins, is to simply
disable updates to the most popular bboards during peak times, batching
them up for off-peak processing.
Excerpts from transarc.external.info-afs: 6-Jul-94 Re: e-mail over AFS !
Bob [EMAIL PROTECTED] (1604)
> The one unavoidable bottleneck in distributing a large-scale mail
> service within DFS might be in serializing new mail delivery.
> Optimally, you'd like to use a single address for mail delivery, which
> means that a single host would be responsible for injecting everybody's
> new mail into DFS. I'd think that this could still scale to thousands
> of users -- but tens of thousands?
I don't see why you would need (or want) to serialize mail delivery, or
why a single address is optimal, or... We actually have two mail
servers now.
Excerpts from transarc.external.info-afs: 6-Jul-94 Re: e-mail over AFS !
Bob [EMAIL PROTECTED] (1604)
> AMS employs a slick method of checking for new
> mail without involving direct server access.
I'm not familiar with this method. Could someone elucidate, please?