> 1. How does AFS 3.4 handle the loss of a database server? Can
> read-only volumes continue to be read, as long as they are accessible?
> What experience have users had with this? AFS read-only volumes could
> solve some of our data availability requirements if the implementation is
> robust.
It's been my experience that AFS deals quite well with the loss of a
database server, as long as there are others to pick up the slack. If
you're interested in any amount of fault tolerance, you want to have at
least 3 database servers (the voting mechanism needs a majority of the
servers to stay up, so with only two servers the loss of one can leave
the survivor unable to hold an election).
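For what it's worth, here's a rough model of the election rule as I
understand it; this is just an illustration of the majority math, not
Transarc's code, and the half-vote tie-breaker detail is from memory:

    # Toy model of the database-server election: a sync site needs votes
    # from more than half of all database servers, and (as I recall) the
    # server with the lowest IP address gets a half-vote tie-breaker.
    def quorum_survives(total_servers, up_servers, lowest_ip_up):
        votes = up_servers + (0.5 if lowest_ip_up else 0.0)
        return votes > total_servers / 2.0

    print(quorum_survives(3, 2, lowest_ip_up=True))    # True
    print(quorum_survives(3, 2, lowest_ip_up=False))   # True
    print(quorum_survives(2, 1, lowest_ip_up=True))    # True
    print(quorum_survives(2, 1, lowest_ip_up=False))   # False

The point is simply that with three servers you can lose any one of them
and still hold an election; with two, it depends on which one you lose.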
It deals reasonably well with replicated RO volumes becoming unavailable,
again as long as there are other copies. There is a bit of failover time,
but the cache manager will remember that a particular server has failed
and continue to access the RO volume from another copy.
> 2. Does anyone have experience with AFS 3.4 and Sun Online Disk Suite
> under Solaris 2.4? High availability of rw volumes is important. How
> about AFS and RAID?
In general, AFS fileserver partitions can only be located on UFS filesystems.
As long as that requirement is met, I don't see any problems. We haven't
tried putting a vice partition on a RAID device yet, but we do have a
project group that's mildly interested in trying it.
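For what it's worth, a vice partition is nothing special from the OS's
point of view: it's just a UFS filesystem mounted at /vicepa, /vicepb,
and so on, which on Solaris means an ordinary vfstab entry. The second
line below is a guess at what an Online DiskSuite metadevice would look
like; the device names are made up, and as I said, we haven't actually
tried the RAID case ourselves:

    /dev/dsk/c0t1d0s6   /dev/rdsk/c0t1d0s6   /vicepa   ufs   2   yes   -
    /dev/md/dsk/d10     /dev/md/rdsk/d10     /vicepb   ufs   2   yes   -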
I would stay away from Solaris fileservers, at least for a while; people
I've talked to have indicated that they've had problems with the servers
on Solaris machines. Others on this list might tell you differently,
though; YMMV.
> 4. Has AFS 3.4 improved on the caching access model? That is to say,
> does 3.4 still have the "one user, one workstation" access pattern in
> mind? Would it be acceptable in terms of performance and reliability to
> have the HTTP clients read off of read-only volumes, or would it be
> preferable to simply use a utility like synctree to copy the read-only
> data to local disk on each server? Disk is cheap, so I am leaning
> towards the latter, unless someone can demonstrate otherwise.
The problem with using something like synctree is that you'll have to
resynchronize periodically, and clients don't see changes until you do.
Furthermore, if you have several WWW servers (AFS clients), you'll want
to take great care to stagger when they synchronize; otherwise they'll
all be doing the same (fairly heavy) AFS access at the same time, which
will give you interesting usage peaks.
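Here's a minimal sketch of one way to stagger the runs: each host derives
a stable offset from its own hostname and sleeps that long before
syncing. The paths and the synctree command line are placeholders, so
adjust them for your setup:

    import hashlib, socket, subprocess, time

    SYNC_WINDOW = 30 * 60   # spread the runs over a 30-minute window
    SOURCE = "/afs/your.cell/project/www/htdocs"   # example AFS path
    DEST = "/usr/local/www/htdocs"                 # local copy

    # Hash the hostname so every host gets a different, repeatable delay.
    host = socket.gethostname()
    offset = int(hashlib.md5(host.encode()).hexdigest(), 16) % SYNC_WINDOW
    time.sleep(offset)

    # Placeholder invocation; check synctree's real options before using.
    subprocess.run(["synctree", SOURCE, DEST], check=True)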
Rob Earhart ([EMAIL PROTECTED]) has developed an HTTP server which
talks directly to the AFS fileservers, bypassing Transarc's cache manager.
This results in improved performance for the WWW server, and permits
other work to get done on the machine running the server. I understand
he's presenting some of this work at Decorum '96.
-- Jeffrey T. Hutzelman (N3NHS) <[EMAIL PROTECTED]>
Systems Programmer, CMU SCS Research Facility
Please send requests and problem reports to [EMAIL PROTECTED]