Jeffrey,
I appreciate your lengthy reply; you've confirmed many of the things I was wondering about. The big issue with the server situation is that a dying disk will in fact kill the entire server, as these are low-budget whiteboxes with basic SATA controllers, nothing particularly impressive. After reading John Hascall's post I am extremely interested in using DRBD to distribute filesystem updates. This seems a more appropriate solution for my needs, because unfortunately I don't have access to 'proper' servers, and the Linux support for the SATA controller on these motherboards (yes, I know, embedded controllers are Satan) is extremely poor.

Many thanks for taking the time to help me. I may even attempt to combine DRBD with AFS, because we will shortly be opening a second staffed site, meaning I will require some form of 'Global Filesystem', if you will (no implication of GFS).

Thanks again,
Paul


Jeffrey Altman wrote:
Paul Robins wrote:


Well, that's what I was originally wondering: can AFS replicate the
contents of one fileserver to others which can be used redundantly?
It appears not at all; I'd still like to use AFS but I do think I'm
going to have to go NFS and then some sort of faux RAID 1 for
redundancy.


Paul:

The real question you have to answer is what risks are you concerned
about?   What is the likelihood that you are going to lose an entire
server without warning in such a manner that it makes a difference to
the clients that would be communicating with it?

The reason I specify "without warning" is that AFS far surpasses the
capabilities of other file systems in the area of volume management.
You said earlier in the thread that your biggest fear was losing a
disk.   So we can make that your warning sign.  For each file server
you deploy, use mirrored disks (RAID-1) with each disk on its
own interface card.   Then deploy your file servers and leave enough
empty space on each of the servers such that if necessary you can
move all of the volumes on any one server to any of the other servers.
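
As a rough illustration of that headroom rule, here is a minimal Python sketch. The server names and sizes are made up, and it assumes the strict reading: any single other server must be able to absorb all of the volumes from the server being emptied.

```python
from typing import Dict, Tuple

# Hypothetical inventory: server name -> (capacity in GB, GB used by volumes).
Servers = Dict[str, Tuple[int, int]]


def can_evacuate_any(servers: Servers) -> bool:
    """Return True if each server's volumes fit in the free space of
    every other server taken individually (the strict headroom rule)."""
    if len(servers) < 2:
        # With a single server there is nowhere to move volumes to.
        return False
    for src, (_, src_used) in servers.items():
        for dst, (dst_cap, dst_used) in servers.items():
            if dst == src:
                continue
            if dst_cap - dst_used < src_used:
                return False
    return True


# Made-up example: three 500 GB servers.
example = {"fs1": (500, 200), "fs2": (500, 220), "fs3": (500, 250)}
print(can_evacuate_any(example))
```

If you are willing to spread one server's volumes across several destinations, a weaker check on the combined free space of the remaining servers would be enough.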

Now, if a disk ever fails, the operation of the file server will be
uninterrupted.   You can then initiate volume moves of the
non-replicated read-write volumes to other servers.  These moves can
be performed while the clients are actively using the volumes.  The
clients will continue using the source server until the move is almost
complete.  There is then a brief busy state during which the client
waits, followed by a moved notification, to which the client responds
by looking up the new location and continuing where it left off on the
new server.
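
To make the evacuation step concrete, here is a small Python sketch that only builds the `vos move` command lines for emptying one server; it does not run them. The server names, volume names, sizes, and the greedy "most free space first" placement are illustrative assumptions, not a recommended policy, and the destination partition is assumed to be /vicepa.

```python
from typing import Dict, List


def plan_moves(volumes: Dict[str, int], src: str, src_part: str,
               targets: Dict[str, int],
               dst_part: str = "/vicepa") -> List[List[str]]:
    """Build one `vos move` argument list per volume.

    volumes: volume name -> size in GB (hypothetical figures).
    targets: destination server -> free space in GB.
    Placement is a simple greedy choice: largest volume first, onto the
    destination that currently has the most free space.
    """
    free = dict(targets)  # work on a copy so the caller's dict is untouched
    commands = []
    for name, size in sorted(volumes.items(), key=lambda kv: kv[1],
                             reverse=True):
        dst = max(free, key=free.get)
        if free[dst] < size:
            raise ValueError(f"no destination has room for {name}")
        free[dst] -= size
        commands.append(["vos", "move", "-id", name,
                         "-fromserver", src, "-frompartition", src_part,
                         "-toserver", dst, "-topartition", dst_part])
    return commands


# Made-up example: empty fs1 onto fs2 and fs3.
moves = plan_moves({"user.paul": 40, "proj.web": 25, "user.jeff": 10},
                   src="fs1.example.org", src_part="/vicepa",
                   targets={"fs2.example.org": 50, "fs3.example.org": 45})
for cmd in moves:
    print(" ".join(cmd))
```

Each printed line can be reviewed and then run by hand as an administrator; the clients keep working against the source server while each move is in progress.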

Once all of the volumes have been moved off the server, you can take
the server down and replace the disk or perform whatever form of
maintenance that is required.

In the recent past I have seen more outages for end users caused by
the need to reconfigure non-Andrew file systems, either for volume
redistribution or physical maintenance, than by physical failures
in AFS deployments.   AFS volume management allows you to perform more
frequent maintenance of the hardware and the OS without impacting
end users than other models do.

While a network based RAID-5 is a fine idea, the performance is really
going to be quite poor from the perspective of end users even when the
machines are physically quite close.   Network RAIDs have the potential
to provide redundancy when whole portions of the network infrastructure
are lost.  However, they do so at a significant cost in performance.

Jeffrey Altman

_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info
