Hi -
There are several vendors now offering "failsafe" capabilities. For
instance, having a dual ported dual redundant RAID controller, so that
two systems can access (different) RAID sets on the same set of RAID
controllers, and if one server fails, the RAID sets can be mounted on
the other server and ip addresses get pushed around, etc. etc. and
life goes on without users realizing a system has failed. (Well,
that's the theory at least).
The powers that be think this is all great stuff and want to be
able to have failover ability between two AFS servers. That way,
users' home directories will always be available.
With AFS this seems to be possible.
In theory, you would do something like (under AFS 3.4):
o mount the RAIDs from the failed machine on the good machine
o restart bosserver and fs processes
o run vos syncvldb <good server> <just mounted partitions>
o may need to run vos changeaddr <failed ip> <good server ip>
(Putting things back when the server is back online is almost, but not
quite, the reverse process.)
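
To make the steps above concrete, here is a rough dry-run sketch that only
prints the commands it would run. The host name, IP addresses, device
paths and /vicep partitions are all made up for illustration, and the
flag spellings (-server, -partition, -oldaddr, -newaddr, -bosserver) are
what I believe the bos and vos suites accept; check them against your own
AFS release before trusting any of it.

```shell
#!/bin/sh
# Dry-run sketch: prints the failover commands instead of running them.
# Every host, address, device and partition used here is hypothetical.

failover_plan() {
    good_server=$1   # surviving file server
    failed_ip=$2     # address of the failed server
    good_ip=$3       # address of the surviving server
    shift 3          # remaining args are device:partition pairs

    # 1. Mount the failed machine's RAID sets on the good machine.
    for pair in "$@"; do
        dev=${pair%%:*}
        part=${pair#*:}
        echo "mount $dev $part"
    done

    # 2. Restart bosserver, which in turn restarts the fs processes.
    echo "bos restart -server $good_server -bosserver"

    # 3. Bring the VLDB in line with the newly mounted partitions.
    for pair in "$@"; do
        part=${pair#*:}
        echo "vos syncvldb -server $good_server -partition $part"
    done

    # 4. May or may not be needed: repoint the failed server's address.
    echo "vos changeaddr -oldaddr $failed_ip -newaddr $good_ip"
}

failover_plan fs2.example.com 192.0.2.11 192.0.2.12 \
    /dev/dsk/raid1:/vicepc /dev/dsk/raid2:/vicepd
```

Printing rather than executing lets the plan be reviewed (or pasted by
hand) before anything touches the VLDB.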
Since AFS isn't really designed for this type of situation (as far
as I know), I was wondering if anyone has actually tried failing over a
server, and how successful you've been in doing so.
thanks
-dave