If either of you could weigh in on AFS on top of DRBD, I'd appreciate it. I'm not fully up on whether a second server with an identical filesystem could be made to take over for a crashed AFS machine.
There are a couple of issues:

1) Would DRBD actually notice that the storage device on the primary node is "hanging" and switch over to the secondary? I didn't think that most heartbeat services would catch this.

2) There would be a significant delay bringing up the secondary node as a fileserver. Since the volumes were likely to all be "attached" (in use) by the primary node's fileserver at the time of the failover, the DRBD partition would need to be salvaged (a secondary fsck of the AFS metadata) before the fileserver could be started on the secondary machine.

3) You would need to do some sort of IP address takeover in order for clients to contact the correct machine to get at the data. (The AFS architecture provides better ways to do this, but the tools for using them in a case like this aren't there at the moment.)
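
On the first point, the kind of check a heartbeat setup would need in order to catch a hanging disk can be sketched roughly as below. This is only an illustration in Python, not part of DRBD, heartbeat, or AFS: the names probe_storage and _write_and_sync, the 5-second default timeout, and the idea of feeding the result into a failover decision are all assumptions made for the example. The probe does a small write plus fsync in a worker thread and reports failure if that does not finish in time, which is the case a network-level "is the node alive" check would miss.

```python
import os
import tempfile
import threading


def _write_and_sync(directory):
    """Write one byte to a temp file in `directory` and force it to disk."""
    fd, path = tempfile.mkstemp(dir=directory, prefix=".probe-")
    try:
        os.write(fd, b"x")
        os.fsync(fd)  # blocks if the underlying device is hanging
    finally:
        os.close(fd)
        os.unlink(path)


def probe_storage(directory, timeout_s=5.0, write_fn=_write_and_sync):
    """Return True only if a small synced write completes within timeout_s.

    Hypothetical helper: a hung device shows up as a timeout, an I/O
    error shows up as an OSError; both are reported as unhealthy.
    """
    done = threading.Event()
    succeeded = []

    def worker():
        try:
            write_fn(directory)
            succeeded.append(True)
        except OSError:
            pass  # treat any I/O error as an unhealthy result
        finally:
            done.set()

    # Daemon thread so a write stuck in the kernel cannot block process exit.
    threading.Thread(target=worker, daemon=True).start()
    finished = done.wait(timeout_s)
    return finished and bool(succeeded)
```

A monitor on the primary could call probe_storage on the AFS partition's mount point at intervals and stop answering heartbeats when it returns False. Whether that is safe to wire into an automatic switchover is a separate question, given the salvage and address-takeover issues above.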
