> And as amd uses symlinks it even can point processes opening a file
> with an absolute path to another server if the first hangs (amd uses
> ping to determine if a server is up and changes the link if the server
> goes down - unfortunately, if only the nfsd is down that gives a
> wrong clue). When a file already is open the process will hang - and
> such a failover could only be done inside nfs (i.e., not on linux at
> the moment). Of course this only works well with ro mounted filesystems.

Are you sure that amd doesn't perform a null rpc request to the nfs
program rather than a ping? Elsewhere the code certainly seems to use
this, and that should detect the nfsd being down. Of course this kind
of switchover won't help processes which are already using files
mounted from a previous server, since they will already be blocked
waiting for it to come back.

I exchanged some e-mail with one of the Sun people who implemented
their nfs failover stuff. It only works for read-only filesystems (for
obvious reasons) and requires the client to hold the pathnames, so that
a new lookup can take place to replace the handles obtained from the
previous server. Of course the filesystems must also contain an
identical file/directory layout or it cannot work (though they don't
have to have identical inodes).

I'd prefer a solution which didn't require client changes (e.g. some
kind of failover server which inherits the handles from other machines)
but that seems to be much harder to implement.

-- Jon /
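
P.S. To make the ping vs. null rpc distinction concrete, here is a
minimal sketch of a null-procedure probe against the nfs program. It is
not taken from the amd sources. The function names, the choice of UDP,
port 2049, nfs version 2 and the fixed xid are all my own assumptions
for illustration. The message layout follows the ONC RPC call/reply
format with AUTH_NONE credentials.

```python
import socket
import struct

NFS_PROGRAM = 100003        # ONC RPC program number registered for nfs
RPC_VERSION = 2             # ONC RPC protocol version
MSG_CALL, MSG_REPLY = 0, 1  # msg_type values
MSG_ACCEPTED = 0            # reply_stat: call was accepted
ACCEPT_SUCCESS = 0          # accept_stat: procedure executed
AUTH_NONE = 0               # empty credential/verifier flavor
PROC_NULL = 0               # procedure 0 does nothing and returns nothing


def build_null_call(xid, nfs_version=2):
    """Pack an RPC call to procedure 0 of the nfs program (hypothetical helper)."""
    return struct.pack(
        ">10I",
        xid, MSG_CALL, RPC_VERSION, NFS_PROGRAM, nfs_version, PROC_NULL,
        AUTH_NONE, 0,   # credential: flavor, body length
        AUTH_NONE, 0,   # verifier: flavor, body length
    )


def reply_ok(reply, xid):
    """Return True if reply is an accepted, successful answer to xid."""
    if len(reply) < 24:
        return False
    rxid, mtype, reply_stat, _flavor, verf_len = struct.unpack(">5I", reply[:20])
    if rxid != xid or mtype != MSG_REPLY or reply_stat != MSG_ACCEPTED:
        return False
    # Skip the verifier body, which is padded to a multiple of 4 bytes.
    off = 20 + ((verf_len + 3) & ~3)
    if len(reply) < off + 4:
        return False
    (accept_stat,) = struct.unpack(">I", reply[off:off + 4])
    return accept_stat == ACCEPT_SUCCESS


def nfsd_alive(host, port=2049, timeout=2.0, xid=0x12345678):
    """Send one null call over UDP; False on timeout or any socket error."""
    call = build_null_call(xid)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        try:
            s.sendto(call, (host, port))
            reply, _ = s.recvfrom(1024)
        except OSError:
            return False
    return reply_ok(reply, xid)
```

Unlike an ICMP ping, which is answered by the kernel of any machine
that is up, nfsd_alive("server") only returns True when something is
actually answering rpc calls for the nfs program on that port, so a
dead nfsd on a live host shows up as a failure. A real prober would
also want retries and a fresh xid per attempt.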
