> And since amd uses symlinks, it can even point processes that open a file
> with an absolute path to another server if the first one hangs (amd uses
> ping to determine whether a server is up and changes the link if the server
> goes down - unfortunately, if only the nfsd is down, that gives a
> wrong clue). When a file is already open the process will hang - and
> such a failover could only be done inside nfs (i.e., not on linux at
> the moment). Of course this only works well with ro mounted filesystems.

Are you sure that amd doesn't perform a null RPC request to the NFS
program rather than a ping?  Elsewhere the code certainly seems to use
this, and that should detect the nfsd being down.
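
To make the difference from a ping concrete, here is a rough sketch of a
null RPC probe.  This is not amd's code; it is only an illustration of the
wire format from the ONC RPC spec.  The function names, the fixed xid, the
choice of UDP on port 2049 and NFS version 2 are all my assumptions, and a
given server may not answer on that transport or version.

```python
import socket
import struct

NFS_PROGRAM = 100003  # ONC RPC program number assigned to NFS
NFS_PORT = 2049       # conventional NFS port (assumed; not looked up via portmap)


def build_null_call(xid, prog=NFS_PROGRAM, vers=2):
    """Build an RPC CALL message for procedure 0 (NULL) with AUTH_NONE."""
    # xid, msg_type=CALL(0), rpcvers=2, program, version, procedure=0
    header = struct.pack(">6I", xid, 0, 2, prog, vers, 0)
    # Credential and verifier: flavor AUTH_NONE(0), zero-length body, twice.
    auth = struct.pack(">4I", 0, 0, 0, 0)
    return header + auth


def is_accepted_reply(data, xid):
    """Return True if data is a REPLY to xid that was accepted with SUCCESS."""
    if len(data) < 24:
        return False
    rxid, msg_type, reply_stat, _flavor, verf_len = struct.unpack(">5I", data[:20])
    if rxid != xid or msg_type != 1 or reply_stat != 0:
        return False
    # Skip the verifier body, which is padded to a multiple of 4 bytes.
    offset = 20 + ((verf_len + 3) & ~3)
    if len(data) < offset + 4:
        return False
    (accept_stat,) = struct.unpack(">I", data[offset:offset + 4])
    return accept_stat == 0


def nfsd_alive(host, timeout=2.0, port=NFS_PORT):
    """Hypothetical helper: send one NULL call and wait for one reply."""
    xid = 0x4E465331  # arbitrary transaction id for this sketch
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_null_call(xid), (host, port))
            data, _ = sock.recvfrom(1024)
        except OSError:  # covers timeouts, refused ports, lookup failures
            return False
    return is_accepted_reply(data, xid)
```

A host that is up but has no nfsd running would still answer a ping, while
nfsd_alive() would return False for it, which is the case the quoted text
worries about.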

Of course this kind of switchover won't help processes which are
already using files mounted from a previous server, since they will
already be blocked waiting for it to come back.

I exchanged some e-mail with one of the Sun people who implemented
their nfs failover stuff.  It only works for read-only filesystems (for
obvious reasons) and requires the client to hold the pathnames, so that
a new lookup can replace the handles obtained from the previous server.
Of course the filesystems must also contain an identical file/directory
layout or it cannot work (though the inodes don't have to be
identical).  I'd prefer a solution which didn't require client
changes (e.g. some kind of failover server which inherits the handles
from other machines) but that seems to be much harder to implement.
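
The client-side scheme described above can be modelled in a few lines.
This is a toy simulation, not Sun's implementation: ToyServer,
FailoverClient and the handle strings are all made up for illustration.
The point is only that the client keeps pathnames, so stale handles can be
replaced by a fresh lookup on a replica with the same layout.

```python
class ToyServer:
    """Stand-in for a read-only NFS server (hypothetical)."""

    def __init__(self, name, paths):
        self.name = name
        self.up = True
        # Handles are server-specific: same path, different handle per replica.
        self._handles = {p: f"{name}:{i}" for i, p in enumerate(sorted(paths))}

    def lookup(self, path):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return self._handles[path]  # KeyError if the layouts differ

    def read(self, handle):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        if handle not in self._handles.values():
            raise ValueError("stale handle")
        return f"data via {handle}"


class FailoverClient:
    """Client that remembers pathnames so handles can be re-obtained."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.current = self.replicas[0]
        self.open_files = {}  # pathname -> handle on the current server

    def open(self, path):
        self.open_files[path] = self.current.lookup(path)

    def read(self, path):
        try:
            return self.current.read(self.open_files[path])
        except ConnectionError:
            self._fail_over()
            return self.current.read(self.open_files[path])

    def _fail_over(self):
        for server in self.replicas:
            if server.up:
                # Redo the lookup for every held pathname on the new server.
                self.open_files = {p: server.lookup(p) for p in self.open_files}
                self.current = server
                return
        raise ConnectionError("no replica available")
```

With two replicas exporting /usr/bin/ls, a read after the first server goes
down is served through a handle obtained from the second one.  A handle
inherited by a failover server, as I'd prefer, would need no such
bookkeeping on the client.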

 -- Jon