On Thu, 16 Aug 2012 10:22:27 -0400 Bob Hoffman <[email protected]> wrote:
> A number of clients persist in saying "Connection timed out" even after
> the volumes were brought on-line on the new server.

Clients cache volume location information for 2 hours. They will
continue to think that the volumes are on the old server until you
invalidate the cache, or they receive a certain type of error. "The
server is not responding" is not one of those errors.

> Here is what I've tried so far with no effect whatsoever:
>
> fs flushmount /afs/cs.pitt.edu/projects/cast
> fs flushmount /afs/cs.pitt.edu/projects
> fs flushmount /afs/.cs.pitt.edu/projects/cast
> fs flushmount /afs/.cs.pitt.edu/projects
> fs flushvolume /afs/cs.pitt.edu/projects/cast
> fs flushvolume /afs/cs.pitt.edu/projects
> fs flushvolume /afs/.cs.pitt.edu/projects/cast
> fs flushvolume /afs/.cs.pitt.edu/projects
> vos release projects
> ls -l /afs/cs.pitt.edu/projects
> ls -l /afs/.cs.pitt.edu/projects

Try 'fs checkvolumes'.

> Is there anything I can do, short of a client reboot, to fix this?
> Shouldn't AFS have a more graceful recovery in this kind of situation?
> Why doesn't the client see that the volume has moved to a new server?

It could in theory recheck the vldb in this scenario, but there are
other issues with doing that, since the majority of the time such errors
are encountered when the volume hasn't moved or anything.

If this situation lasted for more than 2 hours and/or survived an 'fs
checkv', that's a problem. For that, you can capture some debug data
like so:

    fstrace clear cm
    fstrace setlog cmfx -buffers 1024
    fstrace sets cm -active
    ls /afs/cs.pitt.edu/projects/cast & echo $!
    wait
    fstrace dump cm > /tmp/fstrace.log
    fstrace sets cm -inactive

-- 
Andrew Deason
[email protected]
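
For anyone who needs to repeat the capture on several clients, the fstrace steps above could be wrapped in a small script along these lines. This is only a sketch: the names `dry`, `run`, `capture_afs_trace` and the `DRY_RUN` variable are made up for illustration and are not part of OpenAFS. It uses only the fstrace invocations quoted in the message, and it assumes it is run as root on the affected client.

```shell
#!/bin/sh
# Sketch only. DRY_RUN, dry, run and capture_afs_trace are hypothetical
# helper names; the fstrace commands are the ones quoted in the message.

# True when DRY_RUN=1, i.e. commands should be printed instead of executed.
dry() {
    [ "${DRY_RUN:-0}" = "1" ]
}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if dry; then
        echo "+ $*"
    else
        "$@"
    fi
}

# capture_afs_trace PATH LOGFILE
# Traces one access to PATH and writes the cache manager trace to LOGFILE.
capture_afs_trace() {
    target=$1
    log=$2

    run fstrace clear cm
    run fstrace setlog cmfx -buffers 1024
    run fstrace sets cm -active

    if dry; then
        echo "+ ls $target"
        echo "+ fstrace dump cm > $log"
    else
        # Access the failing path in the background, print its PID as in
        # the original steps, then wait for it to finish or time out.
        ls "$target" &
        echo "ls pid: $!"
        wait
        fstrace dump cm > "$log"
    fi

    # Switch tracing back off once the dump has been written.
    run fstrace sets cm -inactive
}
```

With the functions loaded into a shell, `DRY_RUN=1 capture_afs_trace /afs/cs.pitt.edu/projects/cast /tmp/fstrace.log` prints the commands without running them, which is a cheap way to check the sequence before doing it for real.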
