Every time a new-to-AFS user gets an account on one of our AFS client systems,
they invoke ls on /afs to see what's in it and are hit with interminable
delays while multiple cells prove to be unreachable.  Of course this is
exacerbated by the fact that virtually everyone aliases ls to ls -F.  Once 
they've experienced these delays, I find it harder than it should be to
convince them of the practical benefits of using AFS.

I have built [limited] infrastructure to keep our client CellServDB files
up-to-date with Transarc's; I have (semi-)regularly updated the cells mounted
in our root.cell volume; and I have installed cron entries to frequently list
/afs on timesharing systems so information is cached.  Unfortunately, this
is not good enough because there are always some inaccessible sites and the
client code doesn't [appear to] cache information about failed attempts to
access a volume in another cell.  Note that the difference between 2+ minutes 
on an up-to-date client and 5+ minutes on an unmaintained client is moot to
the user who interrupts the command after a minute or less.

A database server that we took out of our published entry two years ago is
still getting regular probes, so I am confident that we are not alone in
this problem.  Clearly, I can build a script to periodically check 
listed cells, unmount unresponsive sites, and release the resulting 
root cell.  Equally clearly, this is not a reasonable solution.  Does anyone
have a better one?
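
For concreteness, here is a rough Python sketch of the kind of script
described above.  It is only an illustration under stated assumptions, not
a tested tool: the function names, the 10-second timeout, and the idea of
probing with "ls -d" are all hypothetical choices, and the read-write path
and volume name are left as parameters for the caller to supply.  The
cleanup step only builds the "fs rmmount" and "vos release" command lines
as a dry run; it does not execute them.

```python
import subprocess


def probe_cell(cell, root="/afs", timeout=10.0):
    """Return True if 'ls -d <root>/<cell>' succeeds within the timeout.

    Assumption: a stat of the mount point is enough to trigger the
    cross-cell lookup that stalls when a cell is unreachable.
    """
    try:
        result = subprocess.run(
            ["ls", "-d", f"{root}/{cell}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # The listing did not finish in time; treat the cell as down.
        return False
    return result.returncode == 0


def unresponsive_cells(cells, probe=probe_cell):
    """Return the cells for which the probe reports failure, in order."""
    return [cell for cell in cells if not probe(cell)]


def cleanup_commands(dead_cells, rw_root, root_volume):
    """Build (but do not run) the commands to unmount and release.

    rw_root is the read-write path holding the cell mount points and
    root_volume is the volume to release; both are site-specific and
    hypothetical here.
    """
    commands = [
        ["fs", "rmmount", "-dir", f"{rw_root}/{cell}"] for cell in dead_cells
    ]
    if commands:
        # Only release when something actually changed.
        commands.append(["vos", "release", root_volume])
    return commands
```

A cron job could call unresponsive_cells() on the list of mounted cells and
print the output of cleanup_commands() for review before anything is run.
Note that a sketch like this never re-mounts a cell that comes back, which
is part of why it is not a reasonable long-term answer.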

Thanks,
-Charles Ball
