Karl M. Davis wrote:
> Hey there all,
>
> I just recently set up the Debian openafs 1.4.4 packages on an Ubuntu
> server box, running in a virtual machine. It’s monsoon season here in
> Tucson and we’ve had a couple of long power outages and problems with
> the UPS. Both times the server has gone down unexpectedly, AFS didn’t
> come back up correctly. The symptoms I note are that “ls /afs” returns
> empty on the server and the Windows client can’t connect.
>
> For whatever reason, the thing that has fixed it both times is running
> “fs checkvolumes”. Of course, “fs checkvolumes” segfaults when I run
> it, but if I reboot after that, everything comes back up fine, clients
> can connect, and further “fs checkvolumes” don’t segfault. Rebooting
> before running that specific command (with the segfault) does
> nothing: “ls /afs” still returns empty.
>
> So… a couple of questions:
>
> How do I ensure AFS can survive a power outage/unexpected poweroff
> without getting borked?
>
> If it does get borked, why would a segfaulting “fs checkvolumes” fix things?

fs checkvolumes doesn't really check anything. It instructs the AFS cache manager to invalidate its knowledge of all of the volume location information, thereby forcing the data to be reloaded from the volume database servers.

If you are not using dynroot on UNIX or freelance on Windows, and the file servers are all down or all of the copies of the 'root.afs' volume are offline when the client starts, the client will be unable to mount the volume. In the case of the Windows clients, they will stop with a panic condition that is logged to %WinDir%\temp\afsd_init.log.

If you file a bug report to [EMAIL PROTECTED] with a stack trace for the segfault on Linux, someone can attempt to fix that. My guess is that it is failing because the volume list is empty or some boundary condition like that.

I have no idea how or why "fs checkvolumes" segfaulting would be a requirement for subsequent access.

Jeffrey Altman
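
For reference, the empty "ls /afs" symptom and the stack-trace request above could be scripted roughly as follows. This is only a sketch: the function name afs_root_is_empty is hypothetical and not part of OpenAFS, the /afs default is taken from the quoted message, and the gdb line assumes gdb is installed on the server.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenAFS): succeeds when the given
# directory (default /afs) is missing or has no entries, which is the
# symptom described in the quoted message.
afs_root_is_empty() {
    dir="${1:-/afs}"
    [ -z "$(ls -A "$dir" 2>/dev/null)" ]
}

# Example use (commented out so that sourcing this file has no side effects):
#   if afs_root_is_empty; then
#       fs checkvolumes    # ask the cache manager to reload volume locations
#   fi
#
# To capture a stack trace for a bug report, assuming gdb is installed:
#   gdb -batch -ex run -ex bt --args fs checkvolumes
```

The check only detects the symptom. It does not tell you whether the file servers or the 'root.afs' volume were offline when the client started.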
