I have seen this problem on most architectures (I haven't seen it on the
RISC/6000 simply because I haven't had logredo errors). Basically, the
problem is that after a clean shutdown, the bosserver doesn't believe
that it needs to start a salvager on the next boot.

On everything but the RISC/6000, the AFS fsck (vfsck) runs and, if it
encounters a problem, creates a /FORCESALVAGE file at the top level of
the affected filesystem. However, this approach falls short because the
rest of the Transarc software doesn't look for that file. I have
modified the fileserver source code to look for it at startup; if the
fileserver finds it, it cleans up whatever it has started and exits (the
bosserver then sees this exit and starts the salvager, which repairs the
filesystem). I intentionally didn't add filesystem-specific knowledge to
the bosserver, since that is only a process-management program.

For the RISC/6000, I have not yet developed a solution because of how
the filesystem is handled (it relies heavily on the operating system).

I supplied patches for the other platforms to Transarc about a year ago,
and we have been running with them on all our servers at MIT for about a
year.

-Richard

------- Forwarded transaction

[1856] [EMAIL PROTECTED] (Mitch Collinsworth) Info-AFS_Redistribution 07/14/93 20:07 (28 lines)
Subject: Re: AIX logredo process messes up /vicep? ???
To: "Frank Swasey" <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
In-Reply-To: Your message of "Wed, 14 Jul 93 10:02:32 EDT."
Date: Wed, 14 Jul 93 18:47:20 -0400
From: Mitch Collinsworth <[EMAIL PROTECTED]>

>I had a situation here today where AIX decided noone needed to log into
>one of my servers. Since I was able to communicate with the bosserver
>on this machine, I issued a bos shutdown -wait (and waited for it to
>finish) prior to resetting the machine. When the machine rebooted,
>logredo ran for all of the /vicepX partitions. Since I had done a
>proper AFS shutdown, salvager was not run. However most of the volumes
>on 8 of the 10 partitions were offline until I manually ran salvage on
>the machine. Has anyone seen this problem or one like it before?

I guess I have seen one like it. I don't know what logredo is and I
haven't used AIX systems for servers yet. When I was using a VAX for the
server, every time it went down unexpectedly it would come up, run
vfsck, and then come online. I don't recall if salvager ran
automatically or not, but several volumes (the same ones every time)
would remain offline until manually salvaged. Transarc's response was
something to the effect that they knew about the problem but hadn't
fixed it. I've recently moved the server to a DECstation and a newer
version of AFS and haven't seen it recur (though it hasn't been very
long since the switchover).

-Mitch Collinsworth
 [EMAIL PROTECTED]

--[1856]--
------- End forwarded transaction
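
For readers who want to picture the startup check described in the first message, here is a minimal sketch in Python. It is not the actual patch. The function names, the cleanup hook, and the exit status are all hypothetical, and the only detail taken from the message is the FORCESALVAGE marker file at the top level of each partition. The idea is that the server checks each partition root before going further, and if any marker is present it undoes its startup work and returns a non-zero status so that a supervising process can react.

```python
import os

# Marker file name taken from the message; everything else is illustrative.
FORCE_SALVAGE_NAME = "FORCESALVAGE"


def partitions_needing_salvage(partition_roots):
    """Return the partition roots that contain a FORCESALVAGE marker file."""
    return [
        root
        for root in partition_roots
        if os.path.exists(os.path.join(root, FORCE_SALVAGE_NAME))
    ]


def startup_check(partition_roots, cleanup=lambda: None):
    """Hypothetical startup hook.

    Returns 0 if no partition is flagged. Otherwise calls `cleanup`
    (standing in for undoing any startup work already done) and returns 1,
    which the caller is assumed to use as the process exit status.
    """
    flagged = partitions_needing_salvage(partition_roots)
    if flagged:
        cleanup()
        return 1
    return 0
```

A caller might pass the list of partition mount points and hand the result to `sys.exit`. Keeping the check in the server process rather than in the supervisor mirrors the choice described in the message, where the process manager stays free of filesystem-specific knowledge.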
