On Sat, Jul 23, 2005 at 11:00:37AM -0400, Robert Banz wrote:
> I upgraded two (pretty busy) fileservers from 1.3.84 to 1.3.85 last
> Sunday. Everything seemed to be working right; however, last night both
> of them got into the meltdown syndrome where they 'busy' all requests,
> causing much badness to clients that were using them.
>
> The platform is Solaris 10 amd64, up to current patch.
>
> Unfortunately, I can't provide much debugging information on this -- it
> happened at 2am, so I wasn't quite in the mental state for "collecting
> information". No out-of-the-ordinary messages were in the fileserver or
> volserver logs; the only out-of-the-ordinary event occurring at the
> time was that it was well in the middle of our backup window.
> From what I could tell, .backup snapshot creation had finished about 20
> minutes before things started to go bad, and it looks like
> dumping-to-tape had begun. Could there be any open fileserver/volserver
> IPC issues?
>
We had the same problem with i386 Linux 2.6.12 and OpenAFS 1.3.84. An
AFS backup ran on the *.backup volumes, and the fileserver was very busy
during this time; then, for about 30 to 45 minutes, no client could
access any files on this fileserver. The fileserver and volserver did
not crash, however, and there were no entries in the log files.

We had this problem twice, and each time the AFS backup was writing the
*.backup volumes to tape. (Normally this is done during the night, but
due to a hardware problem with the tape library we started the backup
by hand during working hours.)

Hans-Werner

-- 
Hans-Werner Paulsen
MPI für Astrophysik              Tel 089-30000-2602
Karl-Schwarzschild-Str. 1        Fax 089-30000-2235
D-85741 Garching
