On Tue, Feb 08, 2005 at 05:28:10PM -0500, Jeffrey Hutzelman wrote:
> If you are noticing actual problems, rather than just messages in your
> FileLog, please describe what they are.
Sorry, it was a bit late yesterday. I have since noticed that I had not actually described any problem...

The problem is an apparently hanging file server. The client (1.2.11 on Linux 2.4.21) sends FetchStatus requests on behalf of different uids in high volume. The RX challenge/response proceeds, as do the ptserver queries by the file server. The file server also sends WhoAreYou requests back to the client, which are normally answered very quickly; the routine in afs_callback.c does not have to work particularly hard.

But after some requests have been processed successfully, one of the WhoAreYou calls seems to get stuck in the client. The server keeps re-sending the query, in my case for about 90 seconds or even longer; the client sends an RX-level ack, but no WhoAreYou reply. Meanwhile other FetchStatus requests pile up, all apparently waiting in a queue for their chance to issue their own WhoAreYou requests.

In this state there is a lot of RX ack traffic going on, but nothing else: no CPU load on either the client or the server side, no disk activity, and no network traffic that could even remotely create a bottleneck. When the server finally gives up, normal processing resumes.

I built a version of afs_callback.c without the ObtainReadLock, hard-coding the information requested, but this did not help; the client seems to be stuck on another lock. Not asking for the WhoAreYou at all obviously helped, but this can at best be called a hack.

Volker
