We run afsd with the following parameters: -files 30000 -dcache 10000 -stat 8000 -daemons 5 -volumes 256
As for the blocked requests - I agree that new requests might be blocked, but in this case AFS access hangs completely and never resumes. Another thing - other commands that don't reside in AFS, like strace, ps, etc., also hang after AFS becomes "frozen".
And there is no such problem running IBM AFS 3.6 patch 4 in the same configuration.

--
Gregory

-----Original Message-----
From: Derek Atkins [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, March 05, 2002 12:07 AM
To: Touretsky, Gregory
Cc: '[EMAIL PROTECTED]'; Broughton, Travis V; Ervin, Douglas C; Shamir, Yuval
Subject: Re: [OpenAFS-devel] OpenAFS 1.2.3 client hangs on Linux - kernels 2.4.2 and 2.4.9

There are multiple issues going on here. The AFS client has a finite
number of channels that it can use to contact servers, and a finite
number of callback worker threads that can work on requests. If all
the background daemons are busy, then future requests will block until
one becomes available.

What do you have for your -daemons setting to afsd? You might try
increasing that number.

-derek

"Touretsky, Gregory" <[EMAIL PROTECTED]> writes:

> Hi,
>
> configuring Linux machine as NIS server, we found a strange problem - AFS
> hangs if there are several (4) instances of "pwck -r" running
> simultaneously. pwck verifies integrity of /etc/passwd, and it stat's all
> user home dirs. We have ~3000 unix accounts with home dirs in AFS (each home
> directory is a volume).
> I succeeded to reproduce this problem running several instances (10+) of the
> following short script:
> #!/bin/tcsh
> #Usage <command> <file with the long list of AFS mount points>
> foreach i (`cat $1`)
> /bin/ls -ld $i
> end
>
> The problem is reproducible on Linux 2.4.2 and 2.4.9 kernels with OAFS
> 1.2.3, I couldn't reproduce it on 2.4.2 with IBM AFS 3.6 patch 4.
> Here are the last lines from fstrace output:
> time 206.474904, pid 1262: Access vp 0xe0aa0000 mode 0x40 len 0x800
> time 206.474904, pid 1262: Access vp 0xe0aa0000 mode 0x40 len 0x800
> time 206.474904, pid 1262: Access vp 0xe0aa05b8 mode 0x40 len 0x1000
> time 206.474904, pid 1262: Access vp 0xe0aa05b8 mode 0x40 len 0x1000
> time 206.474904, pid 1262: Access vp 0xe0ad33b0 mode 0x40 len 0x800
> time 206.474904, pid 1262: Access vp 0xe0ad3598 mode 0x40 len 0x800
> time 206.484904, pid 1256: Analyze RPC op -1 conn 0xd3f5e6c0 code 0x0 user 0x0
> time 206.484904, pid 1256: Mount point is to vp 0xe0bb8938 fid (1:537094601.42.831)
> time 206.504904, pid 1265: Access vp 0xe0aa0000 mode 0x40 len 0x800
> time 206.504904, pid 1265: Access vp 0xe0aa0000 mode 0x40 len 0
>
> You can see that the last line is incomplete.
>
> Any thoughts?
>
> Gregory Touretsky
> Israel Engineering Computing
> Unix Server Platforms
> [EMAIL PROTECTED]
>
> (+) 972-4-865-6377, Fax: 04-865-5999
> iNET: 465-6377, M/S: IDC-1B
>
>
> _______________________________________________
> OpenAFS-devel mailing list
> [EMAIL PROTECTED]
> https://lists.openafs.org/mailman/listinfo/openafs-devel

--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
[EMAIL PROTECTED] PGP key available
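
For anyone who wants to drive the reproduction from a single command, here is a rough sketch of the "several instances (10+)" test as one POSIX sh function. It is not the original tcsh script: the function name run_stat_load and the default of 10 instances are my own assumptions, and it reads the list line by line instead of using tcsh word splitting. It has no timeout, so on a client that freezes it will hang just like the original script.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenAFS): start several concurrent
# loops that each run "ls -ld" on every path listed in a file, similar
# to running many copies of the tcsh script from the report.
#
# Usage: run_stat_load <file with list of AFS mount points> [instances]
run_stat_load() {
    list_file=$1
    instances=${2:-10}    # assumed default; the report used 10 or more

    if [ ! -r "$list_file" ]; then
        echo "cannot read list file: $list_file" >&2
        return 2
    fi

    n=0
    while [ "$n" -lt "$instances" ]; do
        # Each instance runs in its own background subshell.
        (
            while IFS= read -r path; do
                if [ -n "$path" ]; then
                    ls -ld "$path" >/dev/null 2>&1
                fi
            done < "$list_file"
        ) &
        n=$((n + 1))
    done

    # Block until every instance has finished. If the AFS client
    # freezes, this wait never returns.
    wait
}
```

Called as, for example, run_stat_load mountpoints.txt 10, where mountpoints.txt is a placeholder for a file with one mount point per line. Raising the instance count should make it easier to tell whether the hang depends on the number of concurrent requests relative to the -daemons setting.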
