:Hello All,
:
:I am running a FreeBSD-4.2 NFS server with dozens of FreeBSD-4.2 NFS
:clients on 100BaseTX LAN. Recently I found that when the NFS server
:receives a lot of requests in a short period (e.g., 2 clients start X
:with gnome desktop simultaneously), all nfsd server processes hang in
:inode state.
:
:  UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT  TT       TIME COMMAND
:    0   440     1   0   2  0   360  132 accept Is    ??    0:00.00 nfsd: master (nfsd)
:    0   441   440   0 -14  0   352  124 inode  D     ??    0:03.49 nfsd: server (nfsd)
:    0   442   440   0 -14  0   352  124 inode  D     ??    0:00.17 nfsd: server (nfsd)
:    0   443   440   0 -14  0   352  124 inode  D     ??    0:00.02 nfsd: server (nfsd)
:    0   444   440   0 -14  0   352  124 inode  D     ??    0:00.01 nfsd: server (nfsd)
:
:I cannot kill or restart them. The consoles of the clients print ``NFS
:server not responding'' and I have to restart the server. This occurs
:about once a week.
:
:I tried
: (1) increasing the number of nfsd processes (4 -> 8, 20)
: (2) replacing the server HDD (SCSI) with another ATA33 HDD
: (3) changing mount_nfs options (tried removing tcp, adding soft,dumbtimer)
:but all failed to solve the problem.

It sounds like a deadlock somewhere, probably involving some other process.
A full 'ps axlww' would be useful, as would a gdb backtrace of the
processes in question (including the 'other' process stuck in some
odd wait state, if you can find it). You can run gdb against a live
kernel in a meaningful fashion if you have the kernel.debug image of
that kernel available somewhere:

    gdb -k <location-of-kernel.debug-image> /dev/mem
    proc 441
    back
    proc 442
    back
    proc 443
    back
    proc 444
    back
    proc <other-processes-stuck-in-weird-states>
    back

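If there are many stuck processes, the proc/back pairs above can be generated rather than typed. The following is only a rough sketch, not part of any FreeBSD tool: the helper names stuck_pids and gdb_commands are made up for illustration, and it assumes the 'ps axl' column layout shown in the quoted output (a header line containing PID, WCHAN and STAT, with every column non-empty). It picks out processes whose STAT begins with 'D' and prints the matching gdb commands, which can then be pasted at the gdb prompt.

```python
# Sketch only: helper names are hypothetical, and the parser assumes the
# 'ps axl' layout quoted in this thread (header with PID, WCHAN, STAT).

SAMPLE = """\
UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT TIME COMMAND
0 440 1 0 2 0 360 132 accept Is ?? 0:00.00 nfsd: master (nfsd)
0 441 440 0 -14 0 352 124 inode D ?? 0:03.49 nfsd: server (nfsd)
0 442 440 0 -14 0 352 124 inode D ?? 0:00.17 nfsd: server (nfsd)
0 443 440 0 -14 0 352 124 inode D ?? 0:00.02 nfsd: server (nfsd)
0 444 440 0 -14 0 352 124 inode D ?? 0:00.01 nfsd: server (nfsd)
"""


def stuck_pids(ps_output, wchan=None):
    """Return PIDs whose STAT starts with 'D' (uninterruptible wait).

    If wchan is given, only processes sleeping on that wait channel
    (for example 'inode') are returned.
    """
    lines = [ln for ln in ps_output.splitlines() if ln.strip()]
    header = lines[0].split()
    pid_i = header.index("PID")
    wchan_i = header.index("WCHAN")
    stat_i = header.index("STAT")
    pids = []
    for ln in lines[1:]:
        # COMMAND is the last column and may contain spaces, so limit
        # the number of splits to keep it in one field.
        fields = ln.split(None, len(header) - 1)
        if len(fields) < len(header):
            continue  # malformed or truncated line, skip it
        if not fields[stat_i].startswith("D"):
            continue
        if wchan is not None and fields[wchan_i] != wchan:
            continue
        pids.append(int(fields[pid_i]))
    return pids


def gdb_commands(pids):
    """Build the 'proc <pid>' / 'back' pairs, one command per line."""
    out = []
    for pid in pids:
        out.append("proc %d" % pid)
        out.append("back")
    return "\n".join(out)


if __name__ == "__main__":
    print(gdb_commands(stuck_pids(SAMPLE, wchan="inode")))
```

Leaving wchan unset lists every process in a 'D' state, which may help in spotting the 'other' process that is not waiting on inode.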
-Matt