control: severity -1 normal
control: tags -1 + moreinfo

Dear reporter,

I'm sorry to hear that you have lost data. However, it doesn't seem very
constructive to make a bug release-critical without providing enough detail
to make a fix possible. NFS is a complex network protocol, and the root cause
of unexpected behaviour isn't always obvious at first glance.

First of all, has this bug been filed against the right package? The nfsd
processes are actually kernel threads (that's one reason "kill -9" doesn't
work on them), the corresponding package is the kernel image.

How many clients are accessing that NFS server when the problem occurs?
I see that you have RPCNFSDCOUNT=8 but the address range for allowed
clients is a /27. If you have 30 clients all trying to write at the same
time, some of them are going to have to wait until a server thread becomes
available. "server not responding, still trying" is a common symptom of
this. Have you tried tuning the server? You can adjust the thread count
without a reboot.

I don't see sec=krb5p in your /etc/exports, so NFS traffic on the wire
is probably unencrypted. Have you looked at it with tcpdump or a similar
tool, particularly when the problem occurs? For example it would be nice
to know whether that "Connection timed out" you get from mount.nfs is for
the portmapper (unlikely), for mountd, or for nfsd itself. (strace may
also tell you some of this.)

Are you familiar with rpcdebug? If client traffic is coming in but the
server isn't replying, you could set debugging flags and look at kernel
log output.

Other available debugging tools include the kernel's event tracing
subsystem, as well as nfsstat, nfsiostat and mountstats from package
nfs-common. (The last two are client-side, so maybe not so useful
if your problem really is at the server end.)

I can't help you much more than this: my own environment is NFSv4-only
(and I feel no urge to look back) while yours is anything but. But if
you manage to pinpoint more precisely what's wrong, someone else may
be able to provide better hints (or you may figure it out yourself).

Reply via email to