Quoting Oliver Fromme <[EMAIL PROTECTED]>:

Chris H. <[EMAIL PROTECTED]> wrote:
> Thomas David Rivers wrote:
> > I have found that if I kill rpc.lockd on the NFS server,
> > most of the NFS issues I have (including a similar lock-up on
> > 6.1-RELEASE) go away.

FWIW, I also had problems with running rpc.lockd and
rpc.statd (no panics, though).  If you don't need them
(i.e. you don't need cross-machine locking), then don't
use them.  Use the -L flag to mount_nfs so at least
local locking works.
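
As a rough sketch of what such a mount could look like: the server
name and paths below are made up for illustration, and the script only
prints the command unless DRY_RUN=0 is set, since the real thing has to
run as root on the NFS client.

```shell
#!/bin/sh
# Hypothetical server and paths; substitute your own.
SERVER="nfsserver"
EXPORT="/export/data"
MOUNTPOINT="/mnt/data"

# -L keeps fcntl(2) locks local instead of forwarding them to rpc.lockd.
CMD="mount_nfs -L ${SERVER}:${EXPORT} ${MOUNTPOINT}"

# Print by default; run for real (as root) only when DRY_RUN=0.
if [ "${DRY_RUN:-1}" = "0" ]; then
    $CMD
else
    echo "$CMD"
fi
```

Note that with -L other machines will not see locks taken on this
client, so this only fits the case where cross-machine locking is not
needed.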

> You don't happen to have any experiences keeping rpc.statd
> running?

Basically, it doesn't make much sense to run one without
the other.  If you disable rpc.lockd, you can also safely
disable rpc.statd.
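
For reference, a minimal sketch of the corresponding /etc/rc.conf
settings, assuming both daemons are currently enabled there.  rc.conf
is plain sh syntax, so the fragment is just two variable assignments:

```shell
# /etc/rc.conf fragment: do not start the NFS locking daemons at boot.
rpc_lockd_enable="NO"
rpc_statd_enable="NO"
```

This only affects the next boot.  Daemons that are already running
still have to be stopped by hand, or go away with a reboot.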

However, I don't think that your actual problem (lock-up
and panics) is related to rpc.lockd or rpc.statd.  It
rather sounds like something else is wrong with your
machine.  NFS works perfectly fine for me, including
copying huge files.

You wrote that you had a lot of crashes that accumulated
many files in lost+found.  Well, maybe your filesystem
was somehow damaged in the process.  It is possible to
damage file systems in a way that can lead to panics, and
it's not necessarily detected and repaired by fsck.

Indeed. I /too/ considered this. However, I largely dismissed it
as a possibility, since almost all of the files are zero-length. The others
are fragments of logs. I'm not /completely/ ruling it out, though.


> > > # cp /path/to/approx/10Mb/file /host/path/to/dest/dir/
> > >
> > > Fatal double fault
> > > eis 0x0blah
> > > eiblah blah0x
> > > panic double fault
> > > no dump device defined

You should try to set up a dump device, so you get a kernel
crash dump next time.  The crash dump can be used to find
out where the crash occurred -- and I bet it's not in the
NFS code.

See the Handbook for details on how to set up a dump device.
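
Roughly, and only as a sketch: the device name below is a hypothetical
swap partition, not taken from your machine, so check swapinfo for the
real one.  The script merely prints what to add and what to run.  As
far as I know the dump device should be at least as large as physical
RAM for a full dump to fit.

```shell
#!/bin/sh
# Hypothetical swap partition; check `swapinfo` for yours.
DUMPDEV="/dev/ad0s1b"

# Lines to add to /etc/rc.conf so the setting survives a reboot.
RC_LINES="dumpdev=\"${DUMPDEV}\"
dumpdir=\"/var/crash\""

# Command that enables the dump device immediately (run as root).
CMD="dumpon ${DUMPDEV}"

echo "Add to /etc/rc.conf:"
echo "$RC_LINES"
echo "Then run as root: $CMD"
```

After the next panic, savecore should pick the dump up during boot and
store it in the directory named by dumpdir.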

By the way, does the problem also occur when copying the
file to/from a memory disk, so no physical disk is involved?
That way you would exclude the disk and the disk driver as
potential causes.  Similarly, try a loopback NFS mount
(i.e. mount from 127.0.0.1) in order to exclude the network
interface driver as a potential cause.

If the problem still exists when copying a 10 MB file from
a memory disk to a memory disk (same or other) via a
localhost mount on the same machine, then it looks like
the NFS code might be at fault.
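
A sketch of that test, under several assumptions: nfsd and mountd are
already running on the machine, the md unit numbers and mount points
(all made up here) are free, and the export line mentioned in the
comment has been added by hand.  The script prints the commands by
default and only executes them with DRY_RUN=0, as root.

```shell
#!/bin/sh
# Memory-disk plus loopback-NFS test.  Unit numbers, sizes and mount
# points are hypothetical.  Prints the commands unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

SRC=/mnt/md_src      # memory disk holding the test file
DST=/mnt/md_dst      # memory disk that gets exported over NFS
NFS=/mnt/nfs_loop    # loopback NFS mount of $DST

# Two swap-backed memory disks; they stay in RAM unless memory is tight.
for unit in 10 11; do
    run mdconfig -a -t swap -s 32m -u "$unit"
    run newfs "/dev/md${unit}"
done
run mkdir -p "$SRC" "$DST" "$NFS"
run mount /dev/md10 "$SRC"
run mount /dev/md11 "$DST"

# $DST must be exported to localhost first, e.g. with this line in
# /etc/exports, followed by a SIGHUP to mountd:
#   /mnt/md_dst -maproot=root 127.0.0.1
run mount_nfs -L "127.0.0.1:${DST}" "$NFS"

# Create a 10 MB file and copy it across the loopback mount.
run dd if=/dev/zero of="${SRC}/testfile" bs=1m count=10
run cp "${SRC}/testfile" "${NFS}/"
```

If this copy survives but the copy to the real destination still
panics, the disk, its driver or the network interface driver become
the more likely suspects.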

Best regards
  Oliver

All good advice. I'm going to take the easy way out first
(remove lockd/statd from rc.conf) as a quick experiment.
Then I'll endeavour to investigate further using your suggestions.

Thank you very much for your time and your thoughtful answer.

--Chris



--
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

"C++ is the only current language making COBOL look good."
       -- Bertrand Meyer




--
panic: kernel trap (ignored)



-----------------------------------------------------------------
FreeBSD 5.4-RELEASE-p12 (SMP - 900x2) Tue Mar 7 19:37:23 PST 2006
/////////////////////////////////////////////////////////////////

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
