Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-21 Thread J. Bruce Fields
On Fri, Sep 21, 2007 at 07:46:32PM +0100, Nix wrote: > On 18 Sep 2007, J. Bruce Fields told this: > > Also I suppose we should check which version of nfs-utils that fix is in > > and make sure distributions are getting the fixed nfs-utils before they > > get the new libc, or we're going to see

Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-21 Thread Nix
On 18 Sep 2007, J. Bruce Fields told this: > Also I suppose we should check which version of nfs-utils that fix is in > and make sure distributions are getting the fixed nfs-utils before they > get the new libc, or we're going to see this bug a lot Further info. This behaviour, although it is

Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-18 Thread Nix
On 18 Sep 2007, J. Bruce Fields stated: > On Tue, Sep 18, 2007 at 12:54:07AM +0100, Nix wrote: >> The code which calls new_do_write() looks like this: >> >> ,[ libio/fileops.c:_IO_new_file_xsputn() ] >> | if (do_write) >> |{ >> | count = new_do_write (f, s, do_write); >> |

Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-17 Thread J. Bruce Fields
On Tue, Sep 18, 2007 at 12:54:07AM +0100, Nix wrote: > The code which calls new_do_write() looks like this: > > ,[ libio/fileops.c:_IO_new_file_xsputn() ] > | if (do_write) > |{ > | count = new_do_write (f, s, do_write); > | to_do -= count; > | if (count < do_write) > |

Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-17 Thread Nix
On 17 Sep 2007, J. Bruce Fields stated: > On Mon, Sep 17, 2007 at 11:23:46PM +0100, Nix wrote: >> A while later we start seeing runs of malloc failures, which I think >> correlated with the unexplained pauses in NFS response: > > Actually, they're nothing to do with malloc failures--the message >

Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-17 Thread J. Bruce Fields
On Mon, Sep 17, 2007 at 11:23:46PM +0100, Nix wrote: > Sep 17 22:57:55 loki warning: kernel: nfsd_dispatch: vers 3 proc 4 > Sep 17 22:57:55 loki warning: kernel: nfsd: ACCESS(3) 36: 01070001 000fb001 > d32ff38f 404811a6 a88d96ab 0x1f > Sep 17 22:57:55 loki warning: kernel: nfsd:

[2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-17 Thread Nix
Back in early 2006 I reported persistent hangs on the NFS server, whereby all of a sudden about ten minutes after boot my primary NFS server would cease responding to NFS requests until it was rebooted. http://www.ussg.iu.edu/hypermail/linux/kernel/0601.3/1631.html That time, the problem vanished
