[Bug 181996] Re: NFS server: lockd: server not responding

2010-01-10 Thread Arie Skliarouk
We use initrd.img-2.6.24-19-openvz with a bunch of Linux clients without any
problems.
Recently I tried to add a Mac OS X client and immediately noticed that
nfs-kernel-server on Linux started locking up for several seconds (thus
stalling NFS access for every other client) every minute, with the following
messages printed in the logs:
Jan 10 11:25:32 ubuntu1 kernel: [15421367.859941] rpcbind: server boaz-macbook.local not responding, timed out
Jan 10 11:25:32 ubuntu1 kernel: [15421367.859965] lockd: couldn't create RPC handle for boaz-macbook.local

I had to switch the Mac OS X client to Samba instead.

-- 
NFS server: lockd: server not responding
https://bugs.launchpad.net/bugs/181996
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 181996] Re: NFS server: lockd: server not responding

2009-09-03 Thread Guido Nickels
Hi!

We're experiencing the bug on hardy here, too:

- snip -
Sep  3 11:22:57 recovery1 kernel: [68409.731835] rpcbind: server s03.hallopizza.org not responding, timed out
Sep  3 11:22:57 recovery1 kernel: [68409.731876] lockd: server s03.hallopizza.org not responding, timed out
Sep  3 11:22:57 recovery1 kernel: [68409.731895] lockd: couldn't create RPC handle for s03.hallopizza.org
Sep  3 11:23:57 recovery1 kernel: [68469.578518] rpcbind: server s03.hallopizza.org not responding, timed out
Sep  3 11:23:57 recovery1 kernel: [68469.578559] lockd: server s03.hallopizza.org not responding, timed out
Sep  3 11:23:57 recovery1 kernel: [68469.578568] lockd: couldn't create RPC handle for s03.hallopizza.org
- snap -

Versions:
linux-image-2.6.24-24-generic 2.6.24-24.59
nfs-common 1:1.1.2-2ubuntu2.2
nfs-kernel-server 1:1.1.2-2ubuntu2.2

Only a reboot helps, but not for long, and we can't disable locking as
some customers depend on it.

Please tell me if I can help with debug information.

Cheers!

Guido



[Bug 181996] Re: NFS server: lockd: server not responding

2009-02-17 Thread Eckart Haug
(Storm)
I have been using the nolock option since then, which means working with
locking disabled, which of course worked.

On 10.02. I enabled locking again to give it a try. No problems since then.
Kernels are 2.6.24-23-generic on both client and server;
nfs-kernel-server and nfs-common are 1:1.1.2-2ubuntu2.2.

I still don't think it's a new problem; it just shows up in very special cases,
which we don't know. Over here it disappeared as randomly as it had appeared
before, and you still have it. When did it appear at your site? Which changes
did you make beforehand?

If you post, have a look at https://wiki.ubuntu.com/KernelTeamBugPolicies
Over here, we're on our own now.



[Bug 181996] Re: NFS server: lockd: server not responding

2009-02-09 Thread Storm
I have exactly the same problem on a Hardy server (should I open a new bug
report?):
   * linux-image-server 2.6.24.23.25
   * nfs-kernel-server 1:1.1.2-2ubuntu2.2
   * nfs-common 1:1.1.2-2ubuntu2.2

If I reboot the server, it works for only a few minutes.



[Bug 181996] Re: NFS server: lockd: server not responding

2008-10-12 Thread Eckart Haug
I tried the generic kernel (as opposed to server). It worked for a couple of
days, then the same again.
Until about 4 weeks ago, the problem appeared sporadically, then almost every
day, without any change to the server (no automatic updates). It might depend
on the configuration or certain packages on the client. My home directory
resides on the server. Within the time in question I installed VirtualBox on
the client. It adds a script which adds a tap device (but doesn't activate a
bridge).
It might also depend on my slow server hardware (PIII-866/256MB).
I'm mounting with nolock for the moment :-)), which seems to work fine.



[Bug 181996] Re: NFS server: lockd: server not responding

2008-10-08 Thread the.jxc
Can't agree with you there, Eckart.  I upgraded to Hardy and all my
problems with NFS disappeared.

$ uname -a
Linux kirby 2.6.24-19-generic #1 SMP Wed Aug 20 22:56:21 UTC 2008 i686 GNU/Linux

...and hasn't hung once in months.  Used to hang at least once a day
under Gibbon.



[Bug 181996] Re: NFS server: lockd: server not responding

2008-10-08 Thread Bart Swennen
Same here, I do not agree with Eckart: we use the hardy kernel on an
otherwise Gutsy installation and the problem stays away.

When booting the Gutsy kernel, it promptly pops up again (within a day).



[Bug 181996] Re: NFS server: lockd: server not responding

2008-10-07 Thread Eckart Haug
Upgrading to hardy doesn't help; it's still the same:

client:
2.6.24-19-generic #1 SMP Wed Aug 20 22:56:21 UTC 2008 i686 GNU/Linux

says:
Oct  7 10:54:15 lagaffe kernel: [ 3099.897267] lockd: server tide not responding, still trying
Oct  7 10:54:16 lagaffe kernel: [ 3101.624752] lockd: server tide not responding, still trying


server:
2.6.24-19-server #1 SMP Sat Jul 12 00:40:01 UTC 2008 i686 GNU/Linux

says:
Oct  7 10:56:15 tide kernel: [3364891.912872] lockd: server lagaffe not responding, timed out
Oct  7 10:56:15 tide kernel: [3364891.912939] lockd: couldn't create RPC handle for lagaffe
Oct  7 10:56:15 tide kernel: [3364891.913118] rpcbind: server lagaffe not responding, timed out



[Bug 181996] Re: NFS server: lockd: server not responding

2008-09-01 Thread Bart Swennen
I've come to the same conclusion as Vincent in
https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.22/+bug/181996/comments/35 :
the 2.6.22-15 kernel seems not to have those patches applied. Any chance it
will in the near future?

I've looked at the sources in linux-source-2.6.22_2.6.22-15.58_all.deb

Upgrading to hardy is not (yet) an option, but we really would like to
use a `normal' Ubuntu-gutsy-kernel, which we cannot now because of this
bug.



[Bug 181996] Re: NFS server: lockd: server not responding

2008-08-18 Thread Vincent
After getting the same problem last week (lockd: server ... not
responding, timed out on client; unkillable lockd on server) I had a
look at the source of the linux-image-2.6.22-15-generic package that
we're using. To my surprise, I couldn't confirm that the patches
mentioned in
https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.22/+bug/181996/comments/15
had been applied. Can anyone comment on this?

Details:
$ dpkg -s linux-image-generic |grep ^Version
Version: 2.6.22.15.22
$ apt-get source linux-image-2.6.22-15-generic
[...]
$ less linux-source-2.6.22-2.6.22/fs/lockd/svclock.c



[Bug 181996] Re: NFS server: lockd: server not responding

2008-06-27 Thread Tim Gardner
Released in 2.6.22-14.53

** Changed in: linux-source-2.6.22 (Ubuntu)
   Status: Fix Committed => Fix Released



[Bug 181996] Re: NFS server: lockd: server not responding

2008-06-27 Thread Tim Gardner
Released in 2.6.22-14.53

** Changed in: linux-source-2.6.22 (Ubuntu Gutsy)
 Assignee: Tim Gardner (timg-tpi) => (unassigned)
   Status: Fix Committed => Fix Released



[Bug 181996] Re: NFS server: lockd: server not responding

2008-06-11 Thread Shang Wu
Any update on this? Has it been released yet?



[Bug 181996] Re: NFS server: lockd: server not responding

2008-05-28 Thread Tim Gardner
** Changed in: linux-source-2.6.22 (Ubuntu)
   Status: Triaged => Fix Committed



[Bug 181996] Re: NFS server: lockd: server not responding

2008-05-18 Thread Tim Gardner
There is a series of NFS patches pending in the SRU process. Any day
now...

** Changed in: linux-source-2.6.22 (Ubuntu Gutsy)
 Assignee: Colin King (colin-king) => Tim Gardner (timg-tpi)
   Status: Triaged => Fix Committed



[Bug 181996] Re: NFS server: lockd: server not responding

2008-05-16 Thread Colin King
** Changed in: linux-source-2.6.22 (Ubuntu Gutsy)
 Assignee: Ubuntu Kernel Team (ubuntu-kernel-team) => Colin King (colin-king)



[Bug 181996] Re: NFS server: lockd: server not responding

2008-05-09 Thread Russel Winder
I am now running Hardy with kernel 2.6.24-16-server and have not seen
this problem for 8 days now.  Is it the case that the kernel was patched
and this is a patched kernel?  If it is I am very happy and thankful to
those who did the debugging and the patching.  If not, then has the
problem been circumvented?

Thanks.



[Bug 181996] Re: NFS server: lockd: server not responding

2008-05-09 Thread Jesper Krogh
Well, since the problem is only present on a gutsy kernel, it is quite
obvious that you can't reproduce it on the hardy kernel. The patch above is
from the patch stream between gutsy and hardy.

Jesper



[Bug 181996] Re: NFS server: lockd: server not responding

2008-04-21 Thread JT
I'd just like to add that I have this problem too and thank all those
who have provided debugging info. This bug has been crippling my system
for some time, and confusing me greatly.

I would like to tentatively ask if there is any further progress with
adding the patch into a release update? I shall test the patches myself
to add another confirmed success with them (I hope) and report back.

I have to say I find it a little scary that this kernel version could go
out as a stable release with this bug in it. Do not many people use
NFS in ubuntu circles? I thought it would be considered an essential
service.

Thanks again for all your help.



[Bug 181996] Re: NFS server: lockd: server not responding

2008-04-10 Thread Leann Ogasawara
Hi Jesper,

Thanks so much for testing and the feedback.  I've reopened the Gutsy
nomination and have reassigned to the kernel team.

For anyone wanting more information about the Stable Release Policy also
refer to:  https://wiki.ubuntu.com/StableReleaseUpdates .

Thanks again for the testing and the help.  We definitely appreciate
your patience and cooperation.

** Changed in: linux (Ubuntu Gutsy)
   Status: New => Invalid

** Changed in: linux-source-2.6.22 (Ubuntu Gutsy)
   Importance: Undecided => High
 Assignee: (unassigned) => Ubuntu Kernel Team (ubuntu-kernel-team)
   Status: New => Triaged
   Target: None => gutsy-updates

** Changed in: linux-source-2.6.22 (Ubuntu)
   Importance: Undecided => High
 Assignee: (unassigned) => Ubuntu Kernel Team (ubuntu-kernel-team)
   Status: Confirmed => Triaged
   Target: None => gutsy-updates



[Bug 181996] Re: NFS server: lockd: server not responding

2008-04-04 Thread Jesper Krogh
SRU stands for Stable Release Update, which is described in the links
above: the process for getting fixes pushed to a stable release.



[Bug 181996] Re: NFS server: lockd: server not responding

2008-04-04 Thread Jesper Krogh
Changing to Confirmed, as described by Leann Ogasawara, now that the
patches are confirmed to work on a gutsy system.

** Changed in: linux-source-2.6.22 (Ubuntu)
   Status: Won't Fix => Confirmed



[Bug 181996] Re: NFS server: lockd: server not responding

2008-04-03 Thread Jesper Krogh
I can confirm that the above 2 patches solve the problem.

The problem is really grave, making the NFS server in gutsy barely
usable. The locking problem occurred about every second day here. I
applied the patches over a week ago and haven't seen the problem since.

Jesper



[Bug 181996] Re: NFS server: lockd: server not responding

2008-04-03 Thread Jesper Krogh
Leann Ogasawara: Should we provide more to get an SRU for this bug in
gutsy?


Jesper



[Bug 181996] Re: NFS server: lockd: server not responding

2008-04-03 Thread the.jxc
What's an SRU?  I'd love to know more about the process for getting
fixes into Ubuntu.  Please explain!



[Bug 181996] Re: NFS server: lockd: server not responding

2008-03-06 Thread Leann Ogasawara
** Also affects: linux (Ubuntu)
   Importance: Undecided
   Status: New



[Bug 181996] Re: NFS server: lockd: server not responding

2008-03-06 Thread Leann Ogasawara
From the comments here it seems this is resolved for the Hardy kernel so
marking Fix Released against the Hardy 'linux' kernel source package.
The kernel stable release update policy is fairly strict:
https://wiki.ubuntu.com/KernelUpdates .  If someone could confirm the
two patches mentioned in comment
https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/181996/comments/15
resolve the issue for Gutsy the kernel team may take this into
consideration for an SRU.  Until then, against 2.6.22 this will be
closed.  Thanks.

** Changed in: linux (Ubuntu)
   Status: New => Fix Released

** Changed in: linux-source-2.6.22 (Ubuntu)
   Status: New => Won't Fix



[Bug 181996] Re: NFS server: lockd: server not responding

2008-02-28 Thread Russel Winder
I am running fully up to date Gutsy server and am getting what I think
is the same problem as is reported here.  After an indeterminate amount
of time and/or activity, the [lockd] process on the server goes from S
state to D state and all queries from clients result in messages such
as:

Feb 28 07:59:36 balin kernel: [73693.569139] lockd: server dimen not responding, still trying

and hang forever.

I tried stopping and then starting nfs-common and nfs-kernel-server but
the [lockd] process remains and in state D.  Killing it explicitly has
no apparent effect.  A new [lockd] process appears in the process table
after the restart of nfs-kernel-server but it appears not to be used.

The only remedy appears to be to reboot the server and then it seems all
the clients.
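
For anyone wanting to check for the same symptom programmatically, the following is a minimal user-space sketch in C; it is an illustration and not something taken from this report. It assumes the usual Linux /proc/<pid>/stat layout (pid, command name in parentheses, then a one-letter state), and the function names proc_stat_state and read_proc_state are made up for this example. A state of D means uninterruptible sleep, which would be consistent with killing the process having no effect.

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract the one-letter process state from the contents of
 * /proc/<pid>/stat. The command name sits in parentheses and may itself
 * contain spaces or ')', so search for the LAST ')' in the line.
 * Returns '?' if the text does not look like a stat line.
 */
char proc_stat_state(const char *stat_line)
{
	const char *rparen = strrchr(stat_line, ')');

	if (rparen == NULL || rparen[1] != ' ' || rparen[2] == '\0')
		return '?';
	return rparen[2];
}

/*
 * Read /proc/<pid>/stat for the given pid and return its state letter
 * ('S' = interruptible sleep, 'D' = uninterruptible sleep, ...),
 * or '?' if the file cannot be read.
 */
char read_proc_state(long pid)
{
	char path[64];
	char line[1024];
	FILE *fp;
	char state = '?';

	snprintf(path, sizeof(path), "/proc/%ld/stat", pid);
	fp = fopen(path, "r");
	if (fp == NULL)
		return '?';
	if (fgets(line, sizeof(line), fp) != NULL)
		state = proc_stat_state(line);
	fclose(fp);
	return state;
}
```

Calling read_proc_state() with the pid of the [lockd] kernel thread would be expected to return 'D' while the server is in the stuck state described here, and 'S' when it is idle.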

It seems that the solution to the problem may now be known, so I guess
the question is when will an update to the Gutsy kernel be issued?  I
guess it goes without saying that it would be good if the kernel issued
with Hardy does not have this problem?

Thanks.



Re: [Bug 181996] Re: NFS server: lockd: server not responding

2008-02-28 Thread the.jxc
Yeah,

I know how to fix the problem, but I have no idea how to get a patch 
into Gutsy.  Any ideas who I would contact?

J.

Russel Winder wrote:
> I am running fully up to date Gutsy server and am getting what I think
> is the same problem as is reported here.  After an indeterminate amount
> of time and/or activity, the [lockd] process on the server goes from S
> state to D state and all queries from clients result in messages such
> as:
>
> Feb 28 07:59:36 balin kernel: [73693.569139] lockd: server dimen not responding, still trying
>
> and hang forever.
>
> I tried stopping and then starting nfs-common and nfs-kernel-server but
> the [lockd] process remains and in state D.  Killing it explicitly has
> no apparent effect.  A new [lockd] process appears in the process table
> after the restart of nfs-kernel-server but it appears not to be used.
>
> The only remedy appears to be to reboot the server and then it seems all
> the clients.
>
> It seems that the solution to the problem may now be known, so I guess
> the question is when will an update to the Gutsy kernel be issued?  I
> guess it goes without saying that it would be good if the kernel issued
> with Hardy does not have this problem?
>
> Thanks.




[Bug 181996] Re: NFS server: lockd: server not responding

2008-02-28 Thread Russel Winder
I would have thought that the Ubuntu Kernel Team would have looked at
this problem -- especially as there is a putative fix.  However, it
seems it may not yet have even been triaged by them.  The problem, at
least as I see it, is that there is no regularity to the failure.  This
must make it hard to actively work on.



Re: [Bug 181996] Re: NFS server: lockd: server not responding

2008-02-28 Thread the.jxc
No,

The failure is very regular.  It happens whenever the garbage 
collection is performed as a result of a lock request.

J.



[Bug 181996] Re: NFS server: lockd: server not responding

2008-02-28 Thread Brett Sealey
I've been seeing it for a while now, but only when I run an application
on the nfs client that intensively uses file locking.

The only fix is to reboot the server.

When it occurs, the following hangs on the client (in the flock):
   time flock ~/junk echo ok; rm ~/junk

[note: flock is in the util-linux package]

A fix in Gutsy seems simple and would be very nice.



[Bug 181996] Re: NFS server: lockd: server not responding

2008-02-11 Thread Ben Beuchler
Any progress on a patch?  I'm running into the same problem.  If not,
would you mind providing a bit more info describing the necessary steps
to get the 2.6.24 kernel installed on a Gutsy server?

Thanks...



[Bug 181996] Re: NFS server: lockd: server not responding

2008-02-11 Thread the.jxc
Second part is easy.  Fix yaird as above, download the .deb files, and
dpkg install them both.  I had no hassles with that.

J. Bruce Fields suggested the following two patches, but I didn't use
those.

http://git.linux-nfs.org/?p=trondmy/nfs-2.6.git;a=commitdiff;h=255129d1e9ca0ed3d69d5517fae3e03d7ab4b806

http://git.linux-nfs.org/?p=trondmy/nfs-2.6.git;a=commitdiff;h=a6d85430424d44e946e0946bfaad607115510989

...I just downloaded the ubuntu source for the kernel I had, and
manually patched the lockd driver.



[Bug 181996] Re: NFS server: lockd: server not responding

2008-01-31 Thread the.jxc
This is fixed in the 2.6.24 kernel series.

I installed:

linux-image-2.6.24-5-generic_2.6.24-5.8_i386.deb
linux-ubuntu-modules-2.6.24-5-generic_2.6.24-5.9_i386.deb

from:

http://packages.ubuntu.com/hardy/base/

(After making the changes to yaird required to install it)
vi /usr/lib/yaird/perl/Input.pm
--- Input.pm.orig   2007-10-22 18:29:27.0 +0200
+++ Input.pm        2007-12-11 15:39:52.0 +0100
@@ -54,6 +54,11 @@
                my $devLink = Conf::get('sysFs')
                        . "/class/input/$handler/device";
                my $hw = readlink ($devLink);
+               if (defined ($hw) && $hw =~ s!^(\.\./)+(class/input/input\d+)$!$2!) {
+                       # Linux 2.6.23 eventX -> inputX link
+                       $devLink = Conf::get('sysFs') . '/' . $hw . '/device';
+                       $hw = readlink ($devLink);
+               }
                if (defined ($hw)) {
                        unless ($hw =~ s!^(\.\./)+devices/!!) {
                                # imagine localised linux (/sys/geraete ...)

...it all works fine.  I'll try and track down the patchset required to
fix the Gibbon kernel.



[Bug 181996] Re: NFS server: lockd: server not responding

2008-01-20 Thread the.jxc
OK, I added more debug to /usr/src/linux/fs/lockd/host.c and installed a
new lockd module.  Seems like it's getting lost somewhere in
nlmsvc_mark_resources().  I'll keep digging.



[Bug 181996] Re: NFS server: lockd: server not responding

2008-01-20 Thread the.jxc
It's getting lost in nlm_inspect_file ().

[  693.679373] lockd: mutex acquired, checking 128 file hash entries
[  693.679375] lockd: got entry in list 58
[  693.679376] lockd: inspecting file

	dprintk("lockd: mutex acquired, checking %d file hash entries\n", FILE_NRHASH);
	for (i = 0; i < FILE_NRHASH; i++) {
		hlist_for_each_entry_safe(file, pos, next, &nlm_files[i], f_list) {
			dprintk("lockd: got entry in list %d\n", i);
			file->f_count++;
			mutex_unlock(&nlm_file_mutex);

			/* Traverse locks, blocks and shares of this file
			 * and update file->f_locks count */
			dprintk("lockd: inspecting file\n");
			if (nlm_inspect_file(host, file, match))
				ret = 1;

			dprintk("lockd: inspection complete\n");

...it never returns from nlm_inspect_file (...).



[Bug 181996] Re: NFS server: lockd: server not responding

2008-01-20 Thread the.jxc
Right, this appears (unsurprisingly) to be a mutex contention issue, on
the file-specific mutex.

See the attached trace: server-kirby-v3.dmsg

Key parts are:

[ 5845.725268] lockd: request from 192.168.1.210, port=860
[ 5845.725272] lockd: LOCK  called
[ 5845.725274] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
[ 5845.725504] lockd: get host romita
[ 5845.725506] lockd: found host in cache
[ 5845.725507] lockd: nsm_monitor(romita)
[ 5845.725509] lockd: nlm_file_lookup (01070001 00288001  926e57da d142d9c6 dabb48bd c2a30bcf 0028c75d)
[ 5845.725789] lockd: found file f7aa2840 (count 0)
[ 5845.725792] lockd: nlmsvc_lock(sda1/2672477, ty=0, pi=80357, 1073741826-1073742335, bl=0)
[ 5845.725806] lockd: nlmsvc_lookup_block f=f7aa2840 pd=80357 1073741826-1073742335 ty=0
[ 5845.725809] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
[ 5845.726186] lockd: host garbage collection
[ 5845.726188] lockd: nlmsvc_mark_resources
[ 5845.726189] lockd: nlm_traverse_files
[ 5845.726255] lockd: mutex acquired, checking 128 file hash entries
[ 5845.726257] lockd: got entry in list 29
[ 5845.726259] lockd: inspecting file f=f7afd480
[ 5845.726260] lockd: traverse blocks
[ 5845.726262] lockd: locking file mutex
[ 5845.726483] lockd: unlocking file mutex
[ 5845.726485] lockd: traverse shares
[ 5845.726486] lockd: traverse locks
[ 5845.726488] lockd: inspection complete
[ 5845.726625] lockd: check file for release
...
(Same pattern repeated for several other files)
...
[ 5845.728644] lockd: got entry in list 58
[ 5845.728645] lockd: inspecting file f=f7aa2840
[ 5845.728646] lockd: traverse blocks
[ 5845.728648] lockd: locking file mutex
...
The final debug is from nlmsvc_traverse_blocks() in /usr/src/linux/fs/lockd/svclock.c:

	dprintk ("lockd: locking file mutex\n");
	mutex_lock(&file->f_mutex);
	list_for_each_entry_safe(block, next, &file->f_blocks, b_flist) {
		dprintk ("lockd: trying block for host %p\n", host);
		...
	}
	dprintk ("lockd: unlocking file mutex\n");

And it's clear now that we're calling the mutex_lock and never leaving.

The important note is that all the previous file checks worked.  Why is a
mutex already taken on only this file?  Well, note that this is the file
from the request that actually triggered the GC.  Presumably a mutex is
taken out for this file, then we run the GC, and the GC attempts to take
the same mutex again.  I'll trawl the code and confirm this.

If so, the fix is probably to move the call to the GC so that it's outside
the handling for the actual RPC call.  In fact, the mutex isn't strictly
required for the GC, because in this case we're only counting host
references.  But it looks like we're doing our reference count by
piggybacking on some other code which actually does sweeps of file locks,
so we can't just remove the mutexes.

** Attachment added: server-kirby-v3.dmsg
   http://launchpadlibrarian.net/11456002/server-kirby-v3.dmsg

-- 
NFS server: lockd: server not responding
https://bugs.launchpad.net/bugs/181996
You received this bug notification because you are a member of Ubuntu
Bugs, which is the bug contact for Ubuntu.

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 181996] Re: NFS server: lockd: server not responding

2008-01-20 Thread the.jxc
Hmm... one quick-fix approach would seem to be to pass LOCK_RECURSIVE to
mutex_init.

On that note, it's not clear to me yet why we're even using mutexes
here.  Isn't there only a single lockd process?  And in that case, all
these mutexes are private, no?  Or is it possible to start two lockds
for higher performance (not something I've ever done)?

Alternatively, create a new function:

/*
 * Check to see if it's time to sweep the garbage out of the hosts structures.
 */
static void
nlm_gc_hosts_if_needed(void)
{
	if (time_after_eq(jiffies, next_gc))
		nlm_gc_hosts();
}

...remove the corresponding code from nlm_lookup_host (...), and invoke
nlm_gc_hosts_if_needed from somewhere outside the file-specific mutex
code.  Maybe in the lockd main loop, after each call to svc_process
(...).

I think I'll try that with my code.  I'm just a bit worried about
performance impact of making all file mutexes recursive.  Surely a
recursive mutex has to be a bit of a hit compared to the vanilla
version?
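
For illustration only, here is a rough userspace sketch of the "sweep from the main loop" idea.  It is not the lockd code: every demo_* name is made up, a plain flag stands in for the file mutex, and the 15000-tick interval is just the figure that shows up in the debug output.  The point is only that the GC check runs after the request has been handled, when no file mutex is held, so nothing needs to be recursive (as far as I know the kernel's struct mutex has no recursive mode anyway).

```c
#include <assert.h>

/* Hypothetical stand-ins for jiffies, next_gc and some bookkeeping. */
static unsigned long demo_jiffies;
static unsigned long demo_next_gc;
static int demo_file_mutex_held;   /* 1 while a request holds its file mutex */
static int demo_gc_runs;           /* number of sweeps performed */
static int demo_gc_under_mutex;    /* sweeps started with the mutex held */

#define DEMO_GC_INTERVAL 15000UL   /* assumed from the "+ 15000" in the logs */

/* Wraparound-safe "a >= b", in the style of the kernel's time_after_eq(). */
static int demo_time_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

/* The sweep itself: records whether it ran with the file mutex held. */
static void demo_gc_hosts(void)
{
    if (demo_file_mutex_held)
        demo_gc_under_mutex++;     /* this is the case that deadlocks */
    demo_gc_runs++;
    demo_next_gc = demo_jiffies + DEMO_GC_INTERVAL;
}

/* Run the sweep only when it is due. */
static void demo_gc_hosts_if_needed(void)
{
    if (demo_time_after_eq(demo_jiffies, demo_next_gc))
        demo_gc_hosts();
}

/* One request: the host lookup under the mutex no longer triggers GC. */
static void demo_process_request(void)
{
    demo_file_mutex_held = 1;
    /* ... host lookup and lock handling would happen here ... */
    demo_file_mutex_held = 0;
}

/* Main loop body: handle a request, then sweep with no file mutex held. */
static void demo_main_loop_step(unsigned long now)
{
    demo_jiffies = now;
    demo_process_request();
    demo_gc_hosts_if_needed();
}
```

Calling demo_main_loop_step() with increasing tick values runs the sweep only once the deadline has passed, and demo_gc_under_mutex stays at zero throughout.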



[Bug 181996] Re: NFS server: lockd: server not responding

2008-01-20 Thread the.jxc
OK, let's follow the request and see what is performing a file mutex
lock.

[ 5845.725268] lockd: request from 192.168.1.210, port=860
This is from the main lockd kernel thread function, static void lockd (...) in
svc.c.  It invokes svc_process().

[ 5845.725272] lockd: LOCK called
Via some xdr magic, the preprocessor, and a function lookup table, our main
handler function nlmsvc_proc_lock (...) from svcproc.c is called.

[ 5845.725274] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
[ 5845.725504] lockd: get host romita
[ 5845.725506] lockd: found host in cache
nlmsvc_proc_lock (...) invokes nlmsvc_retrieve_args (...), also in svcproc.c, to
get/parse some args, including the host.  In this case, the host is found in the
cache.

[ 5845.725507] lockd: nsm_monitor(romita)
nlmsvc_retrieve_args (...) also monitors the host in some way that isn't clear to
me yet.  It doesn't appear to be related to our problem, so that can be put aside
for now.

[ 5845.725509] lockd: nlm_file_lookup (01070001 00288001  926e57da d142d9c6 dabb48bd c2a30bcf 0028c75d)
nlmsvc_retrieve_args (...) also does a file lookup by calling nlm_lookup_file
(...).  This debug is from nlm_lookup_file (even though it says nlm_file_lookup).
We take out the file table mutex here, but not the file-specific mutex.  We
initialise the file mutex here, so from this point onwards we need to be looking
out for file-specific locks.
[ 5845.725789] lockd: found file f7aa2840 (count 0)

[ 5845.725792] lockd: nlmsvc_lock(sda1/2672477, ty=0, pi=80357, 1073741826-1073742335, bl=0)
Now nlmsvc_proc_lock (...) calls nlmsvc_lock (...) from svclock.c to do the actual
locking.  Very first thing, right after the debug, this takes out the mutex on the
file...

	/* Lock file against concurrent access */
	mutex_lock(&file->f_mutex);

The corresponding...
	mutex_unlock(&file->f_mutex);
...is right down the bottom of nlmsvc_lock (...)...
out:
	mutex_unlock(&file->f_mutex);
	nlmsvc_release_block(block);
	dprintk("lockd: nlmsvc_lock returned %u\n", ret);

...but we don't get that far.  I think we've found it then.  But let's
carry on...

[ 5845.725806] lockd: nlmsvc_lookup_block f=f7aa2840 pd=80357 1073741826-1073742335 ty=0
The call from nlmsvc_lock (...) to nlmsvc_lookup_block (...) is right after the
file-specific mutex lock is taken out.  We don't find an existing block, so
nlmsvc_lock (...) creates a new one by calling nlmsvc_create_block (...).

[ 5845.725809] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
nlmsvc_create_block (...) calls nlmsvc_lookup_host (...) ...

[ 5845.726186] lockd: host garbage collection
...which decides it's time to take out the trash.

[ 5845.726188] lockd: nlmsvc_mark_resources
[ 5845.726189] lockd: nlm_traverse_files
[ 5845.726255] lockd: mutex acquired, checking 128 file hash entries
[ 5845.726257] lockd: got entry in list 29
[ 5845.726259] lockd: inspecting file f=f7afd480
[ 5845.726260] lockd: traverse blocks
[ 5845.726262] lockd: locking file mutex
...which goes through all the files fine, until it comes to the specific file for
which we are currently serving the request... and hey presto.  Two mutexes are not
better than one.
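
To make the sequence above concrete, here is a small userspace model of it.  This is only a sketch under assumptions: it uses pthread error-checking mutexes rather than kernel mutexes, so the second lock attempt by the same thread returns EDEADLK instead of hanging the way lockd does, and all demo_* names and the four-entry file table are invented for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

#define DEMO_NFILES 4

/* Hypothetical stand-in for struct nlm_file: just the per-file mutex. */
struct demo_file {
    pthread_mutex_t f_mutex;
};

static struct demo_file demo_files[DEMO_NFILES];

/* Set up each per-file mutex as error-checking, so that relocking from
 * the same thread reports EDEADLK rather than blocking forever. */
static void demo_init(void)
{
    pthread_mutexattr_t attr;
    int i, rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    for (i = 0; i < DEMO_NFILES; i++) {
        rc = pthread_mutex_init(&demo_files[i].f_mutex, &attr);
        assert(rc == 0);
    }
    pthread_mutexattr_destroy(&attr);
}

/* Models the GC sweep: lock and unlock every file in turn.  Returns the
 * index of the first file this thread already holds, or -1 if the sweep
 * got through the whole table. */
static int demo_gc_sweep(void)
{
    int i;

    for (i = 0; i < DEMO_NFILES; i++) {
        int rc = pthread_mutex_lock(&demo_files[i].f_mutex);

        if (rc == EDEADLK)
            return i;
        assert(rc == 0);
        pthread_mutex_unlock(&demo_files[i].f_mutex);
    }
    return -1;
}

/* Models the LOCK request path: take the request's file mutex, then do
 * the host lookup, which runs the sweep when GC happens to be due. */
static int demo_lock_request(int file_idx, int gc_due)
{
    int stuck = -1;
    int rc = pthread_mutex_lock(&demo_files[file_idx].f_mutex);

    assert(rc == 0);
    if (gc_due)
        stuck = demo_gc_sweep();
    pthread_mutex_unlock(&demo_files[file_idx].f_mutex);
    return stuck;
}
```

After demo_init(), a request on file 2 with GC due gets through files 0 and 1 and then reports file 2 as the one it is stuck on, which is the same shape as the trace: every other file is inspected fine, and only the file being served collides.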



[Bug 181996] Re: NFS server: lockd: server not responding

2008-01-20 Thread the.jxc
I went with the nlm_gc_hosts_if_needed () approach.  Stable so far.
Debug shows completion of GC.

[ 6879.405447] lockd: request from 192.168.1.211, port=729
[ 6879.405454] lockd: LOCK  called
[ 6879.405458] lockd: nlm_lookup_host(192.168.1.211, p=6, v=4, my role=server, name=ditko)
[ 6879.405460] lockd: get host ditko
[ 6879.405461] lockd: found host in cache
[ 6879.405463] lockd: nsm_monitor(ditko)
[ 6879.405466] lockd: nlm_file_lookup (01070001 00288001  926e57da d142d9c6 dabb48bd c2a30bcf 002a0bca)
[ 6879.405470] lockd: creating file for (01070001 00288001  926e57da d142d9c6 dabb48bd c2a30bcf 002a0bca)
[ 6879.405477] lockd: found file f7a3fcc0 (count 0)
[ 6879.405481] lockd: nlmsvc_lock(sda1/2755530, ty=1, pi=95, 0-9223372036854775807, bl=0)
[ 6879.405484] lockd: nlmsvc_lookup_block f=f7a3fcc0 pd=95 0-9223372036854775807 ty=1
[ 6879.405487] lockd: nlm_lookup_host(192.168.1.211, p=6, v=4, my role=server, name=ditko)
[ 6879.405488] lockd: get host ditko
[ 6879.405489] lockd: found host in cache
[ 6879.405492] lockd: created block ef70db80...
[ 6879.405495] lockd: vfs_lock_file returned 0
[ 6879.405497] lockd: freeing block ef70db80...
[ 6879.405498] lockd: release host ditko
[ 6879.405500] lockd: nlm_release_file(f7a3fcc0, ct = 2)
[ 6879.405502] lockd: nlmsvc_lock returned 0
[ 6879.405503] lockd: LOCK  status 0
[ 6879.405504] lockd: release host ditko
[ 6879.405506] lockd: nlm_release_file(f7a3fcc0, ct = 1)
[ 6879.405512] lockd: host garbage collection
[ 6879.405513] lockd: nlmsvc_mark_resources
[ 6879.405515] lockd: nlm_traverse_files
[ 6879.405516] lockd: mutex acquired, checking 128 file hash entries
[ 6879.405519] lockd: got entry in list 109
[ 6879.405520] lockd: inspecting file f=f7a3fcc0
[ 6879.405521] lockd: traverse blocks
[ 6879.405525] lockd: locking file mutex
[ 6879.405526] lockd: unlocking file mutex
[ 6879.405527] lockd: traverse shares
[ 6879.405528] lockd: traverse locks
[ 6879.405530] lockd: inspection complete
[ 6879.405531] lockd: check file for release
[ 6879.405532] lockd: nlm_traverse_files finally releasing mutex
[ 6879.405533] lockd: nlm_traverse_files completed
[ 6879.405535] lockd: now removing inactive hostsnlm_gc_hosts skipping romita (cnt 0 use 0 exp 1672246)
[ 6879.405538] nlm_gc_hosts skipping ditko (cnt 0 use 1 exp 1672627)
[ 6879.405540] lockd: completed host garbage collection, next at (1642627 + 15000 = 1657627)
[ 6879.406106] lockd: request from 192.168.1.211, port=729
...

I'm missing a \n in a dprintk.  Otherwise looks sweet.



[Bug 181996] Re: NFS server: lockd: server not responding

2008-01-19 Thread the.jxc
OK, I turned on debug with:

echo 65535 > /proc/sys/sunrpc/nlm_debug

There seems to be a problem when lockd enters garbage collection.
Here's the last of the debug seen from lockd on the server side.

[ 2277.091005] lockd: request from 192.168.1.210, port=864
[ 2277.091018] lockd: LOCK  called
[ 2277.091022] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
[ 2277.091026] lockd: get host romita
[ 2277.091027] lockd: nsm_monitor(romita)
[ 2277.091031] lockd: nlm_file_lookup (01070001 00288001  926e57da d142d9c6 dabb48bd c2a30bcf 0028c75d)
[ 2277.091035] lockd: creating file for (01070001 00288001  926e57da d142d9c6 dabb48bd c2a30bcf 0028c75d)
[ 2277.091047] lockd: found file f7be3900 (count 0)
[ 2277.091050] lockd: nlmsvc_lock(sda1/2672477, ty=0, pi=58832, 1073741824-1073741824, bl=0)
[ 2277.091054] lockd: nlmsvc_lookup_block f=f7be3900 pd=58832 1073741824-1073741824 ty=0
[ 2277.091056] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
[ 2277.091058] lockd: get host romita
[ 2277.091062] lockd: created block ecbfe6c0...
[ 2277.091066] lockd: vfs_lock_file returned 0
[ 2277.091068] lockd: freeing block ecbfe6c0...
[ 2277.091069] lockd: release host romita
[ 2277.091071] lockd: nlm_release_file(f7be3900, ct = 2)
[ 2277.091073] lockd: nlmsvc_lock returned 0
[ 2277.091075] lockd: LOCK  status 0
[ 2277.091076] lockd: release host romita
[ 2277.091078] lockd: nlm_release_file(f7be3900, ct = 1)

[ 2277.091298] lockd: request from 192.168.1.210, port=864
[ 2277.091302] lockd: LOCK  called
[ 2277.091304] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
[ 2277.091306] lockd: get host romita
[ 2277.091307] lockd: nsm_monitor(romita)
[ 2277.091310] lockd: nlm_file_lookup (01070001 00288001  926e57da d142d9c6 dabb48bd c2a30bcf 0028c75d)
[ 2277.091316] lockd: found file f7be3900 (count 0)
[ 2277.091319] lockd: nlmsvc_lock(sda1/2672477, ty=0, pi=58832, 1073741826-1073742335, bl=0)
[ 2277.091322] lockd: nlmsvc_lookup_block f=f7be3900 pd=58832 1073741826-1073742335 ty=0
[ 2277.091325] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
[ 2277.091327] lockd: host garbage collection
[ 2277.091328] lockd: nlmsvc_mark_resources

Nothing more is seen from the lockd after the start of the GC.  Looking
at earlier GC runs from the syslog, the pattern is:

[ 2037.388911] lockd: nlm_lookup_host(192.168.1.210, p=6, v=4, my role=server, name=romita)
[ 2037.388914] lockd: host garbage collection
[ 2037.388916] lockd: nlmsvc_mark_resources
[ 2037.388920] nlm_gc_hosts skipping romita (cnt 0 use 0 exp 455264)
[ 2037.388922] nlm_gc_hosts skipping ditko (cnt 0 use 0 exp 460016)
[ 2037.388924] lockd: get host romita

So it finds a couple of entries (skips 'em) and then breaks out to carry
on immediately with get host.  I'm assuming that GC is invoked as part
of lookup handling, and doesn't just get triggered asynchronously.

Anyhow, this looks like a good spot to start digging.  I don't see
anything running in top (does lockd show in top?), but the process is
still in the ps table.  It just doesn't do anything any more.

[EMAIL PROTECTED]:/boot/grub# ps -ef|grep lockd
root      4715     2  0 14:19 ?        00:00:00 [lockd]

[EMAIL PROTECTED]:/boot/grub# top
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st

[EMAIL PROTECTED]:/boot/grub# uname -a
Linux kirby 2.6.22-14-generic #1 SMP Tue Dec 18 08:02:57 UTC 2007 i686 GNU/Linux

Machine is a dual-core Intel on a Shuttle board.  Hard disk is SATA.



[Bug 181996] Re: NFS server: lockd: server not responding

2008-01-19 Thread the.jxc

** Attachment added: Output from server's dmsg.
   http://launchpadlibrarian.net/11447297/server-kirby.dmsg



[Bug 181996] Re: NFS server: lockd: server not responding

2008-01-19 Thread the.jxc

** Attachment added: Output from client's dmesg.
   http://launchpadlibrarian.net/11447299/client-romita.dmsg



[Bug 181996] Re: NFS server: lockd: server not responding

2008-01-17 Thread the.jxc
I can confirm the same on a brand new Gutsy install with 2.6.22-14.
This usually occurs within 24 hours.  When it happens, Skype, Amarok
and other apps hang first, but eventually all apps hang.  Rebooting the
client does nothing.  Restarting nfs-kernel-server doesn't help (it
doesn't clear the broken lockd, see below).  Only a daily server reboot
will resolve anything.

On the client side (also 2.6.22-14) I see:
syslog.0:Jan 17 23:23:21 romita kernel: [28308.368819] lockd: server kirby not responding, still trying

On the server side, if I restart nfs-kernel-server, I see:
Jan 18 08:55:37 kirby kernel: [62797.376546] lockd_down: lockd failed to exit, clearing pid

...and on the server side I will now see TWO [lockd] processes where
before I saw one.

I don't have a 2.6.20 kernel to go back to on my new server.  This is
basically making my server totally unusable.  I'm looking at having to
drop nfs and use samba instead.  GACK!



[Bug 181996] Re: NFS server: lockd: server not responding

2008-01-17 Thread the.jxc
Feel free to contact me if I can offer help debugging this.



[Bug 181996] Re: NFS server: lockd: server not responding

2008-01-16 Thread Denis Sidorov
Since I downgraded the kernel to 2.6.20 (a week ago), the error has not shown
up again.
It appears to be a kernel bug, because I found a similar issue reported for
Fedora Core 7 running 2.6.22.



[Bug 181996] Re: NFS server: lockd: server not responding

2008-01-11 Thread Denis Sidorov
** Description changed:

- Running NFS server on Ubuntu Server 7.10 on x86 machine (Pentium-3, 1G RAM).
+ Running NFS server on Ubuntu Server 7.10 x86 (Pentium-3, 1G RAM).
  - linux-image-2.6.22-14-server (2.6.22-14.47)
  - nfs-kernel-server (1:1.1.1~git-20070709-3ubuntu1)
  - nfs-common (1:1.1.1~git-20070709-3ubuntu1)
  
  NFS clients (ubuntu, gentoo, fedora core) mount home directories from the server.
  Works fine for a while after reboot, but at some moment (30 minutes to several days after last reboot) client applications (firefox, thunderbird, openoffice, ...) would freeze at start and the following error message can be seen in the syslog:
  
  Jan 11 14:08:33 jig kernel: [ 5527.793749] lockd: server tango not responding, still trying
  Jan 11 14:08:34 jig kernel: [ 5529.029039] lockd: server tango not responding, still trying
  Jan 11 14:08:45 jig kernel: [ 5540.246812] lockd: server tango not responding, still trying
  
  The nfsd, rpc.statd, rpc.mountd processes keep running on server. No
- relevant errors can be found in server syslog neither.
+ relevant errors can be found in server syslog.
  
  Restarting the nfs-kernel-server (on server) and nfs-common (on both
  server and client) would not help - the problem persists.
  
  Have also tried nfs-user-server instead of nfs-kernel-server - no luck.
  
  The only way to make it work is to reboot the server.
