Hi.
I'm currently trying to optimize our NFS server. We're running in a
cluster setup with a single NFS server and some compute nodes pulling data
from it. Currently the dataset is less than 10GB, so it fits in the memory
of the NFS server (confirmed via vmstat 1).
Currently I'm getting around
On Tue, 5 Feb 2008 23:35:48 -0500
Christoph Hellwig [EMAIL PROTECTED] wrote:
On Tue, Feb 05, 2008 at 02:37:57PM -0500, Jeff Layton wrote:
Because kthread_stop blocks until the kthread actually goes down,
we have to send the signal before calling it. This means that there
is a very small
Hi,
On 02/06/2008 11:04:34 AM +0100, Jesper Krogh [EMAIL PROTECTED] wrote:
Hi.
I'm currently trying to optimize our NFS server. We're running in a
cluster setup with a single NFS server and some compute nodes pulling data
from it. Currently the dataset is less than 10GB so it fits in memory of
On Wed, 2008-02-06 at 19:24 +1300, Andrew Dixie wrote:
The fact that the delegreturn call appears to have hit xprt_timer is
interesting. Under normal circumstances, timeouts should never occur
under NFSv4. Could you tell us what mount options you're using here?
Also please could you
On Wed, 2008-02-06 at 10:07 -0500, J. Bruce Fields wrote:
That went into 2.6.22:
21315edd4877b593d5bf.. [PATCH] knfsd: nfsd4: demote clientid
in use printk to a dprintk
It may suggest a problem if this is happening a lot, though, right?
The client should always be able to
On Wed, 2008-02-06 at 15:37 +0100, Gabriel Barazer wrote:
Should I go for NFSv2 (the default if I don't change mount options),
NFSv3, or NFSv4?
NFSv2/3 have nearly the same performance
Only if you shoot yourself in the foot by setting the 'async' flag
in /etc/exports. Don't do that...
Most
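For reference, the sync/async behaviour being warned about here is set per export in /etc/exports. A hypothetical example (paths and the client netmask are made up):

```
# /etc/exports -- hypothetical paths and netmask
# Safe: the server replies only after data reaches stable storage
/export/data     192.168.0.0/24(rw,sync,no_subtree_check)

# Fast but unsafe: the server acks writes before committing them; a
# server crash can silently lose data the client believes was written
/export/scratch  192.168.0.0/24(rw,async,no_subtree_check)
```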
It's currently possible for an unresponsive NLM client to completely
lock up a server's lockd. The scenario is something like this:
1) client1 (or a process on the server) takes a lock on a file
2) client2 tries to take a blocking lock on the same file and
awaits the callback
3) client2 goes
This patchset fixes the problem that Bruce pointed out last week when
we were discussing the lockd-kthread patches.
The main problem is described in patch #1 and that patch also fixes the
DoS. The remaining patches clean up how GRANT_MSG callbacks handle an
unresponsive client. The goal in those
With the current scheme in nlmsvc_grant_blocked, we can end up with more
than one GRANT_MSG callback for a block in flight. Right now, we requeue
the block unconditionally so that a GRANT_MSG callback is done again in
30s. If the client is unresponsive, it can take more than 30s for the
call
It's possible for lockd to catch a SIGKILL while a GRANT_MSG callback
is in flight. If this happens we don't want lockd to insert the block
back into the nlm_blocked list.
This helps that situation, but there's still a possible race. Fixing
that will mean adding real locking for nlm_blocked.
On Wed, Feb 06, 2008 at 10:15:23AM -0500, Trond Myklebust wrote:
On Wed, 2008-02-06 at 10:07 -0500, J. Bruce Fields wrote:
That went into 2.6.22:
21315edd4877b593d5bf.. [PATCH] knfsd: nfsd4: demote clientid
in use printk to a dprintk
It may suggest a problem if this is
On Wed, 2008-02-06 at 12:23 -0500, J. Bruce Fields wrote:
On Wed, Feb 06, 2008 at 10:15:23AM -0500, Trond Myklebust wrote:
On Wed, 2008-02-06 at 10:07 -0500, J. Bruce Fields wrote:
That went into 2.6.22:
21315edd4877b593d5bf.. [PATCH] knfsd: nfsd4: demote clientid
in use
This is the tenth iteration of the patchset to convert lockd to use the
kthread API. This patchset is smaller than the earlier ones since some
of the patches in those sets have already been taken into Bruce's tree.
This set only changes lockd to use the kthread API.
The only real difference
Needed since the plan is to not have a svc_create_thread helper and to
have current users of that function just call kthread_run directly.
Signed-off-by: Jeff Layton [EMAIL PROTECTED]
Reviewed-by: NeilBrown [EMAIL PROTECTED]
Signed-off-by: J. Bruce Fields [EMAIL PROTECTED]
---
Have lockd_up start lockd using kthread_run. With this change,
lockd_down now blocks until lockd actually exits, so there's no longer any
need for the waitqueue code at the end of lockd_down. This also means
that only one lockd can be running at a time which simplifies the code
within lockd's main
On Wed, Feb 06, 2008 at 12:52:17PM -0500, Trond Myklebust wrote:
On Wed, 2008-02-06 at 12:23 -0500, J. Bruce Fields wrote:
On Wed, Feb 06, 2008 at 10:15:23AM -0500, Trond Myklebust wrote:
On Wed, 2008-02-06 at 10:07 -0500, J. Bruce Fields wrote:
That went into 2.6.22:
On Wed, 2008-02-06 at 13:21 -0500, Jeff Layton wrote:
Have lockd_up start lockd using kthread_run. With this change,
lockd_down now blocks until lockd actually exits, so there's no longer
need for the waitqueue code at the end of lockd_down. This also means
that only one lockd can be running
On Wed, 2008-02-06 at 19:24 +0100, Gabriel Barazer wrote:
Oops (tm)! Fortunately I do mostly reads, but maybe the exports(5) man
page should be updated. According to the man page, I thought that
although writes aren't committed to the block devices, the server-side
cache is correctly
On Wed, 06 Feb 2008 13:36:31 -0500
Trond Myklebust [EMAIL PROTECTED] wrote:
On Wed, 2008-02-06 at 13:21 -0500, Jeff Layton wrote:
Have lockd_up start lockd using kthread_run. With this change,
lockd_down now blocks until lockd actually exits, so there's no
longer need for the waitqueue
On Wed, 2008-02-06 at 13:47 -0500, Jeff Layton wrote:
There's no guarantee that kthread_stop() won't wake up lockd before
schedule_timeout() gets called, but after the last check for
kthread_should_stop().
Doesn't the BKL pretty much eliminate this race? (assuming you transform
that call to
On Wed, 06 Feb 2008 13:52:34 -0500
Trond Myklebust [EMAIL PROTECTED] wrote:
On Wed, 2008-02-06 at 13:47 -0500, Jeff Layton wrote:
There's no guarantee that kthread_stop() won't wake up lockd before
schedule_timeout() gets called, but after the last check for
kthread_should_stop().
Hi Gianluca-
On Feb 6, 2008, at 1:25 PM, Gianluca Alberici wrote:
Hello all,
Thanks to Chuck's help I finally decided to proceed to a git bisect
and found the bad patch. Is there anybody who has an idea why it
breaks userspace NFS servers as we have seen? Sorry for emailing
directly
On Wed, 6 Feb 2008 13:47:02 -0500
Jeff Layton [EMAIL PROTECTED] wrote:
On Wed, 06 Feb 2008 13:36:31 -0500
Trond Myklebust [EMAIL PROTECTED] wrote:
On Wed, 2008-02-06 at 13:21 -0500, Jeff Layton wrote:
Have lockd_up start lockd using kthread_run. With this change,
lockd_down now
Gabriel Barazer wrote:
On 02/06/2008 4:59:39 PM +0100, Jesper Krogh [EMAIL PROTECTED] wrote:
I have a similar setup, and I'm very curious about how you can read an
iowait value from the clients: On my nodes (server 2.6.21.5/clients
2.6.23.14), the iowait counter is only incremented when dealing
On Thu, Feb 07, 2008 at 10:19:06AM +1300, Andrew Dixie wrote:
Oh, right, I was confusing client and server reboot and assuming the
client would forget the uniquifier on server reboot. That's obviously
wrong! The client will forget its own uniquifier on client reboot, but
that's alright
On Wed, 06 Feb 2008 22:55:02 +0100
Gianluca Alberici [EMAIL PROTECTED] wrote:
I finally got it. The problem and solution were found 6 months ago, but
nobody cared... up to now those servers have not been maintained, and this
problem is not discussed anywhere other than the following link.
The bug
Oh, right, I was confusing client and server reboot and assuming the
client would forget the uniquifier on server reboot. That's obviously
wrong! The client will forget its own uniquifier on client reboot, but
that's alright since it's happy enough just to let that old state time
out at
What is rpciod doing while the machine hangs?
Does 'netstat -t' show an active tcp connection to the server?
Does tcpdump show any traffic going on the wire?
What server are you running against? From the error messages below, I
see it is a Linux machine, but which kernel is it
On Thu, 2008-02-07 at 11:40 +1300, Andrew Dixie wrote:
What is rpciod doing while the machine hangs?
Does 'netstat -t' show an active tcp connection to the server?
Does tcpdump show any traffic going on the wire?
What server are you running against? From the error messages below,
On Wed, 2008-02-06 at 14:09 -0500, Jeff Layton wrote:
On Wed, 06 Feb 2008 13:52:34 -0500
Trond Myklebust [EMAIL PROTECTED] wrote:
On Wed, 2008-02-06 at 13:47 -0500, Jeff Layton wrote:
There's no guarantee that kthread_stop() won't wake up lockd before
schedule_timeout() gets
2.6.23-stable review patch. If anyone has any objections, please let us know.
--
From: NeilBrown [EMAIL PROTECTED]
patch ba67a39efde8312e386c6f603054f8945433d91f in mainline.
When RPCSEC/GSS and krb5i is used, requests are padded, typically to a multiple
of 8 bytes. This can
On Feb 5, 2008, at 9:12 PM, Kevin Coffman wrote:
If the Mac server code can support other encryption types like Triple
DES and ArcFour, you shouldn't need to limit it to only the
des-cbc-crc key. The Linux nfs-utils code on the client should be
limiting the negotiated encryption type to des.
I
Hi:
I did some extensive digging into the codebase and I believe I have the
reason why exportfs -a flushes out the caches after NFS clients have
mounted the NFS filesystem.
The analysis is complicated, but here's the crux of the matter:
There is a difference in the /etc/exports and the kernel
Hi,
I've been looking at NLM_HOST_MAX in fs/lockd/host.c, as we have a
patch in SLES that makes it configurable, and the patch needs to
either go upstream or out the window...
But the code that uses NLM_HOST_MAX is weird! Look:
#define NLM_HOST_EXPIRE ((nrhosts > NLM_HOST_MAX)? 300
At a higher level, in general, I think the kernel exports table need not
match /etc/exports at all. When we run exportfs -a again, what the
codebase intends to do is the following:
1. Scan /etc/exports and verify that an entry exists (create one if not)
in its in-core exports table. Mark each of
On Wednesday February 6, [EMAIL PROTECTED] wrote:
+ dotdot.d_name.name = "..";
+ dotdot.d_name.len = 2;
+
+ lock_kernel();
+ if (!udf_find_entry(child->d_inode, &dotdot, &fibh, &cfi))
+ goto out_unlock;
Have you ever tried this? I think this could never work. UDF doesn't
Sorry, it does look like it indeed solved the problem. Clearly, I have
missed something in my analysis of the codebase. In any case, thanks a
lot.
Good night,
Ani
-Original Message-
From: Neil Brown [mailto:[EMAIL PROTECTED]
Sent: Wednesday, February 06, 2008 9:22 PM
To: Anirban