Re: [patch, try 1] Re: Trouble with NFSd under 6.1-Stable, any ideas?

2006-05-31 Thread Dmitry Pryanishnikov


Hello!

On Thu, 25 May 2006, Konstantin Belousov wrote:

> 	KASSERT(!(debug_mpsafenet == 1 && mtx_owned(&Giant)),
> 	    ("nfssvc_nfsd(): debug.mpsafenet=1 && Giant"));
>
> from nfsserver/nfs_syscalls.c, line 570.
>
> As I understand the problem, kern/vfs_lookup.c:lookup() can
> acquire additional locks on Giant, indicating this with the GIANTHELD
> flag in nd. All processing in nfsserver already runs with Giant held,
> so I simply dropped those extra locks after the return from lookup.
> A system with the patch applied survived a smoke test (the client did
> du on the mounted dir, the patch was generated from the exported fs, etc.).
> nfsd eats no more than 25% of CPU (with INVARIANTS).
>
> Users who reported the problem and are willing to help, please
> try the patch (generated against STABLE) and give feedback.


  Thank you very much. Your patch does fix the "nfssvc_nfsd():
debug.mpsafenet=1 && Giant" panic during an NFS mount of the server's /usr.

Oddly enough, an NFS mount of the server's / doesn't panic the server.
My kernel config contains options QUOTA; however, quotas are not enabled.
Please commit the fix. IMHO, long-term breakage of such basic functionality
(NFS server + quotas) in the -STABLE branch isn't a Good Thing (TM).

Sincerely, Dmitry
--
Atlantis ISP, System Administrator
e-mail:  [EMAIL PROTECTED]
nic-hdl: LYNX-RIPE
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: [patch, try 1] Re: Trouble with NFSd under 6.1-Stable, any ideas?

2006-05-31 Thread Kris Kennaway
On Thu, Jun 01, 2006 at 01:06:44AM +0300, Dmitry Pryanishnikov wrote:
 
> Hello!
>
> On Thu, 25 May 2006, Konstantin Belousov wrote:
> > 	KASSERT(!(debug_mpsafenet == 1 && mtx_owned(&Giant)),
> > 	    ("nfssvc_nfsd(): debug.mpsafenet=1 && Giant"));
> >
> > from nfsserver/nfs_syscalls.c, line 570.
> >
> > As I understand the problem, kern/vfs_lookup.c:lookup() can
> > acquire additional locks on Giant, indicating this with the GIANTHELD
> > flag in nd. All processing in nfsserver already runs with Giant held,
> > so I simply dropped those extra locks after the return from lookup.
> > A system with the patch applied survived a smoke test (the client did
> > du on the mounted dir, the patch was generated from the exported fs, etc.).
> > nfsd eats no more than 25% of CPU (with INVARIANTS).
> >
> > Users who reported the problem and are willing to help, please
> > try the patch (generated against STABLE) and give feedback.
>
>   Thank you very much. Your patch does fix the "nfssvc_nfsd():
> debug.mpsafenet=1 && Giant" panic during an NFS mount of the server's /usr.
> Oddly enough, an NFS mount of the server's / doesn't panic the server.
> My kernel config contains options QUOTA; however, quotas are not enabled.
> Please commit the fix. IMHO, long-term breakage of such basic functionality
> (NFS server + quotas) in the -STABLE branch isn't a Good Thing (TM).

FYI, if you're not using quotas then you should remove the option from
your kernel config to avoid trashing your performance.

Kris




Re: [patch, try 1] Re: Trouble with NFSd under 6.1-Stable, any ideas?

2006-05-26 Thread Dmitriy Kirhlarov
Hi!

On Thu, May 25, 2006 at 05:58:09PM +0300, Konstantin Belousov wrote:

> Users who reported the problem and are willing to help, please
> try the patch (generated against STABLE) and give feedback.

I tested it with RELENG_6 from 25 May 2006. It works fine. Thank you.

WBR
-- 
Dmitriy Kirhlarov
OILspace, 26 Leninskaya sloboda, bld. 2, 2nd floor, 115280 Moscow, Russia
P:+7 495 105 7247 ext.203 F:+7 495 105 7246 E:[EMAIL PROTECTED]
OILspace - The resource enriched - www.oilspace.com


[patch, try 1] Re: Trouble with NFSd under 6.1-Stable, any ideas?

2006-05-25 Thread Konstantin Belousov
On Thu, May 25, 2006 at 01:19:26AM -0400, Kris Kennaway wrote:
> On Wed, May 24, 2006 at 11:48:53PM -0400, Howard Leadmon wrote:
>
> >   So what's changed at that delta, under the one that works vfs_lookup.c is:
> >
> >    Edit src/sys/kern/vfs_lookup.c
> >      Add delta 1.80.2.6 2006.03.31.07.39.24 kris
> >
> >   Under the one that fails the vfs_lookup.c is:
> >
> >    Edit src/sys/kern/vfs_lookup.c
> >      Add delta 1.80.2.7 2006.04.30.03.57.46 kris
> >
> >    So I stand corrected on my last post, the issue is in fact in this module, as
> >   just taking that module back to 1.80.2.6 fixes the problem with my server.  I
> >   even took multiple NFS clients and gave them a heavy workload, and CPU still
> >   remained reasonable, and very responsive.  As soon as I rev to the new
> >   version, NFS breaks badly and even a single client doing something like a du
> >   of a directory structure results in sluggishness and extreme CPU usage.
>
> Yep, unfortunately this commit was necessary to fix other bugs.  Jeff
> said he should have time to look at it next week.
>
> Kris

I tried to debug the problem. First, I have to admit that I cannot
reproduce the problem on a GENERIC kernel. Only after QUOTA was added
and, correspondingly, UFS started to require Giant,
did I get the described behaviour. Below are the changes to the GENERIC
config file that I made to reproduce the problem.

Index: amd64/conf/GENERIC
===================================================================
RCS file: /usr/local/arch/ncvs/src/sys/amd64/conf/GENERIC,v
retrieving revision 1.439.2.11
diff -u -r1.439.2.11 GENERIC
--- amd64/conf/GENERIC	30 Apr 2006 17:39:43 -0000	1.439.2.11
+++ amd64/conf/GENERIC	25 May 2006 14:44:14 -0000
@@ -26,6 +26,19 @@
 #hints		"GENERIC.hints"		# Default places to look for devices.
 
 makeoptions	DEBUG=-g		# Build kernel with gdb(1) debug symbols
+options 	KDB
+options 	KDB_TRACE
+#options 	KDB_UNATTENDED
+options 	DDB
+options 	DDB_NUMSYM
+options 	BREAK_TO_DEBUGGER
+options 	INVARIANTS
+options 	INVARIANT_SUPPORT
+options 	WITNESS
+options 	DEBUG_LOCKS
+options 	DEBUG_VFS_LOCKS
+options 	DIAGNOSTIC
+options 	MUTEX_PROFILING
 
 #options 	SCHED_ULE		# ULE scheduler
 options 	SCHED_4BSD		# 4BSD scheduler
@@ -34,6 +47,7 @@
 options 	INET6			# IPv6 communications protocols
 options 	FFS			# Berkeley Fast Filesystem
 options 	SOFTUPDATES		# Enable FFS soft updates support
+options 	QUOTA
 options 	UFS_ACL			# Support for access control lists
 options 	UFS_DIRHASH		# Improve performance on big directories
 options 	MD_ROOT			# MD is a potential root device

After that, the server machine easily panics on

	KASSERT(!(debug_mpsafenet == 1 && mtx_owned(&Giant)),
	    ("nfssvc_nfsd(): debug.mpsafenet=1 && Giant"));

from nfsserver/nfs_syscalls.c, line 570.

As I understand the problem, kern/vfs_lookup.c:lookup() can
acquire additional locks on Giant, indicating this with the GIANTHELD
flag in nd. All processing in nfsserver already runs with Giant held,
so I simply dropped those extra locks after the return from lookup.
A system with the patch applied survived a smoke test (the client did
du on the mounted dir, the patch was generated from the exported fs, etc.).
nfsd eats no more than 25% of CPU (with INVARIANTS).
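
To make the reasoning above concrete, here is a minimal user-space sketch in C.
It is not the kernel code: the sk_ names, the depth counter standing in for a
recursive mutex, and the single-request flow are all assumptions made purely
for illustration. It models Giant as a recursion counter, lets the stand-in
lookup take one extra reference and set a flag, and shows that the extra
reference only goes away if the caller drops it after the lookup returns.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
#define SK_GIANTHELD 0x1u            /* models the GIANTHELD bit in cn_flags */

struct sk_mutex {
    int depth;                       /* recursion depth; 0 means not owned */
};

struct sk_nameidata {
    unsigned int cn_flags;
};

static void sk_lock(struct sk_mutex *m)
{
    m->depth++;
}

static void sk_unlock(struct sk_mutex *m)
{
    assert(m->depth > 0);            /* unlocking an unowned mutex is a bug */
    m->depth--;
}

/*
 * Models a lookup on a filesystem that needs Giant: it takes one more
 * reference and records that fact in the flag for the caller to see.
 */
static int sk_lookup(struct sk_mutex *giant, struct sk_nameidata *nd,
    bool fs_needs_giant)
{
    if (fs_needs_giant) {
        sk_lock(giant);
        nd->cn_flags |= SK_GIANTHELD;
    }
    return 0;
}

/* Models the step added after each lookup call: drop the extra reference. */
static void sk_drop_extra_giant(struct sk_mutex *giant,
    struct sk_nameidata *nd)
{
    if (nd->cn_flags & SK_GIANTHELD) {
        sk_unlock(giant);
        nd->cn_flags &= ~SK_GIANTHELD;
    }
}

/*
 * Simulates one request. Returns the Giant recursion depth left after the
 * server has released its own single reference; a non-zero result means
 * Giant is still owned at the point where the server expects it not to be.
 */
int sk_serve_request(bool fs_needs_giant, bool patched)
{
    struct sk_mutex giant = { 0 };
    struct sk_nameidata nd = { 0 };
    int error;

    sk_lock(&giant);                 /* server code runs with Giant held */
    error = sk_lookup(&giant, &nd, fs_needs_giant);
    (void)error;
    if (patched)
        sk_drop_extra_giant(&giant, &nd);
    sk_unlock(&giant);               /* server releases its own reference */
    return giant.depth;
}
```

Under these assumptions, sk_serve_request(true, false) leaves a depth of 1,
i.e. Giant is still owned, which is the state the KASSERT complains about,
while sk_serve_request(true, true) leaves a depth of 0. With a filesystem that
does not need Giant, the result is 0 either way, which would match the panic
not showing up on a GENERIC kernel without QUOTA.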

Users who reported the problem and are willing to help, please
try the patch (generated against STABLE) and give feedback.

Index: nfsserver/nfs_serv.c
===================================================================
RCS file: /usr/local/arch/ncvs/src/sys/nfsserver/nfs_serv.c,v
retrieving revision 1.156.2.2
diff -u -r1.156.2.2 nfs_serv.c
--- nfsserver/nfs_serv.c	13 Mar 2006 03:06:49 -0000	1.156.2.2
+++ nfsserver/nfs_serv.c	25 May 2006 14:44:25 -0000
@@ -569,6 +569,10 @@
 
 	error = lookup(&ind);
 	ind.ni_dvp = NULL;
+	if (ind.ni_cnd.cn_flags & GIANTHELD) {
+		mtx_unlock(&Giant);
+		ind.ni_cnd.cn_flags &= ~GIANTHELD;
+	}
 
 	if (error == 0) {
 		/*
@@ -1915,6 +1919,10 @@
 
 	error = lookup(&nd);
 	nd.ni_dvp = NULL;
+	if (nd.ni_cnd.cn_flags & GIANTHELD) {
+		mtx_unlock(&Giant);
+		nd.ni_cnd.cn_flags &= ~GIANTHELD;
+	}
 	if (error)
 		goto ereply;
 
@@ -2141,6 +2149,10 @@
 
 	error = lookup(&nd);
 	nd.ni_dvp = NULL;
+	if (nd.ni_cnd.cn_flags & GIANTHELD) {
+		mtx_unlock(&Giant);
+		nd.ni_cnd.cn_flags &= ~GIANTHELD;
+   

Re: [patch, try 1] Re: Trouble with NFSd under 6.1-Stable, any ideas?

2006-05-25 Thread Rong-en Fan

On 5/25/06, Konstantin Belousov [EMAIL PROTECTED] wrote:

> On Thu, May 25, 2006 at 01:19:26AM -0400, Kris Kennaway wrote:
> > On Wed, May 24, 2006 at 11:48:53PM -0400, Howard Leadmon wrote:
> >
> > >   So what's changed at that delta, under the one that works vfs_lookup.c is:
> > >
> > >    Edit src/sys/kern/vfs_lookup.c
> > >      Add delta 1.80.2.6 2006.03.31.07.39.24 kris
> > >
> > >   Under the one that fails the vfs_lookup.c is:
> > >
> > >    Edit src/sys/kern/vfs_lookup.c
> > >      Add delta 1.80.2.7 2006.04.30.03.57.46 kris
> > >
> > >    So I stand corrected on my last post, the issue is in fact in this module, as
> > >   just taking that module back to 1.80.2.6 fixes the problem with my server.  I
> > >   even took multiple NFS clients and gave them a heavy workload, and CPU still
> > >   remained reasonable, and very responsive.  As soon as I rev to the new
> > >   version, NFS breaks badly and even a single client doing something like a du
> > >   of a directory structure results in sluggishness and extreme CPU usage.
> >
> > Yep, unfortunately this commit was necessary to fix other bugs.  Jeff
> > said he should have time to look at it next week.
> >
> > Kris
>
> I tried to debug the problem. First, I have to admit that I cannot
> reproduce the problem on a GENERIC kernel. Only after QUOTA was added
> and, correspondingly, UFS started to require Giant,
> did I get the described behaviour. Below are the changes to the GENERIC
> config file that I made to reproduce the problem.


[...]

> After that, the server machine easily panics on
>
> 	KASSERT(!(debug_mpsafenet == 1 && mtx_owned(&Giant)),
> 	    ("nfssvc_nfsd(): debug.mpsafenet=1 && Giant"));
>
> from nfsserver/nfs_syscalls.c, line 570.
>
> As I understand the problem, kern/vfs_lookup.c:lookup() can
> acquire additional locks on Giant, indicating this with the GIANTHELD
> flag in nd. All processing in nfsserver already runs with Giant held,
> so I simply dropped those extra locks after the return from lookup.
> A system with the patch applied survived a smoke test (the client did
> du on the mounted dir, the patch was generated from the exported fs, etc.).
> nfsd eats no more than 25% of CPU (with INVARIANTS).
>
> Users who reported the problem and are willing to help, please
> try the patch (generated against STABLE) and give feedback.


[...]

Hi Konstantin and others,

I'm now running RELENG_6_1 as of Apr 30 04:00 UTC source + your
patch. The nfsd is quite happy! After client's du finishes, it
stays idle as expected (eats 0.00% CPU).

Thank you very much.

Regards,
Rong-En Fan


Re: [patch, try 1] Re: Trouble with NFSd under 6.1-Stable, any ideas?

2006-05-25 Thread Kris Kennaway
On Thu, May 25, 2006 at 05:58:09PM +0300, Konstantin Belousov wrote:

> +options 	QUOTA
>  options 	UFS_ACL			# Support for access control lists
>  options 	UFS_DIRHASH		# Improve performance on big directories
>  options 	MD_ROOT			# MD is a potential root device
>
> After that, the server machine easily panics on
>
> 	KASSERT(!(debug_mpsafenet == 1 && mtx_owned(&Giant)),
> 	    ("nfssvc_nfsd(): debug.mpsafenet=1 && Giant"));
>
> from nfsserver/nfs_syscalls.c, line 570.

OK, I am also seeing this panic when I try and export a non-mpsafe
filesystem (e.g. cd9660).  I can't test the patch because my NFS
server subsequently blew up :-(

Kris

