Re: [patch, try 1] Re: Trouble with NFSd under 6.1-Stable, any ideas?
Hello!

On Thu, 25 May 2006, Konstantin Belousov wrote:
> KASSERT(!(debug_mpsafenet == 1 && mtx_owned(&Giant)),
>     ("nfssvc_nfsd(): debug.mpsafenet=1 && Giant"));
> from nfsserver/nfs_syscalls.c, line 570.
>
> As I understand the problem, kern/vfs_lookup.c:lookup() could acquire
> additional locks on Giant, indicating this by the GIANTHELD flag in nd.
> All processing in nfsserver already goes with Giant held, so I just
> dropped those excess locks after return from lookup.
>
> System with patch applied survived smoke test (client did du on mounted
> dir, patch was generated from exported fs, etc.). nfsd eats no more than
> 25% of CPU (with INVARIANTS).
>
> Please, users who reported the problem and are willing to help, try the
> patch (generated against STABLE) and give feedback.

Thank you very much. Your patch actually fixes the "nfssvc_nfsd(): debug.mpsafenet=1 && Giant" panic during an NFS mount of the server's /usr. Oddly enough, an NFS mount of the server's / doesn't panic the server. My kernel config contains "options QUOTA"; however, quotas are not enabled.

Please commit the fix. IMHO, long-term breakage of such basic functionality (NFS server + quotas) in the -STABLE branch isn't a Good Thing (TM).

Sincerely, Dmitry
--
Atlantis ISP, System Administrator
e-mail: [EMAIL PROTECTED]
nic-hdl: LYNX-RIPE
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: [patch, try 1] Re: Trouble with NFSd under 6.1-Stable, any ideas?
On Thu, Jun 01, 2006 at 01:06:44AM +0300, Dmitry Pryanishnikov wrote:
> Hello!
>
> On Thu, 25 May 2006, Konstantin Belousov wrote:
> > KASSERT(!(debug_mpsafenet == 1 && mtx_owned(&Giant)),
> >     ("nfssvc_nfsd(): debug.mpsafenet=1 && Giant"));
> > from nfsserver/nfs_syscalls.c, line 570.
> >
> > As I understand the problem, kern/vfs_lookup.c:lookup() could acquire
> > additional locks on Giant, indicating this by the GIANTHELD flag in nd.
> > All processing in nfsserver already goes with Giant held, so I just
> > dropped those excess locks after return from lookup.
> >
> > System with patch applied survived smoke test (client did du on mounted
> > dir, patch was generated from exported fs, etc.). nfsd eats no more than
> > 25% of CPU (with INVARIANTS).
> >
> > Please, users who reported the problem and are willing to help, try the
> > patch (generated against STABLE) and give feedback.
>
> Thank you very much. Your patch actually fixes the
> "nfssvc_nfsd(): debug.mpsafenet=1 && Giant" panic during an NFS mount of
> the server's /usr. Oddly enough, an NFS mount of the server's / doesn't
> panic the server. My kernel config contains "options QUOTA"; however,
> quotas are not enabled.
>
> Please commit the fix. IMHO, long-term breakage of such basic
> functionality (NFS server + quotas) in the -STABLE branch isn't a
> Good Thing (TM).

FYI, if you're not using quotas then you should remove the option from your kernel config to avoid trashing your performance.

Kris
Re: [patch, try 1] Re: Trouble with NFSd under 6.1-Stable, any ideas?
Hi!

On Thu, May 25, 2006 at 05:58:09PM +0300, Konstantin Belousov wrote:
> Please, users who reported the problem and are willing to help, try the
> patch (generated against STABLE) and give feedback.

I tested it with RELENG_6 from 25 May 2006. It works fine. Thank you.

WBR
--
Dmitriy Kirhlarov
OILspace, 26 Leninskaya sloboda, bld. 2, 2nd floor, 115280 Moscow, Russia
P:+7 495 105 7247 ext.203 F:+7 495 105 7246 E:[EMAIL PROTECTED]
OILspace - The resource enriched - www.oilspace.com
[patch, try 1] Re: Trouble with NFSd under 6.1-Stable, any ideas?
On Thu, May 25, 2006 at 01:19:26AM -0400, Kris Kennaway wrote:
> On Wed, May 24, 2006 at 11:48:53PM -0400, Howard Leadmon wrote:
> > So what's changed at that delta? Under the one that works,
> > vfs_lookup.c is:
> >
> >  Edit src/sys/kern/vfs_lookup.c
> >   Add delta 1.80.2.6 2006.03.31.07.39.24 kris
> >
> > Under the one that fails, vfs_lookup.c is:
> >
> >  Edit src/sys/kern/vfs_lookup.c
> >   Add delta 1.80.2.7 2006.04.30.03.57.46 kris
> >
> > So I stand corrected on my last post: the issue is in fact in this
> > module, as just taking that module back to 1.80.2.6 fixes the problem
> > with my server. I even took multiple NFS clients and gave them a heavy
> > workload, and CPU usage still remained reasonable and the server very
> > responsive. As soon as I rev to the new version, NFS breaks badly, and
> > even a single client doing something like a du of a directory structure
> > results in sluggishness and extreme CPU usage.
>
> Yep, unfortunately this commit was necessary to fix other bugs. Jeff
> said he should have time to look at it next week.
>
> Kris

I tried to debug the problem. First, I have to admit that I cannot reproduce the problem on a GENERIC kernel. Only after QUOTA was added and, correspondingly, UFS started to require Giant, did I get the described behaviour. Below are the changes to the GENERIC config file that I made to reproduce the problem.

Index: amd64/conf/GENERIC
===================================================================
RCS file: /usr/local/arch/ncvs/src/sys/amd64/conf/GENERIC,v
retrieving revision 1.439.2.11
diff -u -r1.439.2.11 GENERIC
--- amd64/conf/GENERIC 30 Apr 2006 17:39:43 - 1.439.2.11
+++ amd64/conf/GENERIC 25 May 2006 14:44:14 -
@@ -26,6 +26,19 @@
 #hints "GENERIC.hints" # Default places to look for devices.
 makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols
+options KDB
+options KDB_TRACE
+#options KDB_UNATTENDED
+options DDB
+options DDB_NUMSYM
+options BREAK_TO_DEBUGGER
+options INVARIANTS
+options INVARIANT_SUPPORT
+options WITNESS
+options DEBUG_LOCKS
+options DEBUG_VFS_LOCKS
+options DIAGNOSTIC
+options MUTEX_PROFILING
 #options SCHED_ULE # ULE scheduler
 options SCHED_4BSD # 4BSD scheduler
@@ -34,6 +47,7 @@
 options INET6 # IPv6 communications protocols
 options FFS # Berkeley Fast Filesystem
 options SOFTUPDATES # Enable FFS soft updates support
+options QUOTA
 options UFS_ACL # Support for access control lists
 options UFS_DIRHASH # Improve performance on big directories
 options MD_ROOT # MD is a potential root device

After that, the server machine easily panics on

KASSERT(!(debug_mpsafenet == 1 && mtx_owned(&Giant)),
    ("nfssvc_nfsd(): debug.mpsafenet=1 && Giant"));

from nfsserver/nfs_syscalls.c, line 570.

As I understand the problem, kern/vfs_lookup.c:lookup() could acquire additional locks on Giant, indicating this by the GIANTHELD flag in nd. All processing in nfsserver already goes with Giant held, so I just dropped those excess locks after return from lookup.

System with patch applied survived smoke test (client did du on mounted dir, patch was generated from exported fs, etc.). nfsd eats no more than 25% of CPU (with INVARIANTS).

Please, users who reported the problem and are willing to help, try the patch (generated against STABLE) and give feedback.
Index: nfsserver/nfs_serv.c
===================================================================
RCS file: /usr/local/arch/ncvs/src/sys/nfsserver/nfs_serv.c,v
retrieving revision 1.156.2.2
diff -u -r1.156.2.2 nfs_serv.c
--- nfsserver/nfs_serv.c 13 Mar 2006 03:06:49 - 1.156.2.2
+++ nfsserver/nfs_serv.c 25 May 2006 14:44:25 -
@@ -569,6 +569,10 @@
 	error = lookup(&ind);
 	ind.ni_dvp = NULL;
+	if (ind.ni_cnd.cn_flags & GIANTHELD) {
+		mtx_unlock(&Giant);
+		ind.ni_cnd.cn_flags &= ~GIANTHELD;
+	}
 	if (error == 0) {
 		/*
@@ -1915,6 +1919,10 @@
 	error = lookup(&nd);
 	nd.ni_dvp = NULL;
+	if (nd.ni_cnd.cn_flags & GIANTHELD) {
+		mtx_unlock(&Giant);
+		nd.ni_cnd.cn_flags &= ~GIANTHELD;
+	}
 	if (error)
 		goto ereply;
@@ -2141,6 +2149,10 @@
 	error = lookup(&nd);
 	nd.ni_dvp = NULL;
+	if (nd.ni_cnd.cn_flags & GIANTHELD) {
+		mtx_unlock(&Giant);
+		nd.ni_cnd.cn_flags &= ~GIANTHELD;
+
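
[Archive note] The locking pattern described in the message above can be modelled in user space. What follows is only a rough sketch, assuming Giant behaves like a recursive lock whose hold count is incremented by lookup() and recorded in a flag. Every name in it (model_mutex, model_nameidata, MODEL_GIANTHELD, model_lookup, model_drop_lookup_giant, model_serve_request) is hypothetical and is not part of the FreeBSD kernel API. It is not the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the GIANTHELD flag mentioned in the thread. */
#define MODEL_GIANTHELD 0x1

/* Hypothetical model of a recursive lock: only the hold count is tracked. */
struct model_mutex {
	int depth;
};

/* Hypothetical model of the lookup state that carries the flag. */
struct model_nameidata {
	int flags;
};

static void
model_lock(struct model_mutex *m)
{
	m->depth++;
}

static void
model_unlock(struct model_mutex *m)
{
	assert(m->depth > 0);	/* unlocking an unheld lock is a bug */
	m->depth--;
}

/*
 * Models a lookup on a filesystem that needs the big lock: it takes one
 * extra hold and records that fact in the flags for the caller.
 */
static int
model_lookup(struct model_mutex *giant, struct model_nameidata *nd,
    bool fs_needs_giant)
{
	if (fs_needs_giant) {
		model_lock(giant);
		nd->flags |= MODEL_GIANTHELD;
	}
	return (0);
}

/* Caller-side pattern: drop the extra hold if the lookup took one. */
static void
model_drop_lookup_giant(struct model_mutex *giant, struct model_nameidata *nd)
{
	if (nd->flags & MODEL_GIANTHELD) {
		model_unlock(giant);
		nd->flags &= ~MODEL_GIANTHELD;
	}
}

/*
 * Serves one request while already holding the lock once, as the thread
 * says the NFS server does. Returns the hold count left over afterwards;
 * a non-zero value is the state the KASSERT in the thread complains about.
 */
static int
model_serve_request(bool fs_needs_giant, bool apply_fix)
{
	struct model_mutex giant = { 0 };
	struct model_nameidata nd = { 0 };

	model_lock(&giant);		/* server's own hold */
	(void)model_lookup(&giant, &nd, fs_needs_giant);
	if (apply_fix)
		model_drop_lookup_giant(&giant, &nd);
	model_unlock(&giant);		/* server releases its own hold */
	return (giant.depth);
}
```

In this model, a request against a filesystem that needs the lock leaves one hold behind unless the caller drops it after the lookup, and a request against a filesystem that does not need the lock is unaffected either way. That would be consistent with the panic only being reproducible once QUOTA made UFS require Giant.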
Re: [patch, try 1] Re: Trouble with NFSd under 6.1-Stable, any ideas?
On 5/25/06, Konstantin Belousov [EMAIL PROTECTED] wrote:
> On Thu, May 25, 2006 at 01:19:26AM -0400, Kris Kennaway wrote:
> > On Wed, May 24, 2006 at 11:48:53PM -0400, Howard Leadmon wrote:
> > > So what's changed at that delta? Under the one that works,
> > > vfs_lookup.c is:
> > >
> > >  Edit src/sys/kern/vfs_lookup.c
> > >   Add delta 1.80.2.6 2006.03.31.07.39.24 kris
> > >
> > > Under the one that fails, vfs_lookup.c is:
> > >
> > >  Edit src/sys/kern/vfs_lookup.c
> > >   Add delta 1.80.2.7 2006.04.30.03.57.46 kris
> > >
> > > So I stand corrected on my last post: the issue is in fact in this
> > > module, as just taking that module back to 1.80.2.6 fixes the problem
> > > with my server. I even took multiple NFS clients and gave them a heavy
> > > workload, and CPU usage still remained reasonable and the server very
> > > responsive. As soon as I rev to the new version, NFS breaks badly, and
> > > even a single client doing something like a du of a directory
> > > structure results in sluggishness and extreme CPU usage.
> >
> > Yep, unfortunately this commit was necessary to fix other bugs. Jeff
> > said he should have time to look at it next week.
> >
> > Kris
>
> I tried to debug the problem. First, I have to admit that I cannot
> reproduce the problem on a GENERIC kernel. Only after QUOTA was added
> and, correspondingly, UFS started to require Giant, did I get the
> described behaviour. Below are the changes to the GENERIC config file
> that I made to reproduce the problem.
>
> [...]
>
> After that, the server machine easily panics on
> KASSERT(!(debug_mpsafenet == 1 && mtx_owned(&Giant)),
>     ("nfssvc_nfsd(): debug.mpsafenet=1 && Giant"));
> from nfsserver/nfs_syscalls.c, line 570.
>
> As I understand the problem, kern/vfs_lookup.c:lookup() could acquire
> additional locks on Giant, indicating this by the GIANTHELD flag in nd.
> All processing in nfsserver already goes with Giant held, so I just
> dropped those excess locks after return from lookup.
>
> System with patch applied survived smoke test (client did du on mounted
> dir, patch was generated from exported fs, etc.). nfsd eats no more than
> 25% of CPU (with INVARIANTS).
>
> Please, users who reported the problem and are willing to help, try the
> patch (generated against STABLE) and give feedback.
>
> [...]

Hi Konstantin and others,

I'm now running RELENG_6_1 source as of Apr 30 04:00 UTC plus your patch. The nfsd is quite happy! After the client's du finishes, it stays idle as expected (eats 0.00% CPU).

Thank you very much.

Regards,
Rong-En Fan
Re: [patch, try 1] Re: Trouble with NFSd under 6.1-Stable, any ideas?
On Thu, May 25, 2006 at 05:58:09PM +0300, Konstantin Belousov wrote:
> +options QUOTA
>  options UFS_ACL # Support for access control lists
>  options UFS_DIRHASH # Improve performance on big directories
>  options MD_ROOT # MD is a potential root device
>
> After that, the server machine easily panics on
> KASSERT(!(debug_mpsafenet == 1 && mtx_owned(&Giant)),
>     ("nfssvc_nfsd(): debug.mpsafenet=1 && Giant"));
> from nfsserver/nfs_syscalls.c, line 570.

OK, I am also seeing this panic when I try to export a non-MPSAFE filesystem (e.g. cd9660). I can't test the patch because my NFS server subsequently blew up :-(

Kris