Re: system slowdown - vnode related
>> I still have the vnodes problem in 4.8-stable with /sys/kern/vfs_subr.c 1.249.2.30.
>
> Ishizuka-san, could you possibly try the following command line repeatedly
> while the slowdown is being observed?
>
>     % vmstat -m | grep '^ *vfscache'
>
> If the third number of its output is approaching or hitting the fourth, the
> chances are your kernel is running out of memory for the namecache, which
> was actually the case on my machines.

Hi, nagao-san.

I stopped the 310.locate weekly cron job and the slowdown does not occur so
often now.  The slowdown just occurred as follows on a dual Xeon machine
(Xeon 2.4GHz x 2, 2 gigabytes of RAM, 4.8-STABLE with the SMP and HTT
options).

% sysctl -a | grep vnodes
kern.maxvnodes: 14
kern.minvnodes: 33722
debug.numvnodes: 140025
debug.wantfreevnodes: 25
debug.freevnodes: 76
% vmstat -m | grep '^ *vfscache'
     vfscache 818445 52184K 72819K 102400K 197586220 0 64,128,256,512K

It seems that the third number is still well below the fourth.  I typed
'sysctl kern.maxvnodes=15' and the machine recovered.
--
[EMAIL PROTECTED]
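For anyone who wants to watch that ratio over time instead of eyeballing
single samples, a minimal sketch in the style of the tcsh monitoring script
posted later in this thread is below.  The awk field positions ($4 and $5 as
the "third" and "fourth" numbers) are an assumption based on the vmstat -m
line quoted above and may need adjusting on other releases.

#!/bin/tcsh
# Sample the vfscache malloc statistics every 10 seconds while a slowdown
# is in progress.  Field numbers are assumptions taken from the output
# format quoted above.
while ( 1 )
    date
    vmstat -m | grep '^ *vfscache' | awk '{print "in use:", $4, " limit:", $5}'
    sleep 10
end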
Re: system slowdown - vnode related
>> I still have the vnodes problem in 4.8-stable with /sys/kern/vfs_subr.c 1.249.2.30.
>>
>> (1) #1 machine (Celeron 466 with 256 megabytes of RAM)
>> % sysctl kern.maxvnodes
>> kern.maxvnodes: 17979
>> % sysctl vm.zone | grep VNODE
>> VNODE: 192, 0, 18004, 122, 18004
>
> This looks pretty normal to me for a quiescent system.

Hi, David-san.  Thank you for your mail.
I think the used count (18004) exceeds maxvnodes (17979), doesn't it?

> I would actually suggest raising maxvnodes if you have lots of little
> files.  Does the number of vnodes shoot up when 310.locate runs?

The value shown above is the value at the time of the slowdown caused by
310.locate.  The number of used vnodes stays low from boot until 310.locate
is invoked.

> Did you get a backtrace from the panics?

It's too hard for me.  Is there any way to do it?
--
[EMAIL PROTECTED]
Re: system slowdown - vnode related
On Mon, Jun 09, 2003, Masachika ISHIZUKA wrote:
> >> I still have the vnodes problem in 4.8-stable with /sys/kern/vfs_subr.c 1.249.2.30.
> >>
> >> (1) #1 machine (Celeron 466 with 256 megabytes of RAM)
> >> % sysctl kern.maxvnodes
> >> kern.maxvnodes: 17979
> >> % sysctl vm.zone | grep VNODE
> >> VNODE: 192, 0, 18004, 122, 18004
> >
> > This looks pretty normal to me for a quiescent system.
>
> Hi, David-san.  Thank you for your mail.
> I think the used count (18004) exceeds maxvnodes (17979), doesn't it?

Only by a little bit.  maxvnodes isn't a hard limit, since making it a hard
limit would lead to deadlocks.  Instead, the system garbage collects vnodes
to keep the number roughly in line with maxvnodes.  Judging by the numbers
above, it's doing a pretty good job, but that's probably because, from the
looks of it, you just booted the system.

The reason it might make sense to increase maxvnodes is that having vnlru
work overtime to keep your vnode count low may result in vnodes being freed
that are still needed, e.g. by the buffer cache.  This would cause the
slowdown you were mentioning.  (As a disclaimer, Tor Egge and Matt Dillon
know far more about this than I do.)

> > I would actually suggest raising maxvnodes if you have lots of little
> > files.  Does the number of vnodes shoot up when 310.locate runs?
>
> The value shown above is the value at the time of the slowdown caused by
> 310.locate.  The number of used vnodes stays low from boot until
> 310.locate is invoked.
>
> > Did you get a backtrace from the panics?
>
> It's too hard for me.  Is there any way to do it?

The panics might be unrelated to the number of vnodes, so it's important
that we have additional information.  See:

http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html
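For readers in the same position as Ishizuka-san, the handbook chapter above
covers kernel debugging in detail; a rough sketch of the usual 4.x crash-dump
route is below.  The device name, paths, and prompt shown are placeholders
and illustrations only, not values taken from this thread.

# Point crash dumps at a swap partition at least as large as RAM
# (example device name only) by adding a line to /etc/rc.conf:
#   dumpdev="/dev/ad0s1b"
#
# After the panic and reboot, savecore(8) writes the dump to /var/crash.
# Then, with a kernel built with debugging symbols:
% gdb -k kernel.debug /var/crash/vmcore.0
(kgdb) bt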
Re: system slowdown - vnode related
>>>> I still have the vnodes problem in 4.8-stable with /sys/kern/vfs_subr.c 1.249.2.30.
>>>> % sysctl kern.maxvnodes
>>>> kern.maxvnodes: 17979
>>>> % sysctl vm.zone | grep VNODE
>>>> VNODE: 192, 0, 18004, 122, 18004
>>>
>>> This looks pretty normal to me for a quiescent system.
>>
>> I think the used count (18004) exceeds maxvnodes (17979), doesn't it?
>
> Only by a little bit.  maxvnodes isn't a hard limit, since making it a
> hard limit would lead to deadlocks.  Instead, the system garbage collects
> vnodes to keep the number roughly in line with maxvnodes.  Judging by the
> numbers above, it's doing a pretty good job, but that's probably because,
> from the looks of it, you just booted the system.

Hi, David-san.  Thank you for your mail.  I understand now.

> The reason it might make sense to increase maxvnodes is that having vnlru
> work overtime to keep your vnode count low may result in vnodes being
> freed that are still needed, e.g. by the buffer cache.  This would cause
> the slowdown you were mentioning.

I will try increasing kern.maxvnodes when the machine slows down.  However,
I cannot reproduce the slowdown in an experimental environment yet.

>>> Did you get a backtrace from the panics?
>>
>> It's too hard for me.  Is there any way to do it?
>
> The panics might be unrelated to the number of vnodes, so it's important
> that we have additional information.  See:
>
> http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html

I'll try.  Thank you very much.
--
[EMAIL PROTECTED]
Re: system slowdown - vnode related
On Mon, 25 May 2003, Mike Harding wrote:

> I'm running a very recent RELENG-4 - but I had a suspicion that this was
> unionfs related, so I unmounted the /usr/ports union mounts under a jail
> in case this was causing the problem, and haven't seen the problem since.
> It's possible I accidentally reverted to 4.8 when I built a release, but I
> don't see how...

'K, I wouldn't touch anything less than 4.8-STABLE ... the last set of
vnode-related patches that I'm aware of were made *post* 4.8-RELEASE, which,
I believe, won't be included in RELENG-4 ...
Re: system slowdown - vnode related
On Mon, 26 May 2003, Mike Harding wrote:

> Er - are any changes made to RELENG_4_8 that aren't made to RELENG_4?  I
> thought it was the other way around - that 4_8 only got _some_ of the
> changes to RELENG_4...

Ack, my fault ... sorry, wasn't thinking :(  RELENG_4 is correct ... I
should have confirmed my settings before blathering on ...

One of the scripts I used extensively while debugging this ... a quite
simple one ... was:

#!/bin/tcsh
while ( 1 )
  echo `sysctl debug.numvnodes` - `sysctl debug.freevnodes` - `sysctl debug.vnlru_nowhere` - `ps auxl | grep vnlru | grep -v grep | awk '{print $20}'`
  sleep 10
end

which outputs this:

debug.numvnodes: 463421 - debug.freevnodes: 220349 - debug.vnlru_nowhere: 3 - vlruwt

I have my maxvnodes set to 512k right now ... now, when the server hung,
the output would look something like this (this would be with 'default'
vnodes):

debug.numvnodes: 199252 - debug.freevnodes: 23 - debug.vnlru_nowhere: 12 - vlrup

with the critical bit being the vlruwt -> vlrup change ...

With unionfs, you are using two vnodes per file instead of one in non-union
mode, which is why I went to 512k vs the default of ~256k vnodes ... it
doesn't *fix* the problem, it only reduces its occurrence ...
Re: system slowdown - vnode related
Ack, only 512Meg of memory? :(  'K, now you are beyond me on this one ...
I'm running 4Gig on our server, with 2Gig allocated to kernel memory ... the
only thing I can suggest is to try slowly incrementing your maxvnodes and
see if that helps, but I'm not sure where your upper threshold is with
512Meg of RAM ...

On Mon, 26 May 2003, Mike Harding wrote:

> On my system, with 512 meg of memory, I have the following (default)
> vnode-related values:
>
> bash-2.05b$ sysctl -a | grep vnode
> kern.maxvnodes: 36079
> kern.minvnodes: 9019
> vm.stats.vm.v_vnodein: 140817
> vm.stats.vm.v_vnodeout: 0
> vm.stats.vm.v_vnodepgsin: 543264
> vm.stats.vm.v_vnodepgsout: 0
> debug.sizeof.vnode: 168
> debug.numvnodes: 33711
> debug.wantfreevnodes: 25
> debug.freevnodes: 5823
>
> ...is this really low?  Is this something that should go into tuning(7)?
> I searched on Google and found basically nothing about adjusting vnodes -
> although I am admittedly flogging the system - I have leafnode+ running, a
> mirrored CVS tree, an experimental CVS tree, a mount_union'd /usr/ports in
> a jail, and so on.  Damn those $1 a gigabyte drives!
>
> On Mon, 2003-05-26 at 09:12, Marc G. Fournier wrote:
> > On Mon, 26 May 2003, Mike Harding wrote:
> >
> > > Er - are any changes made to RELENG_4_8 that aren't made to RELENG_4?
> > > I thought it was the other way around - that 4_8 only got _some_ of
> > > the changes to RELENG_4...
> >
> > Ack, my fault ... sorry, wasn't thinking :(  RELENG_4 is correct ... I
> > should have confirmed my settings before blathering on ...
> >
> > One of the scripts I used extensively while debugging this ... a quite
> > simple one ... was:
> >
> > #!/bin/tcsh
> > while ( 1 )
> >   echo `sysctl debug.numvnodes` - `sysctl debug.freevnodes` - `sysctl debug.vnlru_nowhere` - `ps auxl | grep vnlru | grep -v grep | awk '{print $20}'`
> >   sleep 10
> > end
> >
> > which outputs this:
> >
> > debug.numvnodes: 463421 - debug.freevnodes: 220349 - debug.vnlru_nowhere: 3 - vlruwt
> >
> > I have my maxvnodes set to 512k right now ... now, when the server hung,
> > the output would look something like this (this would be with 'default'
> > vnodes):
> >
> > debug.numvnodes: 199252 - debug.freevnodes: 23 - debug.vnlru_nowhere: 12 - vlrup
> >
> > with the critical bit being the vlruwt -> vlrup change ...
> >
> > With unionfs, you are using two vnodes per file instead of one in
> > non-union mode, which is why I went to 512k vs the default of ~256k
> > vnodes ... it doesn't *fix* the problem, it only reduces its
> > occurrence ...
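To make "slowly incrementing" concrete, a hedged sketch of one step is below,
starting from the default value Mike quoted; the 50000 figure is only a
placeholder step, not a value anyone in this thread tested, and it is worth
rechecking debug.numvnodes and debug.freevnodes after each bump before going
further.

% sysctl kern.maxvnodes
kern.maxvnodes: 36079
% sysctl kern.maxvnodes=50000        # placeholder step, raise gradually
kern.maxvnodes: 36079 -> 50000
# To keep the setting across reboots, add a line to /etc/sysctl.conf:
#   kern.maxvnodes=50000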
Re: system slowdown - vnode related
    I'm a little confused.  What state is the vnlru kernel thread in?  It
    sounds like vnlru must be stuck.

    Note that you can gdb the live kernel and get a stack backtrace of any
    stuck process.

        gdb -k /kernel.debug /dev/mem    (or whatever)
        proc N                           (e.g. vnlru's pid)
        back

    All the processes stuck in 'inode' are likely associated with the
    problem, but if that is what is causing vnlru to be stuck I would expect
    vnlru itself to be stuck in 'inode'.  unionfs is probably responsible.
    I would not be surprised at all if unionfs is causing a deadlock
    somewhere which is creating a chain of processes stuck in 'inode' which
    is in turn causing vnlru to get stuck.

                                        -Matt
                                        Matthew Dillon
                                        [EMAIL PROTECTED]

:On Mon, 26 May 2003, Mike Harding wrote:
:
:> Er - are any changes made to RELENG_4_8 that aren't made to RELENG_4?  I
:> thought it was the other way around - that 4_8 only got _some_ of the
:> changes to RELENG_4...
:
:Ack, my fault ... sorry, wasn't thinking :(  RELENG_4 is correct ... I
:should have confirmed my settings before blathering on ...
:
:One of the scripts I used extensively while debugging this ... a quite
:simple one ... was:
:
:#!/bin/tcsh
:while ( 1 )
:  echo `sysctl debug.numvnodes` - `sysctl debug.freevnodes` - `sysctl debug.vnlru_nowhere` - `ps auxl | grep vnlru | grep -v grep | awk '{print $20}'`
:  sleep 10
:end
:
:which outputs this:
:
:debug.numvnodes: 463421 - debug.freevnodes: 220349 - debug.vnlru_nowhere: 3 - vlruwt
:
:I have my maxvnodes set to 512k right now ... now, when the server hung,
:the output would look something like this (this would be with 'default'
:vnodes):
:
:debug.numvnodes: 199252 - debug.freevnodes: 23 - debug.vnlru_nowhere: 12 - vlrup
:
:with the critical bit being the vlruwt -> vlrup change ...
:
:With unionfs, you are using two vnodes per file instead of one in
:non-union mode, which is why I went to 512k vs the default of ~256k
:vnodes ... it doesn't *fix* the problem, it only reduces its occurrence ...
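As an illustration of the live-kernel procedure Matt describes: the pid shown
is made up, /kernel.debug must correspond to the kernel actually running, and
the exact prompt and output will vary.

% ps auxl | grep vnlru | grep -v grep    # note vnlru's pid and wait channel
% gdb -k /kernel.debug /dev/mem
(kgdb) proc 14                           # substitute vnlru's real pid here
(kgdb) back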
Re: system slowdown - vnode related
I'll try this if I can tickle the bug again.

I may have just run out of freevnodes - I only have about 1-2000 free right
now.  I was just surprised because I have never seen a reference to tuning
this sysctl.

- Mike H.

On Tue, 2003-05-27 at 11:09, Matthew Dillon wrote:
>     I'm a little confused.  What state is the vnlru kernel thread in?  It
>     sounds like vnlru must be stuck.
>
>     Note that you can gdb the live kernel and get a stack backtrace of any
>     stuck process.
>
>         gdb -k /kernel.debug /dev/mem    (or whatever)
>         proc N                           (e.g. vnlru's pid)
>         back
>
>     All the processes stuck in 'inode' are likely associated with the
>     problem, but if that is what is causing vnlru to be stuck I would
>     expect vnlru itself to be stuck in 'inode'.  unionfs is probably
>     responsible.  I would not be surprised at all if unionfs is causing a
>     deadlock somewhere which is creating a chain of processes stuck in
>     'inode' which is in turn causing vnlru to get stuck.
>
>                                         -Matt
>                                         Matthew Dillon
>                                         [EMAIL PROTECTED]
>
> :On Mon, 26 May 2003, Mike Harding wrote:
> :
> :> Er - are any changes made to RELENG_4_8 that aren't made to RELENG_4?
> :> I thought it was the other way around - that 4_8 only got _some_ of the
> :> changes to RELENG_4...
> :
> :Ack, my fault ... sorry, wasn't thinking :(  RELENG_4 is correct ... I
> :should have confirmed my settings before blathering on ...
> :
> :One of the scripts I used extensively while debugging this ... a quite
> :simple one ... was:
> :
> :#!/bin/tcsh
> :while ( 1 )
> :  echo `sysctl debug.numvnodes` - `sysctl debug.freevnodes` - `sysctl debug.vnlru_nowhere` - `ps auxl | grep vnlru | grep -v grep | awk '{print $20}'`
> :  sleep 10
> :end
> :
> :which outputs this:
> :
> :debug.numvnodes: 463421 - debug.freevnodes: 220349 - debug.vnlru_nowhere: 3 - vlruwt
> :
> :I have my maxvnodes set to 512k right now ... now, when the server hung,
> :the output would look something like this (this would be with 'default'
> :vnodes):
> :
> :debug.numvnodes: 199252 - debug.freevnodes: 23 - debug.vnlru_nowhere: 12 - vlrup
> :
> :with the critical bit being the vlruwt -> vlrup change ...
> :
> :With unionfs, you are using two vnodes per file instead of one in
> :non-union mode, which is why I went to 512k vs the default of ~256k
> :vnodes ... it doesn't *fix* the problem, it only reduces its
> :occurrence ...
Re: system slowdown - vnode related
:I'll try this if I can tickle the bug again.
:
:I may have just run out of freevnodes - I only have about 1-2000 free
:right now.  I was just surprised because I have never seen a reference
:to tuning this sysctl.
:
:- Mike H.

    The vnode subsystem is *VERY* sensitive to running out of KVM, meaning
    that setting too high a kern.maxvnodes value is virtually guaranteed to
    lock up the system under certain circumstances.  If you can reliably
    reproduce the lockup with maxvnodes set fairly low (e.g. less than
    100,000) then it ought to be easier to track the deadlock down.

    Historically speaking, systems did not have enough physical memory to
    actually run out of vnodes... they would run out of physical memory
    first, which would cause VM pages to be reused and their underlying
    vnodes to be deallocated when the last page went away.  Hence the amount
    of KVM being used to manage vnodes (vnode and inode structures) was kept
    under control.  But today's Intel systems have far more physical memory
    relative to available KVM, and it is possible for the vnode management
    to run out of KVM before the VM system runs out of physical memory.

    The vnlru kernel thread is an attempt to control this problem, but it
    has had only mixed success in complex vnode management situations like
    unionfs, where an operation on a vnode may cause accesses to additional
    underlying vnodes.  In other words, vnlru can potentially shoot itself
    in the foot in such situations while trying to flush out vnodes.

                                        -Matt
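To put a rough number on that point, take the debug.sizeof.vnode value of
168 bytes quoted earlier in this thread together with Marc's 512k maxvnodes
setting; the in-core inode, namecache, and buffer overhead that accompany
each vnode are ignored here, so the real KVM cost is considerably higher.

% echo "524288 * 168" | bc      # 512k vnodes x sizeof(struct vnode)
88080384                        # roughly 84 MB of wired kernel memory

So the vnode structures alone already tie up on the order of a hundred
megabytes of the kernel's limited address space at that setting, which is
why raising maxvnodes aggressively without also enlarging kernel memory (as
Marc did with 2Gig allocated to the kernel) can make the lockup more likely
rather than less.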