On Sat, 18 Dec 2004, Sonny Rao wrote:
On Fri, Dec 17, 2004 at 09:08:45PM -0600, Jon Nelson wrote:
On Fri, 17 Dec 2004, Sonny Rao wrote:
On Thu, Dec 16, 2004 at 04:17:53PM -0600, John Goerzen wrote:
<snip>
I noticed my problems during runs of updatedb (for locate). That process basically reads a bunch of directories and probably stats all the files, but never actually opens them. (I think it runs find.)
I wonder if this has something to do with directories or inodes.
I haven't lately noticed it with updatedb runs, but I have noticed it periodically when copying, or building ISO images or things like that.
Yes, this again sounds like the problem someone reported a week or two ago, where he had a directory with 500k files and was running find on it. The symptoms are that kswapd runs a lot and system responsiveness becomes very poor. Note that it's entirely possible for each of your directories to hold very few files; what matters is the total number of inodes in memory, not the contents of any particular directory.
You should look at the number of in-core inodes at this point; that will tell you whether this is the problem or not. Do this with "slabtop", or by looking at /proc/slabinfo and checking the number of slabs used for dentries and inodes.
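For example (assuming your procps is new enough to ship slabtop; the /proc interface works either way):

  slabtop --sort=c                            # sort caches by total size
  grep -E 'dentry|inode|jfs' /proc/slabinfo   # raw object/slab counts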
If this is indeed the problem, one stopgap measure on kernels newer than 2.6.7 is to raise the value of /proc/sys/vm/vfs_cache_pressure. I would try values of different orders of magnitude, e.g. 1000, 10000, 100000. This should make kswapd more aggressive about reclaiming slab objects, which means it will hold the dcache for less time overall.
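Concretely, something like this (as root; 10000 is just one of the magnitudes to try):

  echo 10000 > /proc/sys/vm/vfs_cache_pressure
  # or equivalently:
  sysctl -w vm.vfs_cache_pressure=10000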
Well, I'm not really sure what I'm looking for, but I did this:
fstest -n 50 -f 10000 -s 2048
Then, in another terminal, I ran this:
find . | wc -l
and this is from slabtop, sorted on cache size:
  OBJS  ACTIVE  USE  OBJ SIZE  SLABS  OBJ/SLAB  CACHE SIZE  NAME
 49396   49391  99%    0.81K   12349         4      49396K  jfs_ip
 46452   46228  99%    0.28K    3318        14      13272K  radix_tree_node
 13244   10512  79%    0.14K     473        28       1892K  dentry_cache
  9375    7382  78%    0.05K     125        75        500K  buffer_head
At this point the disk light has been completely inactive for the last 5 minutes or more (except for a very occasional blip).
It looks like I cannot kill fstest or find (kill -9 has no effect); both processes are in 'D+' state (uninterruptible sleep).
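If it helps, I can check where they're blocked with something like:

  cat /proc/$(pidof fstest)/wchan                         # kernel symbol the process is sleeping in
  ps -eo pid,stat,wchan:30,comm | grep -E 'fstest|find'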
I got fstest (I'm pretty sure mine is old) at:
http://samba.org/ftp/unpacked/junkcode/
What else can I do?
Hmm, okay, so that means you have 49396 JFS inodes in memory, which is not an extremely large number IMO. The 7th column (CACHE SIZE) shows that you are using almost 50 MB of kernel memory for these inodes, which isn't a lot considering this machine has 1GB of RAM. So this certainly doesn't seem like the same issue as the other fellow's.
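(For reference, that CACHE SIZE figure is just slabs times slab size; assuming one 4 KB page per slab here:

  12349 slabs * 4 KB/slab = 49396 KB, i.e. just under 50 MB.)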
Actually, the test above was on my laptop, which has only 384 MB of RAM and 512 MB of swap. The *first time I ran the test* I saw the number of inodes go above 100000 (100K), but after that it hovered around 47K, always consuming about 50 MB. I was not able to reproduce this with reiserfs, xfs, or ext3 (all mounted noatime).
So now I have two more questions: is JFS compiled with stats turned on? And does this happen on other filesystems besides JFS?
I have no idea. How do I find out? This is just the stock SuSE 9.2 kernel. (see above re: other filesystems).
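I'm guessing something like this would show it, assuming SuSE ships the kernel config in /boot as usual:

  grep JFS /boot/config-$(uname -r)

If CONFIG_JFS_STATISTICS=y shows up, stats should be compiled in.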
If it is compiled with stats turned on, the contents of the files in /proc/fs/jfs/ before and after you start the tests would be interesting.
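A simple way to capture both snapshots (I'm not sure which files your kernel puts under /proc/fs/jfs, so just grab everything):

  for f in /proc/fs/jfs/*; do echo "== $f"; cat $f; done > jfs-before.txt
  # ... run fstest and find ...
  for f in /proc/fs/jfs/*; do echo "== $f"; cat $f; done > jfs-after.txt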
Well, I have /proc/fs/jfs and there are 4 files there. I had to reboot the laptop this morning (even though "load" was about 50, the machine felt totally normal; however, the locked filesystem was interfering with normal operations, and 'sync' of course hung forever).
I'll let you know in a followup what /proc/fs/jfs says before and after testing. fstest is very easy to run, so why not give it a shot yourself? Follow my earlier instructions and you, too, may be able to reproduce the problem locally.
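For the impatient, the whole recipe is roughly this (the device name is hypothetical, and mkfs.jfs will of course erase whatever is on it):

  mkfs.jfs -q /dev/hdb1        # hypothetical scratch partition -- this wipes it!
  mount /dev/hdb1 /mnt/test
  cd /mnt/test
  fstest -n 50 -f 10000 -s 2048 &
  find . | wc -l               # while fstest churns in the background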
--
Jon Nelson <[EMAIL PROTECTED]>
_______________________________________________
Jfs-discussion mailing list
[EMAIL PROTECTED]
http://www-124.ibm.com/developerworks/oss/mailman/listinfo/jfs-discussion
