Just for the record, my words were based on the limited information
I got on this issue. It is hard to make any determination based upon
two vmstat outputs. I typically prefer that users dump /proc/meminfo
and /proc/slabinfo every few minutes.
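
A minimal sketch of that kind of periodic dump, in Python. The function names, the output directory, and the five-minute interval are assumptions for illustration, not something prescribed in this thread. Note that /proc/slabinfo is usually readable by root only.

```python
import time
from pathlib import Path

def snapshot(out_dir, sources=("/proc/meminfo", "/proc/slabinfo"), stamp=None):
    """Copy each source file into out_dir under a timestamped name."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    written = []
    for src in sources:
        src = Path(src)
        try:
            text = src.read_text()
        except OSError as exc:
            # Record the failure instead of aborting the whole snapshot.
            text = f"unreadable: {exc}\n"
        dest = out / f"{src.name}.{stamp}"
        dest.write_text(text)
        written.append(dest)
    return written

def collect(out_dir, interval_s=300, count=12):
    """Take `count` snapshots, sleeping interval_s seconds between them."""
    for i in range(count):
        snapshot(out_dir)
        if i + 1 < count:
            time.sleep(interval_s)
```

Calling collect("/var/tmp/memstats") (a hypothetical path) while the workload runs would leave an hour of snapshots that can be compared afterwards.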
Jeff Mahoney wrote:
John Lange wrote:
On Wed, 2007-03-14 at 17:42 -0400, Jeff Mahoney wrote:
Hi John -
I'm taking a look at the memory consumption issue you reported, and I
just can't seem to reproduce it in the manner you've described. I'm
running our CVS kernel, which at this point is really the same thing as
the KOTD with two OCFS2 DLM fixes I added today that should be entirely
unrelated to this bug.
I created a file system with about 890,000 files, rebooted with mem=512M
and did a find -exec stat {} \; on the file system. I can see it sucking
down all the memory as you described, but it's not OOM killing or even
going into swap.
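
For anyone who wants to repeat the metadata walk without find, here is a rough Python equivalent. It is only a sketch: the function name is made up for illustration, and unlike find it does not stat the starting directory itself.

```python
import os

def stat_tree(root):
    """lstat every entry below root; return (entries_statted, errors)."""
    statted = errors = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                # lstat, like stat(1) without -L, does not follow symlinks.
                os.lstat(os.path.join(dirpath, name))
                statted += 1
            except OSError:
                errors += 1
    return statted, errors
```

Running stat_tree on the mount point pulls every inode through the inode cache, which is the part of the workload that matters for this report.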
Agreed, it does not go into swap. It would not make any sense for the
kernel to swap cache pages since it can just flush them (if they are
clean) or write them to disk (if they are dirty).
The question is, why is it not flushing the cache unless you force it
to do so?
I only ran the test environment just long enough to see that it was
exhibiting similar behavior. Even so, I did see processes killed,
though I didn't see them in the logs. However, that might be because
syslog was one of the processes killed.
My suspicion is disk activity alone might not cause oom-killer since the
kernel doesn't consider cache pages as normal memory usage. Once it
fills up the ram it might be forcing a flush on some pages just to keep
its head above water.
However, if you start firing up other applications and consuming ram in
the normal way that might trigger it. I really can't say because I've
never done that much extensive testing.
In any case, having the file system use up all the RAM that way is not
normal.
If there is no other demand for memory, why shouldn't caches stay in
memory? That's the entire point of caches. If we start OOM killing
processes due to the caches taking all the memory, that's absolutely a bug.
Here is what Sunil Mushran from Oracle had to say about the issue:
Well, kswapd is supposed to flush the caches. As in, the vm
controls the lifetime of the inodes in the inode_cache not ocfs2.
All ocfs2 can do is free the memory associated with the inode when
asked to. And it does that when you manually flush the cache. The
question is why the vm is not doing it on its own.
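
The thread does not say how the cache was flushed manually. One common way on Linux 2.6.16 and later is to write to /proc/sys/vm/drop_caches (1 = page cache, 2 = dentries and inodes, 3 = both). A sketch assuming that interface is what was used; the helper name and its parameters are hypothetical, and it needs root:

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"

def drop_caches(mode=2, path=DROP_CACHES, sync_first=True):
    """Ask the kernel to drop clean caches; mode 2 targets dentries and inodes."""
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2, or 3")
    if sync_first:
        # Only clean objects are dropped, so write dirty data out first.
        os.sync()
    with open(path, "w") as f:
        f.write(f"{mode}\n")
```

If memory comes back after drop_caches(2) but never on its own under pressure, that points at reclaim rather than at a leak in the file system.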
So he is saying there is a problem with the way the kernel is handling
virtual memory. Why it only happens with ocfs2 on SUSE is unknown.
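
To check whether the inode cache really is what holds the memory, the slab caches in a /proc/slabinfo dump can be ranked by approximate size. A sketch assuming the slabinfo 2.x column layout (name, active_objs, num_objs, objsize, ...); the function name is made up, and the size estimate ignores per-slab overhead:

```python
def top_slabs(text, n=5):
    """Return the n largest slab caches as (name, approx_bytes) from slabinfo text."""
    rows = []
    for line in text.splitlines():
        # Skip the version line and the column-header comment.
        if line.startswith("#") or line.startswith("slabinfo"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            num_objs, objsize = int(fields[2]), int(fields[3])
        except ValueError:
            continue
        rows.append((fields[0], num_objs * objsize))
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows[:n]
```

Feeding it the contents of successive slabinfo dumps shows which cache grows during the find run and whether it shrinks again afterwards.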
Now that's the interesting part. Might you be willing to try a mainline
kernel with OCFS2 1.2? If this is a SUSE problem, I'd like to isolate it.
-Jeff
--
Jeff Mahoney
SUSE Labs
_______________________________________________
Ocfs2-users mailing list
[email protected]
http://oss.oracle.com/mailman/listinfo/ocfs2-users