That depends on how long it takes before you start seeing OOM kills.
Running it over the weekend should work.
Cline, Ernie wrote:
The latest kernel as well:
Linux cyber2.petersons.net 2.6.9-42.0.8.ELsmp #1 SMP Tue Jan 23 13:01:26 EST 2007 i686 i686 i386 GNU/Linux
For how long do you want that? I'll let it run over the weekend.
-----Original Message-----
From: Sunil Mushran [mailto:[EMAIL PROTECTED]
Sent: Friday, February 23, 2007 2:25 PM
To: Cline, Ernie
Cc: [email protected]
Subject: Re: [Ocfs2-users] OCFS 1.2.4 memory problems still?
Start monitoring /proc/slabinfo and /proc/meminfo. Dump them to a file
every 5-10 minutes.
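A minimal sketch of the kind of periodic dump suggested here, in case it
helps. It assumes a Python 3 interpreter is available on the node, which
may not be the case on a stock RHEL4 box, so treat it as an illustration
rather than a drop-in script. The log path /var/tmp/ocfs2-mem.log and the
function names snapshot and monitor are made up for this example; only
the two /proc files come from the message above.

```python
import time
from datetime import datetime
from pathlib import Path

# The two files named in the message above.
SOURCES = ("/proc/slabinfo", "/proc/meminfo")


def snapshot(sources, log_path):
    """Append a timestamped copy of each source file to log_path."""
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a") as log:
        for src in sources:
            log.write(f"===== {stamp} {src} =====\n")
            try:
                log.write(Path(src).read_text())
            except OSError as err:
                # /proc/slabinfo is often readable only by root.
                log.write(f"(could not read {src}: {err})\n")
            log.write("\n")


def monitor(sources=SOURCES, log_path="/var/tmp/ocfs2-mem.log",
            interval_s=300, iterations=None):
    """Take a snapshot every interval_s seconds.

    The log path is a hypothetical default. With iterations=None the
    loop runs until the process is stopped.
    """
    count = 0
    while iterations is None or count < iterations:
        snapshot(sources, log_path)
        count += 1
        if iterations is None or count < iterations:
            time.sleep(interval_s)
```

Calling monitor() as root with the defaults would append one snapshot
every 5 minutes; interval_s=600 gives the 10-minute end of the range.
Appending to a single file keeps the samples in time order, so growth in
a particular slab cache can be followed by comparing successive blocks.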
Which version of the RHEL4 kernel are you on (uname -a)?
Cline, Ernie wrote:
I have a 2-node cluster of HP DL380G4s. These machines are attached via
SCSI to an external HP disk enclosure. They run 32-bit RH AS 4.0 and
OCFS 1.2.4, the latest release. They were upgraded from 1.2.3 only a
few days after 1.2.4 was released. I had reported on the mailing list
that my developers were happy, and things seemed faster. However, twice
in that time, the cluster has gone down due to the kernel OOM killer
killing processes, and then ASR kicks in and eventually reboots the
box.
I am also starting to notice some directory corruption, and errors like
this in /var/log/messages:
Feb 18 04:14:37 cyber1 kernel: (23693,1):ocfs2_check_dir_entry:1703
ERROR: bad entry in directory #101726961: rec_len % 4 != 0 - offset=0,
inode=3484598105688391, rec_len=18, name_len=128
Sometimes I can't delete a directory; it tells me it's not empty,
even though it is.
What could this be? I was hoping that OCFS 1.2.4 would have fixed the
out of memory problems, but it looks like I still run into it. What
information can I provide that will help?
_______________________________________________
Ocfs2-users mailing list
[email protected]
http://oss.oracle.com/mailman/listinfo/ocfs2-users