Warren mentioned an issue familiar to me when he brought up the trouble he was having with a server locking up. He and Ray have concluded that it is a hardware problem. I have experienced something similar, and I am wondering what conditions can lead to a server lockup with no hints in the logs, and why it is definitely a hardware problem.

In my scenario, I have had a server lock up after about 5 days of hard use. It has happened twice. Both times I was using Red Hat 7.2's enterprise kernel (2.4.9-34enterprise). I blamed it on a default kernel setting that I did not understand, so I changed to the stock 2.4.9-34smp kernel with Red Hat 7.2. After about 30 days, the same lockup. By lockup I mean that both remote and local terminal sessions are frozen. Pressing ctrl + alt + del will not reboot. My only hint is a series of "failed to set personality on (some pid #)" messages on the screen. An ugly power down is the only "fix."

Upon reboot, there are no hints in the logs. That is to say, there are no hints in the /var/log directory. Perhaps I could look somewhere else. As far as the logs and server are concerned, everything is just hunky-dory. Here is what I wonder:

What can cause this? Is the machine that is locking up on Warren and Ray staying up for as many days as mine? Can hardware problems take 30 days to manifest themselves?

I have been told that /proc/sys/fs/file-max must be set high enough to handle one's active files. If this number is reached, does a server lock? Is there a way to check how many files are open?
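
For what it is worth, here is a minimal sketch of one way the open-file count might be checked, assuming a Linux box that exposes /proc/sys/fs/file-nr and has a reasonably modern Python 3 installed (neither is confirmed for the server above). The function name parse_file_nr is made up for illustration. The file holds three numbers: the first is the count of allocated file handles and the third is the file-max limit. The meaning of the middle number differs between kernel versions, so the sketch reports it without interpreting it.

```python
# Sketch only: compare allocated file handles against the file-max limit.
# Assumes /proc/sys/fs/file-nr exists and holds three whitespace-separated integers.

def parse_file_nr(text):
    """Return (allocated, middle, maximum) parsed from file-nr contents.

    'middle' is left uninterpreted: its meaning depends on the kernel version.
    """
    fields = text.split()
    if len(fields) != 3:
        raise ValueError("expected three fields, got %d" % len(fields))
    allocated, middle, maximum = (int(f) for f in fields)
    return allocated, middle, maximum


if __name__ == "__main__":
    with open("/proc/sys/fs/file-nr") as fh:
        allocated, middle, maximum = parse_file_nr(fh.read())
    print("allocated handles:", allocated)
    print("middle field (kernel dependent):", middle)
    print("file-max limit:", maximum)
    if maximum:
        print("allocated / limit: %.1f%%" % (100.0 * allocated / maximum))
```

Running something like this from cron every few minutes and appending the output to a file would at least show whether the count was climbing toward the limit in the hours before a lockup.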

Is there another software or kernel setting that can lead to a lockup, say, max inodes or something?

If you have any suggestions, insights, or experiences that you can share, I would be most grateful.

scott
