I'm looking for some help getting apache to run reliably. Apache 1.3.9 with
mod_perl 1.21 and Perl 5.005_03 is running on a dual P500 with 1 Gb of RAM
running Redhat 6.1. We run about five sites off the box, most of which are
fairly high traffic and make heavy use of CGI. MySQL 3.22.25 is used with
Apache::DBI.

The major problem seems to be a memory leak of some sort, identical to that
described in the "memory leak in mod_perl" thread on this list from October
1997 and the "httpd, mod_perl and memory consumption (long)" thread from
July 1997.

The server runs normally for several hours, then suddenly an httpd process
starts growing exponentially, the swapfile usage grows massively and the
server starts to become sluggish (I assume due to disk thrashing caused by
the heavy swap usage). Usually when this started to happen I would log in
and use apachectl stop to shutdown the server, then type 'killall httpd'
several times till the processes finally died off, and then use apachectl
start to restart apache. If I was not around or did not catch this, the
server would eventually become unresponsive and lock up, requiring a manual
reboot by the datacentre staff. Messages such as "Out of memory" and
"Callback called exit" would appear in the error log as the server spiralled
down and MySQL would start to have trouble running.

To combat this, I created a script to monitor load and swapfile usage, and
restart apache as described above if load was above 7 and swapfile usage
above 150Mb. This script has kept the server online: uptime is now about 22
days (previously no more than a day), and no more "Out of memory" messages
are appearing, but the script is still being triggered several times a day,
so the situation is not ideal.

I have tried adding:

    use Carp ();

    sub UNIVERSAL::AUTOLOAD {
        my $class = shift;
        Carp::cluck "$class can't \$UNIVERSAL::AUTOLOAD!\n";
    }

as recommended by the mod_perl developer's guide. This flooded the error log
with lines like the following, printed roughly once a second:

---------
Apache=SCALAR(0x830937c) can't $UNIVERSAL::AUTOLOAD!
Apache=SCALAR(0x8309364) can't $UNIVERSAL::AUTOLOAD!
DBI::DBI_tie=HASH(0x82dd16c) can't $UNIVERSAL::AUTOLOAD!
IO::Handle=IO(0x820aabc) can't $UNIVERSAL::AUTOLOAD!
DBI::DBI_tie=HASH(0x82dd16c) can't $UNIVERSAL::AUTOLOAD!
IO::Handle=IO(0x820aabc) can't $UNIVERSAL::AUTOLOAD!
DBI::DBI_tie=HASH(0x82dd16c) can't $UNIVERSAL::AUTOLOAD!
IO::Handle=IO(0x820aabc) can't $UNIVERSAL::AUTOLOAD!
DBI::DBI_tie=HASH(0x82dd16c) can't $UNIVERSAL::AUTOLOAD!
IO::Handle=IO(0x820aabc) can't $UNIVERSAL::AUTOLOAD!
----------

I've pretty much exhausted the ways I can think of to trace this problem.
I've tried to rule out memory leaks in our own code by moving some scripts
from mod_perl back to mod_cgi, and I've tried tweaking MaxRequestsPerChild,
both without any success.
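For reference, the relevant prefork settings currently look roughly like
this (an excerpt from memory; the MaxRequestsPerChild value shown is just
one of several I've tried):

```apache
# httpd.conf (excerpt, values approximate)
MaxClients          150
# Recycle each child after a fixed number of requests so that a slow
# per-child leak cannot grow without bound.
MaxRequestsPerChild 500
```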

One thing mentioned in a previous thread was that calling 'exit' can
confuse Perl, and exit() is used fairly heavily in our scripts, since most
were converted to mod_perl from standard CGIs. I'd prefer not to remove
these calls, though, as the structure of the scripts relies on some form of
exit statement. Is there an alternative to exit()?

I've also had a look at some of the patches to Apache.pm and Apache.xs
suggested in the previous threads, and these seem to have been incorporated
into mod_perl 1.21.

Are there any other solutions I could try to this problem? Does anyone know
what might be causing this?

The second problem occurs when loading pages, usually CGI, though I think it
has also happened on some static pages: IE5 reports what it describes as
"Server not found or DNS error". Originally I thought this was the server
hitting MaxClients (150), since it usually occurs during massive surges of
hits, and /server-status usually shows 150 httpd processes spawned. However,
I recently increased MaxClients to 200 and the error has continued, even
though /server-status never shows more than about 170 processes. I have not
ruled out DNS server trouble or backbone problems (we've had a few routing
issues recently that slowed things down, but nothing that actually cut off
traffic), but I am at a loss as to what else could be causing this, so I
thought I'd ask while I'm on the subject of server problems :)

Thanks in advance,
--
James Furness <[EMAIL PROTECTED]>
ICQ #:  4663650
