Re: Memory leak/server crashes

2000-01-26 Thread Doug MacEachern

there are hints in the SUPPORT doc on how to debug such problems.  there
were also several "hanging process" threads in the past weeks with more
tips; search the archives for the keywords gdb, .gdbinit, curinfo.
if you can get more insight from those tips, we can help more.

On Sun, 9 Jan 2000, James Furness wrote:

 I'm looking for some help getting apache to run reliably. Apache 1.3.9 with
 mod_perl 1.21 and Perl 5.005_03 is running on a dual P500 with 1 Gb of RAM
 running Redhat 6.1. We run about 5 sites off the box, most of which are
 fairly high traffic, and use a lot of CGI. MySQL 3.22.25 is used with
 Apache::DBI.
 
 [snip]
 



Re: Memory leak/server crashes

2000-01-11 Thread James Furness

 Why reinvent the wheel? I wrote Apache::VMonitor (grab it from CPAN), which
 does all this and more (all but tail -f). I use it all the time; it saves
 me a lot of time, since I don't have to telnet!

Ok - I will try to look into that when I get time.

  2)  Open up and hack Apache::SizeLimit and have it do a stack dump
  (Carp::croak) of what's going on... there may be some clue there.

I've done this (in a PerlRequire'd file):
--
# Use apache process size limitation
use Apache::SizeLimit;

$Apache::SizeLimit::MAX_PROCESS_SIZE = 9000;        # in kB, so ~9 MB
$Apache::SizeLimit::CHECK_EVERY_N_REQUESTS = 3;
--

and 'PerlCleanupHandler Apache::SizeLimit' in httpd.conf.

The server is still getting restarted by the uptime/swapfile monitor, so I'm
not sure if this is having an effect.

I'll look into opening up SizeLimit and doing a stack dump as soon as I get
time.
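
For reference, the hack can be tiny. A sketch of the idea (the internals of
Apache/SizeLimit.pm vary by version, so $size and $r below are assumptions
about the handler's variables; Carp::cluck logs a full stack trace without
dying):

use Carp ();

# inside the branch of Apache::SizeLimit's handler that has just
# decided this child is too big ($size and $r are assumed names):
if ($size > $Apache::SizeLimit::MAX_PROCESS_SIZE) {
    # stack trace plus the request that tripped the limit
    Carp::cluck "SizeLimit exceeded at ${size} kB serving " . $r->uri;
    $r->child_terminate;
}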

 Apache::GTopLimit is an advanced one :) (you are on Linux, right?) but
 Apache::SizeLimit is just fine.

I had some problems with GTop. I was trying to use Apache::VMonitor: I
downloaded and installed libgtop and the other packages needed (forget
which now) and tried to install VMonitor, but it failed on 'make test'; it
couldn't locate one of the packages I definitely installed, a graphics
manipulation one from memory, but I'm writing this e-mail offline so I
can't check :)

 3) try running in single-process mode ('httpd -X') with strace (probably
 not a good idea for a production server), but you can still strace all the
 processes into a log file

Well, I might be able to get a server running in single mode on a different
port and try that; it would be worth it for the information gained if I can
sort this problem out :)

 4) Apache::Leak ?

Ok, will look at that too.
--
James Furness [EMAIL PROTECTED]
ICQ #:  4663650



Re: Memory leak/server crashes

2000-01-11 Thread James Furness

  I'm looking for some help getting apache to run reliably. Apache 1.3.9
  with mod_perl 1.21 and Perl 5.005_03 is running on a dual P500 with 1 Gb
  of RAM running Redhat 6.1. We run about 5 sites off the box, most of
  which are fairly high traffic, and use a lot of CGI. MySQL 3.22.25 is
  used with Apache::DBI.
 
  The major problem seems to be a memory leak of some sort, identical to
  that described in the "memory leak in mod_perl" thread on this list from
  October 1997 and the "httpd, mod_perl and memory consumption (long)"
  thread from July 1997.

 [snip]

 I too have had this problem and haven't found a suitable solution.  In
 my case, though, I think the leaks are primarily do to old perl
 scripts being run under Registry and not bugs in mod_perl or perl.

Well, as I said, there is a lot of old code, which I guess could be the
culprit.

 The first thing to do is to try to discover if the problem is a
 mod_perl problem or a bad script problem.  If your server can handle
 it, you could try a binary search to find which (if any) scripts make
 the problem worse.  Basically pick half your registry scripts and use
 mod_cgi.  If leaks persist, you know that you have some problem
 scripts in the ones you didn't make mod_cgi.  If leaks stop, then you
 know the problem scripts are in the ones you made mod_cgi.  Repeat as
 necessary until you have narrowed it down to a single script.  This is
 tedious though and may not be practical.

Ok, I'll try this when I get time.

 Now, let's assume the problem is in fact in mod_perl or apache or perl
 itself.  In this case I'm not sure what the best way to proceed is.  I
 think mod_perl and perl have shown themselves to be pretty good about
 not leaking memory, as has apache.  IMO it's much, much more likely a
 problem concerning Registry and impolite scripts that are misbehaving
 and leaving parts of themselves around.

Yeah, I'm willing to believe my scripts could be a cause.

 Have you tried correlating the memory surges with any page accesses?
 That may help narrow down the culprit.

I'm not really sure how I could go about doing that; any suggestions? :)
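
One low-tech approach: log each child's size at the end of every request,
then line the numbers up against the access log afterwards. A sketch (the
My::SizeLog package name is made up, and the /proc parsing is
Linux-specific):

package My::SizeLog;
use strict;

# Linux-only: pull VmSize out of /proc/self/status, in kB
sub vm_size_kb {
    local *STATUS;
    open STATUS, "</proc/self/status" or return 0;
    my $kb = 0;
    while (<STATUS>) {
        if (/^VmSize:\s+(\d+)\s+kB/) { $kb = $1; last }
    }
    close STATUS;
    return $kb;
}

sub handler {
    my $r = shift;
    # one error_log line per request: pid, size, URI
    warn sprintf "pid=%d size=%dkB uri=%s\n", $$, vm_size_kb(), $r->uri;
    return 0;    # same value as Apache::Constants::OK
}

1;

Loaded with 'PerlModule My::SizeLog' and installed with 'PerlLogHandler
My::SizeLog', a sudden jump in size= between two consecutive lines points
at the script that was running when the process grew.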
--
James Furness [EMAIL PROTECTED]
ICQ #:  4663650



Memory leak/server crashes

2000-01-09 Thread James Furness

I'm looking for some help getting apache to run reliably. Apache 1.3.9 with
mod_perl 1.21 and Perl 5.005_03 is running on a dual P500 with 1 Gb of RAM
running Redhat 6.1. We run about 5 sites off the box, most of which are
fairly high traffic, and use a lot of CGI. MySQL 3.22.25 is used with
Apache::DBI.

The major problem seems to be a memory leak of some sort, identical to that
described in the "memory leak in mod_perl" thread on this list from October
1997 and the "httpd, mod_perl and memory consumption (long)" thread from
July 1997.

The server runs normally for several hours, then suddenly an httpd process
starts growing exponentially, the swapfile usage grows massively and the
server starts to become sluggish (I assume due to disk thrashing caused by
the heavy swap usage). Usually when this started to happen I would log in
and use apachectl stop to shut down the server, then type 'killall httpd'
several times till the processes finally died off, and then use apachectl
start to restart apache. If I was not around or did not catch this, the
server would eventually become unresponsive and lock up, requiring a manual
reboot by the datacentre staff. Messages such as "Out of memory" and
"Callback called exit" would appear in the error log as the server spiralled
down, and MySQL would start to have trouble running.

To combat this, I created a script to monitor load and swapfile usage, and
restart apache as described above if load was above 7 and swapfile usage
above 150 MB. This script has kept the server online and we now have an
uptime of something like 22 days (previously no more than 1 day). The
script is getting triggered several times a day and no more "Out of memory"
messages are appearing, but the situation is not ideal.

I have tried adding:

use Carp ();

sub UNIVERSAL::AUTOLOAD {
    my $class = shift;
    Carp::cluck "$class can't \$UNIVERSAL::AUTOLOAD!\n";
}


This was recommended by the developers guide. It flooded the error log,
with the text below printed roughly once a second:

--
Apache=SCALAR(0x830937c) can't $UNIVERSAL::AUTOLOAD!
Apache=SCALAR(0x8309364) can't $UNIVERSAL::AUTOLOAD!
DBI::DBI_tie=HASH(0x82dd16c) can't $UNIVERSAL::AUTOLOAD!
IO::Handle=IO(0x820aabc) can't $UNIVERSAL::AUTOLOAD!
DBI::DBI_tie=HASH(0x82dd16c) can't $UNIVERSAL::AUTOLOAD!
IO::Handle=IO(0x820aabc) can't $UNIVERSAL::AUTOLOAD!
DBI::DBI_tie=HASH(0x82dd16c) can't $UNIVERSAL::AUTOLOAD!
IO::Handle=IO(0x820aabc) can't $UNIVERSAL::AUTOLOAD!
DBI::DBI_tie=HASH(0x82dd16c) can't $UNIVERSAL::AUTOLOAD!
IO::Handle=IO(0x820aabc) can't $UNIVERSAL::AUTOLOAD!
--

I've pretty much exhausted the ways I can think of to trace this problem:
I've tried to eliminate memory leaks in code by removing some scripts from
mod_perl and running them under mod_cgi, and I've tried tweaking
MaxRequestsPerChild, both without any success.

One thing that was mentioned in a previous thread was that using 'exit'
could confuse perl. exit() is used fairly heavily in the scripts, since
most are converted to mod_perl from standard CGIs, but I'd prefer not to
have to remove these, since the structure of the scripts relies on some
form of exit statement. Is there some alternative to exit()?
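
For what it's worth, mod_perl ships an alternative: Apache::exit(). Under
Apache::Registry a bare exit() is normally remapped to it already, but the
explicit call makes the intent clear. A minimal sketch:

use Apache ();

print "Content-type: text/plain\n\n";
print "done\n";
Apache::exit();    # ends the script; the httpd child keeps serving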

I've also had a look at some of the patches to Apache.pm and Apache.xs
suggested in the previous threads, and these seem to have been incorporated
into mod_perl 1.21.

Are there any other solutions I could try to this problem? Does anyone know
what might be causing this?

The second problem I have is when loading pages, usually CGI (though I
think this has happened on some static pages): what IE5 describes as
"Server not found or DNS error" is experienced. Originally I thought this
was the server hitting MaxClients (150), since it usually occurs at the
same time as massive surges of hits, and /server-status usually shows 150
httpd processes have been spawned. However, I increased MaxClients to 200
recently and the error has continued to happen, even though /server-status
doesn't show any more than about 170 processes spawned. I have not ruled
out DNS server troubles or backbone problems (we've had a few routing
troubles recently that slowed things down, but not actually cut off traffic
or anything like that), but I am at a loss as to what else could be causing
this, so I thought I'd ask whilst I'm on the subject of server problems :)

Thanks in advance,
--
James Furness [EMAIL PROTECTED]
ICQ #:  4663650



Re: Memory leak/server crashes

2000-01-09 Thread Sean Chittenden

Try using Apache::SizeLimit as a way of controlling your
processes.  It sounds like a recursive page that performs infinite internal
requests.
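
If recursion via internal requests is the suspicion, one quick check (a
sketch built on standard mod_perl request methods) is to have suspect
scripts log whenever they run as a subrequest or internal redirect rather
than as the browser's original request:

use Apache ();

my $r = Apache->request;
unless ($r->is_initial_req) {
    # reached via an internal redirect or subrequest; $r->main is the
    # request that spawned us (undef for an internal redirect)
    warn "internal request: ",
         ($r->main ? $r->main->uri : '(redirect)'), " -> ", $r->uri, "\n";
}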

-- 
Sean Chittenden  [EMAIL PROTECTED]
fingerprint = 6988 8952 0030 D640 3138  C82F 0E9A DEF1 8F45 0466

My mother once said to me, "Elwood," (she always called me Elwood)
"Elwood, in this world you must be oh so smart or oh so pleasant."
For years I tried smart.  I recommend pleasant.
-- Elwood P. Dowd, "Harvey"

On Sun, 9 Jan 2000, James Furness wrote:

 Date: Sun, 9 Jan 2000 19:58:00 -
 From: James Furness [EMAIL PROTECTED]
 Reply-To: James Furness [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Subject: Memory leak/server crashes
 
 I'm looking for some help getting apache to run reliably. Apache 1.3.9 with
 mod_perl 1.21 and Perl 5.005_03 is running on a dual P500 with 1 Gb of RAM
 running Redhat 6.1. We run about 5 sites off the box, most of which are
 fairly high traffic, and use a lot of CGI. MySQL 3.22.25 is used with
 Apache::DBI.
 
 [snip]

Re: Memory leak/server crashes

2000-01-09 Thread James Furness

 Try using Apache::SizeLimit as a way of controlling your
 processes.  Sounds like a recursive page that performs infinite internal
 requests.

Ok, sounds like a good solution, but it still seems to me I should be
eliminating the problem at the source. Any ideas as to how I could narrow
down the location of whatever's causing the recursion?
--
James Furness [EMAIL PROTECTED]
ICQ #:  4663650



Re: Memory leak/server crashes

2000-01-09 Thread Chip Turner

"James Furness" [EMAIL PROTECTED] writes:

 I'm looking for some help getting apache to run reliably. Apache 1.3.9 with
 mod_perl 1.21 and Perl 5.005_03 is running on a dual P500 with 1 Gb of RAM
 running Redhat 6.1. We run about 5 sites off the box, most of which are
 fairly high traffic, and use a lot of CGI. MySQL 3.22.25 is used with
 Apache::DBI.
 
 The major problem seems to be a memory leak of some sort, identical to that
 described in the "memory leak in mod_perl" thread on this list from October
 1997 and the "httpd, mod_perl and memory consumption (long)" thread from
 July 1997.

[snip]

I too have had this problem and haven't found a suitable solution.  In
my case, though, I think the leaks are primarily due to old perl
scripts being run under Registry and not bugs in mod_perl or perl.

The first thing to do is to try to discover if the problem is a
mod_perl problem or a bad script problem.  If your server can handle
it, you could try a binary search to find which (if any) scripts make
the problem worse.  Basically pick half your registry scripts and use
mod_cgi.  If leaks persist, you know that you have some problem
scripts in the ones you didn't make mod_cgi.  If leaks stop, then you
know the problem scripts are in the ones you made mod_cgi.  Repeat as
necessary until you have narrowed it down to a single script.  This is
tedious though and may not be practical.
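
In httpd.conf terms the split can be as simple as pointing half the
scripts at a plain mod_cgi directory; a sketch with made-up paths:

# half the scripts stay under Apache::Registry ...
Alias /perl/ /home/httpd/perl/
<Location /perl>
    SetHandler  perl-script
    PerlHandler Apache::Registry
    Options     +ExecCGI
</Location>

# ... and the other half run under plain mod_cgi
ScriptAlias /cgi-bin/ /home/httpd/cgi-bin/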

Depending on how old the scripts are, I would check for unclosed
filehandles, excessive global variables, not using strict, etc.
perl-status is your friend (hopefully you have it enabled!): you can
see the namespaces of each httpd and look for any candidate
variables, file handles, functions, etc. that could be clogging memory.
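
perl-status is provided by Apache::Status, which ships with mod_perl; a
typical access-restricted setup looks something like this:

<Location /perl-status>
    SetHandler  perl-script
    PerlHandler Apache::Status
    order deny,allow
    deny  from all
    allow from 127.0.0.1
</Location>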

As a last resort, you could try Apache::SizeLimit to cap the size of
each httpd daemon.  This works reasonably well for us.  Something to
this effect:

use Apache::SizeLimit;

$Apache::SizeLimit::MAX_PROCESS_SIZE = 16384;       # in kB
$Apache::SizeLimit::CHECK_EVERY_N_REQUESTS = 3;

should help cap your processes at 16meg each.  Tweak as necessary.
Read the perldoc for Apache::SizeLimit for all the info you need.
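
One detail the snippet doesn't show: the check only fires if the module is
also installed as a handler in httpd.conf, conventionally in the cleanup
phase:

PerlCleanupHandler Apache::SizeLimit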

Now, let's assume the problem is in fact in mod_perl or apache or perl
itself.  In this case I'm not sure what the best way to proceed is.  I
think mod_perl and perl have shown themselves to be pretty good about
not leaking memory, as has apache.  IMO it's much, much more likely a
problem concerning Registry and impolite scripts that are misbehaving
and leaving parts of themselves around.

Have you tried correlating the memory surges with any page accesses?
That may help narrow down the culprit.

Good luck!

Chip

-- 
Chip Turner   [EMAIL PROTECTED]
  Programmer, ZFx, Inc.  www.zfx.com
  PGP key available at wwwkeys.us.pgp.net



Re: Memory leak/server crashes

2000-01-09 Thread Sean Chittenden

Yeah...  two things I'd do:

1)  Open two telnet sessions to the box.  One for top that is
monitoring processes for your web user (www typically) and is sorting by
memory usage w/ a 1 second refresh.  I'd change the size of the window and
make it pretty short so that the refreshes happen quicker, but that
depends on your connection speed.  The second telnet window is a window
that tails your access log (tail -f).  It sounds boring, but by watching
the two, you should have an idea as to when the problem happens.
2)  Open up and hack Apache::SizeLimit and have it do a stack dump
(Carp::croak) of what's going on... there may be some clue there.

Solution #1 will probably be your best bet...  Good luck (cool
site too!).  --SC

-- 
Sean Chittenden  [EMAIL PROTECTED]
fingerprint = 6988 8952 0030 D640 3138  C82F 0E9A DEF1 8F45 0466

The faster I go, the behinder I get.
-- Lewis Carroll

On Sun, 9 Jan 2000, James Furness wrote:

 Date: Sun, 9 Jan 2000 21:47:03 -
 From: James Furness [EMAIL PROTECTED]
 Reply-To: James Furness [EMAIL PROTECTED]
 To: Sean Chittenden [EMAIL PROTECTED]
 Cc: [EMAIL PROTECTED]
 Subject: Re: Memory leak/server crashes
 
  Try using Apache::SizeLimit as a way of controlling your
  processes.  Sounds like a recursive page that performs infinite internal
  requests.
 
 Ok, sounds like a good solution, but it still seems to me I should be
 eliminating the problem at the source. Any ideas as to how I could narrow
 down the location of whatever's causing the recursion?
 --
 James Furness [EMAIL PROTECTED]
 ICQ #:  4663650



Re: Memory leak/server crashes

2000-01-09 Thread Stas Bekman

On Sun, 9 Jan 2000, Sean Chittenden wrote:

   Yeah...  two things I'd do:
 
   1)  Open two telnet sessions to the box.  One for top that is
 monitoring processes for your web user (www typically) and is sorting by
 memory usage w/ a 1 second refresh.  I'd change the size of the window and
 make it pretty short so that the refreshes happen quicker, but that
 depends on your connection speed.  The second telnet window is a window
 that tails your access log (tail -f).  It sounds boring, but by watching
 the two, you should have an idea as to when the problem happens.

Why reinvent the wheel? I wrote Apache::VMonitor (grab it from CPAN), which
does all this and more (all but tail -f). I use it all the time; it saves
me a lot of time, since I don't have to telnet!
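
Setup follows the usual perl-script handler pattern, something like this
(the URL path is arbitrary):

<Location /system/vmonitor>
    SetHandler  perl-script
    PerlHandler Apache::VMonitor
</Location>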

   2)  Open up and hack Apache::SizeLimit and have it do a stack dump
 (Carp::croak) of what's going on... there may be some clue there.

Apache::GTopLimit is an advanced one :) (you are on Linux, right?) but
Apache::SizeLimit is just fine.

3) try running in single-process mode ('httpd -X') with strace (probably
not a good idea for a production server), but you can still strace all the
processes into a log file

4) Apache::Leak ?

___
Stas Bekman    mailto:[EMAIL PROTECTED]    http://www.stason.org/stas
Perl,CGI,Apache,Linux,Web,Java,PC    http://www.stason.org/stas/TULARC
perl.apache.org   modperl.sourcegarden.org   perlmonth.com   perl.org
single o- + single o-+ = singles heaven   http://www.singlesheaven.com