[EMAIL PROTECTED] (Doug MacEachern) wrote:
>there are hints in the SUPPORT doc on how to debug such problems.  there
>were also several "Hanging process" threads in the past weeks with more
>tips, search in the archives for keywords gdb, .gdbinit, curinfo.
>if you can get more insight from those tips, we can help more.

I have also seen (and reported via the Debian bug system) a problem,
which I think has been observed before, where HUP'ing the root httpd
causes it to reload every darn PerlModule directive and bloat
accordingly. With our server that's 3M per SIGHUP, which makes log
rotation somewhat painful: when you get 3 million hits in 8 hours, you
rotate those logs pretty fast!

It also appears that when you SIGHUP it WITHOUT any PerlModules, it
still leaks memory (even with PerlFreshRestart Off), reloading all the
standard implicit Apache stuff, viz (strace on the root PID):

open("/usr/lib/perl5/5.005/i386-linux/Apache/Constants.pm", O_RDONLY) = 5
brk(0x8223000)                          = 0x8223000
fstat(5, {st_mode=S_IFREG|0644, st_size=3791, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x402dd000
read(5, "package Apache::Constants;\n\nuse "..., 4096) = 3791
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
brk(0x8224000)                          = 0x8224000
brk(0x8225000)                          = 0x8225000
brk(0x8226000)                          = 0x8226000
close(5)                                = 0
munmap(0x402dd000, 4096)                = 0
brk(0x8227000)                          = 0x8227000
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
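
To put a number on the growth per SIGHUP, something along these lines
might do. It is only a sketch: the function names vsz_kb and
hup_growth_kb are made up for illustration, and it assumes a ps that
accepts "-o vsz= -p PID" (procps and the BSDs do).

```shell
#!/bin/bash

# Hypothetical helper: print the virtual size (in KB) of one PID.
# Returns non-zero if ps cannot find the process.
vsz_kb() {
  local out
  out=$(ps -o vsz= -p "$1") || return 1
  echo $(( out ))
}

# Hypothetical helper: SIGHUP a PID, wait a few seconds for it to
# settle, and print how many KB its virtual size changed by.
hup_growth_kb() {
  local pid=$1 settle=${2:-5} before after
  before=$(vsz_kb "$pid") || return 1
  kill -HUP "$pid" || return 1
  sleep "$settle"
  after=$(vsz_kb "$pid") || return 1
  echo $(( after - before ))
}
```

Called as hup_growth_kb followed by the root httpd PID (for example
the contents of the PidFile, wherever your config puts it), it prints
the change in KB, which should come out near 3000 for the 3M case
described above.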




>On Sun, 9 Jan 2000, James Furness wrote:
>> The server runs normally for several hours, then suddenly a httpd process
>> starts growing exponentially, the swapfile usage grows massively and the
>> server starts to become sluggish (I assume due to disk thrashing caused by
>> the heavy swap usage). Usually when this started to happen I would log in
>> and use apachectl stop to shutdown the server, then type 'killall httpd'
>> several times till the processes finally died off, and then use apachectl
>> start to restart apache. If I was not around or did not catch this, the
>> server would eventually become unresponsive and lock up, requiring a manual
>> reboot by the datacentre staff. Messages such as "Out of memory" and
>> "Callback called exit" would appear in the error log as the server spiralled
>> down and MySQL would start to have trouble running.

Seen that (Debian/i486, 2.2.12 kernel, Apache 1.3.9, mod_perl 1.21), to
the extent that our production webservers run:
---------------------------
#!/bin/bash

# cron checks this is running every so often
# You DON'T run this from cron because if you get
# half a dozen runaways, cron doesn't get a look in 
# in time to save you

echo $$ > /var/run/kill_runaways.pid
runawaylogfile=/var/log/kill_runaways.log

while true
do
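  # column 5 of "ps aux" is VSZ in KB; column 2 is the PID
  runaways=`ps auxOv | awk '/httpd/ {if ($5>30000) print $2}'`
  if [ ! -z "$runaways" ]
  then
    # log what we are about to kill, then kill it
    echo $runaways | xargs echo "`date`: killing:" \
       >> $runawaylogfile
    echo $runaways | xargs kill -9
  fi
  sleep 10
done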
------------------------
Have not yet managed to figure out what's going on, except that strace
once it's running away just shows call after call to brk(), and trying
to do anything clever to catch it just before it starts brk()'ing
doesn't help.
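
One way to look for the onset might be to compare two "PID VSZ"
snapshots taken a few seconds apart and flag any httpd that grew by
more than some amount in between, rather than waiting for it to cross
a fixed size. A rough sketch follows; snapshot and growers are
hypothetical names, the threshold is whatever suits your server, and
the ps options assume procps.

```shell
#!/bin/bash

# Hypothetical helper: print "PID VSZ_KB" for every process whose
# command name is exactly httpd.
snapshot() {
  ps -eo pid=,vsz=,comm= | awk '$3 == "httpd" { print $1, $2 }'
}

# Hypothetical helper: growers PREV_FILE CUR_FILE MIN_GROWTH_KB
# Prints the PIDs present in both snapshots whose VSZ grew by more
# than MIN_GROWTH_KB between them.
growers() {
  awk -v min="$3" '
    NR == FNR { prev[$1] = $2; next }          # first file: remember sizes
    ($1 in prev) && ($2 - prev[$1] > min) { print $1 }
  ' "$1" "$2"
}
```

Writing snapshot to a file, sleeping, writing it to a second file and
then running growers on the pair gives a list of PIDs worth attaching
strace to. No promise it fires early enough to be useful.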
-- 
Mike Whitaker     | Work: +44 1733 766619 | Work: [EMAIL PROTECTED]
Technical Manager | Fax:  +44 1733 348287 | Home: [EMAIL PROTECTED]
CricInfo Ltd      | GSM:  +44 7971 977375 | Web: http://www.cricket.org/
