I'm seeing the same thing with the prefork MPM under Linux under heavy load.
Possibly it is something outside of the MPMs themselves?  The processes
definitely don't go away until you kill them with SIGKILL a few times.  One
thing I noticed is that, under load, processes seem to spend a lot of time in
the 'close connection' state.  By the time you have to issue the kills,
mod_status shows idle children.  Perhaps there is something that is not getting
cleaned up properly?
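
For reference, the escalation being discussed (ask the child to stop, wait a
while, then send SIGKILL and reap it) can be sketched roughly as below. This is
a standalone Python illustration, not Apache's code: stop_child and
spawn_stubborn_child are made-up names, and the grace period is an arbitrary
assumption. SIGKILL cannot be caught or ignored, so a child that still shows up
afterwards is usually either not yet reaped by its parent or stuck in the
kernel.

```python
import os
import signal
import time


def spawn_stubborn_child():
    """Fork a child that ignores SIGTERM (hypothetical stand-in for a hung child)."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(r)
            signal.signal(signal.SIGTERM, signal.SIG_IGN)
            os.write(w, b"x")  # tell the parent the handler is installed
            os.close(w)
            while True:
                time.sleep(60)
        finally:
            os._exit(1)  # never fall back into the parent's code path
    os.close(w)
    os.read(r, 1)  # wait until the child is really ignoring SIGTERM
    os.close(r)
    return pid


def stop_child(pid, grace=0.5, poll=0.05):
    """Send SIGTERM, poll for up to `grace` seconds, then escalate to SIGKILL.

    Returns the number of the signal that ended the child, or 0 if it exited
    on its own.
    """
    os.kill(pid, signal.SIGTERM)
    deadline = time.monotonic() + grace
    while time.monotonic() < deadline:
        done, status = os.waitpid(pid, os.WNOHANG)
        if done == pid:
            return os.WTERMSIG(status) if os.WIFSIGNALED(status) else 0
        time.sleep(poll)
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)  # blocking reap so no zombie is left behind
    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else 0
```

Calling stop_child(spawn_stubborn_child(), grace=0.2) should return
signal.SIGKILL, since the child ignores the SIGTERM and only the second signal
ends it.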

On a possibly related note, I am seeing segfaults in prefork children under
load when the number of children is high (over 1000).  The stack trace looks
like this and doesn't make a lot of sense:

#0  pthread_sighandler (signo=11, ctx=
      {gs = 0, __gsh = 0, fs = 0, __fsh = 0, es = 43, __esh = 0, ds = 43, __dsh = 0,
       edi = 1076506184, esi = 0, ebp = 3221223720, esp = 3221223680, ebx = 1076516568,
       edx = 1, ecx = 0, eax = 0, trapno = 13, err = 0, eip = 1076467940, cs = 35,
       __csh = 0, eflags = 66070, esp_at_signal = 3221223680, ss = 43, __ssh = 0,
       fpstate = 0xbffff680, oldmask = 2147483648, cr2 = 0}) at signals.c:87
87  signals.c: No such file or directory.
    in signals.c
(gdb) where
#0  pthread_sighandler (signo=11, ctx=
      {gs = 0, __gsh = 0, fs = 0, __fsh = 0, es = 43, __esh = 0, ds = 43, __dsh = 0,
       edi = 1076506184, esi = 0, ebp = 3221223720, esp = 3221223680, ebx = 1076516568,
       edx = 1, ecx = 0, eax = 0, trapno = 13, err = 0, eip = 1076467940, cs = 35,
       __csh = 0, eflags = 66070, esp_at_signal = 3221223680, ss = 43, __ssh = 0,
       fpstate = 0xbffff680, oldmask = 2147483648, cr2 = 0}) at signals.c:87
#1  <signal handler called>
#2  __pthread_reset_main_thread () at internals.h:372
#3  0x402990e5 in __fork () at ptfork.c:92
#4  0x0809b2cd in ap_graceful_stop_signalled () at eval.c:41
#5  0x0809b643 in ap_graceful_stop_signalled () at eval.c:41
#6  0x0809ba62 in ap_mpm_run () at eval.c:41
#7  0x080a24ee in main () at eval.c:41
#8  0x402c2177 in __libc_start_main (main=0x80a1b9c <main>, argc=1, ubp_av=0xbffffb44,
    init=0x80637a8 <_init>, fini=0x80c2f50 <_fini>, rtld_fini=0x4000e184 <_dl_fini>,
    stack_end=0xbffffb3c) at ../sysdeps/generic/libc-start.c:129

Does anyone have an idea how to track this down?  The
ap_graceful_stop_signalled function does *nothing* in the prefork MPM.

-adam


On Mon, Feb 04, 2002 at 02:13:52PM -0500, MATHIHALLI,MADHUSUDAN (HP-Cupertino,ex1) wrote:
> This was with the worker MPM. I initially suspected the latency: I waited
> for about 5 minutes, and then nothing happened till I issued a "kill -9"
> command. I'll try again today, and will probably post the stack trace of the
> parent process when it happens.
> 
> Thanks
> -Madhu
> 
> -----Original Message-----
> From: Greg Ames [mailto:[EMAIL PROTECTED]]
> Sent: Monday, February 04, 2002 9:58 AM
> To: [EMAIL PROTECTED]
> Subject: Re: 2.0.31 shutdown after heavy load
> 
> 
> "MATHIHALLI,MADHUSUDAN (HP-Cupertino,ex1)" wrote:
> > 
> > Hi,
> >         I'm getting the following message when I try to stop apache after a
> > stress-test on HPUX (using webstone).. In spite of the SIGKILL message, the
> > process does not exit.. A second attempt is successful.. Any clues regarding
> > what may be happening are appreciated..
> > 
> >  [error] child process 5891 still did not exit, sending a SIGKILL.
> 
> which MPM?  and have you tried this test with different results before?
> 
> I didn't think we could do anything to block SIGKILL.  Could this be just
> dispatching latency?
> 
> Greg

-- 

        "I believe in Kadath in the cold waste, and Ultima Thule. But you
         cannot prove to me that Harvard Law School actually exists."
                        - Theodora Goss

        "I'm not like that, I have a cat, I don't need you.. My cat, and
         about 18 lines of bourne shell code replace you in life."
                        - anonymous


Adam Sussman    
Vidya Media Ventures

[EMAIL PROTECTED]
