On Tue, Aug 04, 2020 at 01:48:08PM +0200, Rainer Jung wrote:
> GDB info (sporadic) Solaris shutdown crashes during OpenSSL shutdown in
> mod_watchdog:

Awesome level of testing as usual, thanks Rainer!

I see similar crashes with mod_watchdog active for 2.4 prefork.  I think 
the trigger is also loading mod_md, which causes mod_watchdog to have 
active threads?  May be wrong.  

I started investigating mod_watchdog mutex abuse (r1876511) but in the 
end concluded that prefork ungraceful shutdown is inherently broken 
because it does everything inside a signal handler in each child, which 
is totally unsafe and unsurprisingly crashy.

In this case you have:

1) a child's main thread exiting with APEXIT_CHILDSICK (the first 
argument to clean_child_exit() is 7 == APEXIT_CHILDSICK) - possibly the 
listener mutex got whacked by the parent?

2) there is a mod_watchdog thread which caught SIGTERM and is handling 
that at the same time.

It seems pretty daft that the mod_watchdog thread is catching any 
signals.  It looks like wd_worker() should call 
apr_setup_signal_thread() to block such signals - in fact, any thread use 
within httpd outside of the MPMs should be doing that?

Regards, Joe

> 
> -----------------  lwp# 1 / thread# 1  --------------------
>  ff07b670 apr_pool_destroy (393280, 41d848, ffbfee19, 38c8a0, 393268, 1018)
> + 284
>  fed529e0 clean_child_exit (7, 22f, 3, 3, 9, cc4b0) + 60
>  fed52f2c child_main (fed6b93c, fed6b938, 9c71c, fed6b954, fed6b944, 9becc)
> + 344
>  fed535fc make_child (cc4b0, 2, 2, 392e50, 1, 0) + 1d0
>  fed545e4 prefork_run (0, ffbfefdc, ffbfefc8, fed6b94c, 9becc, fed6b95c) +
> 91c
>  00039e64 ap_run_mpm (a7338, ce008, cc4b0, 9bd3c, 0, 1eaa08) + 54
>  00075cfc main     (37a54, 9b718, 76d90, 9becc, 9beb8, a53c0) + 9b4
>  00031654 _start   (0, 0, 0, 0, 0, 0) + 5c
> -----------------  lwp# 2 / thread# 2  --------------------
>  fee42480 mutex_lock_impl (fce10200, 0, 0, 0, fd839278, 0) + 168
>  fd827ff8 __deregister_frame_info_bases (fd8392a8, 0, 0, 0, fd839290, 0) +
> d8
>  fd82130c ???????? (0, 0, fd8392a0, fd839628, 0, fd83962c)
>  fd828540 _fini    (ff3f418c, ff3f5b10, 2ae70, 0, ff3f48e8, 1821) + 4
>  ff3c5a5c call_fini (ff3f418c, febc1058, fd82853c, ff3f4380, ff3f4338,
> ff3f48e8) + cc
>  ff3c5c2c atexit_fini (ff3f418c, 2ed28, fee42cc0, ff3f48e8, fce10200,
> febc1058) + 78
>  fedc2374 _exithandle (feeb7500, feeb5900, 1c00, feeb9330, 24, 222c88) + 40
>  fedb0790 exit     (0, 222c88, ff076cc8, 0, fce10200, 38c904) + 4
>  fed52a18 clean_child_exit (0, 0, 0, 0, 0, 0) + 98
>  fed52a3c just_die (f, 0, fcdfba70, 1, 0, 0) + 4
>  fee4961c __sighndlr (f, 0, fcdfba70, fed52a38, 0, 1) + c
>  fee3dce8 call_user_handler (f, 0, 0, 0, fce10200, fcdfba70) + 3b8
>  fee3ded0 sigacthandler (f, 0, fcdfba70, 0, 0, 0) + 60
>  --- called from signal handler with signal 15 (SIGTERM) ---
>  fee4cdc0 __pollsys (fcdfbde8, 0, fcdfbe50, 0, 0, 0) + 8
>  fede8590 pselect  (fcdfbde8, feeb4728, feeb4728, 0, fcdfbe50, 0) + 1c8
>  fede8908 select   (0, 0, 0, 0, fcdfbeb8, f4240) + a0
>  ff087d20 apr_sleep (0, 186a0, a129c, a1298, 0, 0) + 4c
>  fe372f30 wd_worker (fe389744, 3900b0, 1, fcdfbf38, 5abe9, 815e16a) + 348
>  ff087274 dummy_worker (390ef0, fcdfc000, 0, 0, ff087268, 1) + c
>  fee494f0 _lwp_start (0, 0, 0, 0, 0, 0)
> 
