Rainer, the crashes look like the same ones I see when running the mod_md tests: a watchdog still busy in OpenSSL while the process shuts down.
Yann wrote a patch a while back which solved it for me. But we were reluctant to make that change. Maybe it's time to revisit that.

> On 30.03.2020 at 01:22, Rainer Jung <[email protected]> wrote:
>
> On 26.03.2020 at 15:50, Daniel Ruggeri wrote:
>> Hi, all;
>> Please find below the proposed release tarball and signatures:
>> https://dist.apache.org/repos/dist/dev/httpd/
>>
>> I would like to call a VOTE over the next few days to release this
>> candidate tarball as 2.4.43:
>> [X] +1: It's not just good, it's good enough!
>> [ ] +0: Let's have a talk.
>> [ ] -1: There's trouble in paradise. Here's what's wrong.
>>
>> The computed digests of the tarball up for vote are:
>> sha1: 15d8605b094dfe5e283cd9e90770368dd14e26f2 *httpd-2.4.43.tar.gz
>> sha256: 2624e92d89b20483caeffe514a7c7ba93ab13b650295ae330f01c35d5b50d87f
>> *httpd-2.4.43.tar.gz
>> sha512:
>> d9879b8f8ef7d94dee1024e9c25b56d963a3b072520878a88a044629ad577c109a5456791b39016bf4f6672c04bf4a0e5cfd32381211e9acdc81d4a50b359e5e
>> *httpd-2.4.43.tar.gz
>>
>> The SVN tag is '2.4.43' at r1875715.
>
> +1 to release and thanks a bunch for RM!
>
> Summary: all OK except for
>
> - very few shutdown crashes on Solaris (already observed in 2.4.37, but then
>   with MPM event when statically linked, now once with prefork and shared
>   linking). Happens in mod_watchdog. Maybe prefork doesn't expect another
>   thread running and doing deinit. gdb info at end.
>
> - another shutdown crash on Solaris in mod_watchdog for prefork. gdb info at
>   end.
>
> - sporadic hangs with prefork plus mod_ext_filter on Linux (see separate
>   thread).
>
> Detailed report:
>
> - Sigs and hashes OK
> - contents of tarballs identical
> - contents of tag and tarballs identical
>   except for expected deltas
>
> Built on
>
> - Solaris 10 Sparc as 32 Bit Binaries
> - SLES 11+12+15 (64 Bits)
> - RHEL 6+7 (64 Bits)
>
> For all platforms built
>
> - with default (shared) and static modules
> - with module set reallyall
> - using --enable-load-all-modules
> - against external APR/APU 1.7.0/1.6.1
>   plus APR/APU 1.6.5/1.6.1
>   plus APR/APU 1.7.x HEAD/1.7.x HEAD with expat
>   plus APR/APU 1.7.x HEAD/1.7.x HEAD with libxml2
>   plus APR/APU from deps tarball
>
> - using external libraries
>   - expat 2.2.9
>   - pcre 8.44
>   - lua 5.3.5 (compiled with LUA_COMPAT_MODULE)
>   - libxml2 2.9.10
>   - libnghttp2 1.40.0
>   - brotli 1.0.7
>   - curl 7.69.1
>   - jansson 2.12
>   and
>   - openssl 0.9.8zh, 1.0.2, 1.0.2u, 1.0.1e, 1.0.1l, 1.1.1, 1.1.1e plus patches
>     (head of a few days ago)
>
> - Tool chain:
>   - platform gcc except on Solaris
>     (gcc 9.3.0 Solaris 10)
>   - CFLAGS: -O2 -g -Wall -fno-strict-aliasing
>   - on Solaris additionally -mcpu=v9, -D_XOPEN_SOURCE,
>     -D_XOPEN_SOURCE_EXTENDED=1, -D__EXTENSIONS__
>     and -D_XPG6
>
> All of the 660 builds succeeded, a few are still ongoing.
>
> - compiler warnings: only on Solaris (GCC 9.3.0):
>   srclib/apr/locks/unix/proc_mutex.c:979:49: warning:
>   'mutex_proc_pthread_cond_methods' defined but not used
>   [-Wunused-const-variable=]
>
> Tested for
>
> - Solaris 10, SLES 11+12+15, RHEL 6+7
> - MPMs prefork, worker, event
> - default and static module builds
> - log level trace8
> - module set reallyall (127 modules)
> - Perl client bundle build against OpenSSL 1.1.1, 1.1.0i, 1.0.2p and 0.9.8zh
> - OpenSSL once linked statically and once as a shared library
>
> Every OpenSSL version in the client tested with 1.1.1e plus patches in the
> server. Tests for more server OpenSSL versions are ongoing.
>
> The total number of test suite runs was 680 (many more to come ...).
>
> The following test failures were seen:
>
> a Crashes only on Solaris, only with prefork MPM and
>   dynamically linked builds.
>   The crash seems to happen only at the end of a process during pchild
>   cleanup, and it might be problematic that the watchdog thread still
>   exists at that time.
>   See gdb info at end.
>
> b Tests 4, 8 and 12 of t/modules/buffer.t
>   Not a regression
>   Tests 4, 8 and sometimes 12, always line 37
>   Relatively frequent (93 times) failures on platforms Solaris 10
>   and old SLES11 (114 times), RHEL 6 (88 times), but not on more modern
>   (and here faster) SLES 12 and RHEL 7.
>   Happens for all OpenSSL client and server
>   versions and all link types.
>
> c Test 5 in t/modules/dav.t line 69:
>   Not a regression.
>   22 times: twice RHEL 6, 3 times RHEL 7 and 5 times SLES 11,
>   8 times SLES 12, twice SLES 15, twice Solaris 10.
>   Creation, modified and now times not in the correct order.
>   This seems to be a system issue, all tests done on NFS,
>   many tested on virtualized guests.
>
> d Tests 45, 48, 51, 54 in t/modules/cgi.t line 232:
>   Not a regression
>   125 times once Solaris
>   Test checks log contents. Could be false positive due to
>   logs written to NFS.
>
> Regards,
>
> Rainer
>
> GDB info (sporadic) Solaris shutdown crashes during OpenSSL shutdown in
> mod_watchdog:
>
> fedd7668 realfree (1c0268, 61, 60, 1bfc58, 0, 1c02d8) + 154
> fedd7d9c _free_unlocked (feeb92b0, 0, d86dc, feeb9330, feeb03d8, 1c99f8) + b0
> fedd7cd8 free (1c99f8, fe9a3ad0, d871c, fe8e4ee0, feeb03d8, feeb3a20) + 24
> fe8e2390 OPENSSL_LH_free (1bce10, fe9a3ad0, feeb5900, 2, fe9a3ad0, 1cd450) + 64
> fe8bcf44 err_cleanup (0, f8800, feeb5900, fe9ed05c, fe8db4f0, fe9ecf4c) + 94
> fe8dee54 OPENSSL_cleanup (1, fe9ed284, fe9d3598, fe9a3240, fe9ed25c, fe9ed280) + 1e4
> fedc2374 _exithandle (feeb7500, feeb5900, 1c00, feeb9330, 24, 218910) + 40
> fedb0790 exit (0, 218910, ff076cc8, 0, fce40200, 39b1a4) + 4
> fed62a18 clean_child_exit (0, 0, 0, 0, 0, 0) + 98
> fed62a3c just_die (f, 0, fcdfba70, 1, 0, 0) + 4
> fee4961c __sighndlr (f, 0, fcdfba70, fed62a38, 0, 1) + c
> fee3dce8 call_user_handler (f, 0, 0, 0, fce40200, fcdfba70) + 3b8
> fee3ded0 sigacthandler (f, 0, fcdfba70, 0, 0, 0) + 60
> --- called from signal handler with signal 15 (SIGTERM) ---
> fee4cdc0 __pollsys (fcdfbde8, 0, fcdfbe50, 0, 0, 0) + 8
> fede8590 pselect (fcdfbde8, feeb4728, feeb4728, 0, fcdfbe50, 0) + 1c8
> fede8908 select (0, 0, 0, 0, fcdfbeb8, f4240) + a0
> ff087d20 apr_sleep (0, 186a0, a0a84, a0a80, 0, 0) + 4c
> fe3e3030 wd_worker (fe3f9864, 3a40f8, 1, fcdfbf38, 5a1e9, d645b26e) + 344
> ff087274 dummy_worker (3a5790, fcdfc000, 0, 0, ff087268, 1) + c
> fee494f0 _lwp_start (0, 0, 0, 0, 0, 0)
>
> Also a crash in mod_watchdog but in a separate stack:
>
> ----------------- lwp# 1 / thread# 1 --------------------
> ff29f134 apr_pool_destroy (3524f8, 200, ffbfef98, 34ba68, 2000, 199c40) + 14c
> fef629e0 clean_child_exit (7, 22f, 3, 3, 9, cc858) + 60
> fef62f2c child_main (fef7b93c, fef7b938, 9cac4, fef7b954, fef7b944, 9c274) + 344
> fef635fc make_child (cc858, 6, 6, 3520c8, 1, 0) + 1d0
> fef645e4 prefork_run (0, ffbff160, ffbff148, fef7b94c, 9c274, fef7b95c) + 91c
> 0003a9c4 ap_run_mpm (a76e0, ce3b0, cc858, 9c0e4, 0, 1d30d0) + 54
> 00076224 main (385b4, 9babc, 77168, 9c274, 9c260, a5768) + 9b4
> 00032200 _start (0, 0, 0, 0, 0, 0) + 5c
> ----------------- lwp# 2 / thread# 2 --------------------
> ff042480 mutex_lock_impl (fcee0200, 0, 0, 0, fd6b7658, ff042a78) + 168
> fd6a65d8 __deregister_frame_info_bases (fd6b7688, 0, 0, 202, fd6b7670, 0) + d8
> fd6a0d80 ???????? (0, 1, fd6b7680, fd6b7a08, 0, fd6b7a0c)
> fd6a6b20 _fini (ff3f418c, ff3f5b10, 2ae70, 0, ff3f48e8, 1821) + 4
> ff3c5a5c call_fini (ff3f418c, fe5e0018, fd6a6b1c, ff3f4380, ff3f4338, ff3f48e8) + cc
> ff3c5c2c atexit_fini (ff3f418c, 2ed28, ff042cc0, ff3f48e8, fcee0200, fe5e0018) + 78
> fefc2374 _exithandle (ff0b7500, ff0b5900, 1c00, ff0b9330, 24, 1d4088) + 40
> fefb0790 exit (0, 1d4088, ff299ec8, 0, fcee0200, 34bacc) + 4
> fef62a18 clean_child_exit (0, 0, 0, 0, 0, 0) + 98
> fef62a3c just_die (f, 0, fcffba70, 1, 0, 0) + 4
> ff04961c __sighndlr (f, 0, fcffba70, fef62a38, 0, 1) + c
> ff03dce8 call_user_handler (f, 0, 0, 0, fcee0200, fcffba70) + 3b8
> ff03ded0 sigacthandler (f, 0, fcffba70, 0, 0, 0) + 60
> --- called from signal handler with signal 15 (SIGTERM) ---
> ff04cdc0 __pollsys (fcffbde8, 0, fcffbe50, 0, 0, 0) + 8
> fefe8590 pselect (fcffbde8, ff0b4728, ff0b4728, 0, fcffbe50, 0) + 1c8
> fefe8908 select (0, 0, 0, 0, fcffbeb8, f4240) + a0
> ff2ab9fc apr_sleep (0, 186a0, a1644, a1640, 0, 0) + 4c
> fe573030 wd_worker (fe589864, 34e4f8, 1, fcffbf38, 5a1ee, 6331aff1) + 344
> ff2aaf60 dummy_worker (3501b8, fcffc000, 0, 0, ff2aaf54, 1) + c
> ff0494f0 _lwp_start (0, 0, 0, 0, 0, 0)
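
In both traces the watchdog thread (wd_worker, sleeping in apr_sleep) is still alive while process teardown (exit handlers, pool destruction) is running. A rough sketch of the ordering that would avoid this, assuming the worker can be asked to stop and then joined before any global cleanup runs, is below. It is not Yann's patch and not httpd code; it is written in Python only to keep it short, and the names Watchdog, SharedState and run_demo are made up for the illustration.

```python
import threading
import time


class Watchdog:
    """Hypothetical periodic worker that can be stopped and joined."""

    def __init__(self, interval, task):
        self._interval = interval
        self._task = task
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, name="watchdog")

    def start(self):
        self._thread.start()

    def _run(self):
        # Event.wait() acts as an interruptible sleep: it returns True as
        # soon as stop() sets the event, so the loop ends without waiting
        # out a full interval.
        while not self._stop.wait(self._interval):
            self._task()

    def stop(self):
        # Ask the worker to finish, then wait until it really has.
        self._stop.set()
        self._thread.join()

    def is_alive(self):
        return self._thread.is_alive()


class SharedState:
    """Stand-in for library state that a global cleanup handler frees."""

    def __init__(self):
        self.alive = True
        self.ticks = 0
        self.use_after_cleanup = 0

    def tick(self):
        # Count any call that arrives after cleanup; in C this would be
        # the use-after-free or double-free candidate.
        if not self.alive:
            self.use_after_cleanup += 1
        self.ticks += 1

    def cleanup(self):
        self.alive = False


def run_demo():
    state = SharedState()
    wd = Watchdog(0.01, state.tick)
    wd.start()
    time.sleep(0.05)           # let the worker run for a moment
    wd.stop()                  # 1. stop and join the worker first
    state.cleanup()            # 2. only then run the global cleanup
    ticks_at_cleanup = state.ticks
    time.sleep(0.03)           # nothing should touch state any more
    return state, ticks_at_cleanup, wd
```

The point is only the order in run_demo(): join first, clean up second. Whether that can be done from the SIGTERM path in a prefork child is exactly the open question.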
