On Wed, Jul 7, 2021 at 12:16 PM Stefan Eissing <[email protected]> wrote: > > I added a TRACE1 log in event.c before/after join_workers (line 2921) and see: > > [Wed Jul 07 10:06:03.144044 2021] [mpm_event:trace1] [pid 72886:tid > 4493635072] event.c(2921): graceful termination received, join workers > [Wed Jul 07 10:06:03.144213 2021] [mpm_event:trace1] [pid 72886:tid > 123145435672576] event.c(1799): All workers are busy or dying, will close 1 > keep-alive connections > [Wed Jul 07 10:06:08.079690 2021] [mpm_event:trace1] [pid 72886:tid > 123145435672576] event.c(1799): All workers are busy or dying, will close 0 > keep-alive connections > [Wed Jul 07 10:06:10.787777 2021] [core:warn] [pid 72813:tid 4493635072] > AH00045: child process 72886 still did not exit, sending a SIGTERM > [Wed Jul 07 10:06:12.789335 2021] [core:warn] [pid 72813:tid 4493635072] > AH00045: child process 72886 still did not exit, sending a SIGTERM > [Wed Jul 07 10:06:14.791281 2021] [core:warn] [pid 72813:tid 4493635072] > AH00045: child process 72886 still did not exit, sending a SIGTERM > [Wed Jul 07 10:06:16.792983 2021] [core:error] [pid 72813:tid 4493635072] > AH00046: child process 72886 still did not exit, sending a SIGKILL > > So, I assume the keep-alive connection is the mod_proxy_http connection to > the sever itself. Since the join_workers() never returns, there seems to be a > thread not finishing.
No, MPM event doesn't handle mod_proxy kept alive connections, those are client connections. > > On another run, I got a stacktrace of the child: > > Call graph: > 264 Thread_36766542 DispatchQueue_1: com.apple.main-thread (serial) > + 264 start (in libdyld.dylib) + 1 [0x7fff2032cf5d] > + 264 main (in httpd) + 2278 [0x104729b86] main.c:862 > + 264 ap_run_mpm (in httpd) + 75 [0x10473917b] mpm_common.c:100 > + 264 event_run (in mod_mpm_event.so) + 2994 [0x10502ae62] > event.c:3398 > + 264 make_child (in mod_mpm_event.so) + 436 [0x10502bbc4] > event.c:2997 > + 264 child_main (in mod_mpm_event.so) + 1734 [0x10502c2e6] > event.c:2924 > + 264 join_workers (in mod_mpm_event.so) + 386 > [0x10502cc72] event.c:2717 > + 264 apr_thread_join (in libapr-2.0.dylib) + 44 > [0x1048b347c] thread.c:256 > + 264 _pthread_join (in libsystem_pthread.dylib) + 362 > [0x7fff20312f60] > + 264 __ulock_wait (in libsystem_kernel.dylib) + 10 > [0x7fff202dd9ee] OK, that's where the child waits for worker threads. > 264 Thread_36766548 > + 264 thread_start (in libsystem_pthread.dylib) + 15 [0x7fff2030d443] > + 264 _pthread_start (in libsystem_pthread.dylib) + 224 > [0x7fff203118fc] > + 264 dummy_worker (in libapr-2.0.dylib) + 30 [0x1048b33ee] > thread.c:148 > + 264 slot_run (in mod_http2.so) + 189 [0x104f99a8d] > h2_workers.c:260 > + 264 _pthread_cond_wait (in libsystem_pthread.dylib) + 1298 > [0x7fff20311e49] > + 264 __psynch_cvwait (in libsystem_kernel.dylib) + 10 > [0x7fff202decde] > 264 Thread_36766549 > + 264 thread_start (in libsystem_pthread.dylib) + 15 [0x7fff2030d443] > + 264 _pthread_start (in libsystem_pthread.dylib) + 224 > [0x7fff203118fc] > + 264 dummy_worker (in libapr-2.0.dylib) + 30 [0x1048b33ee] > thread.c:148 > + 264 slot_run (in mod_http2.so) + 189 [0x104f99a8d] > h2_workers.c:260 > + 264 _pthread_cond_wait (in libsystem_pthread.dylib) + 1298 > [0x7fff20311e49] > + 264 __psynch_cvwait (in libsystem_kernel.dylib) + 10 > [0x7fff202decde] > 264 Thread_36766550 > + 264 thread_start (in libsystem_pthread.dylib) + 15 [0x7fff2030d443] > + 264 _pthread_start (in libsystem_pthread.dylib) + 224 > [0x7fff203118fc] > + 264 dummy_worker (in libapr-2.0.dylib) + 30 [0x1048b33ee] > thread.c:148 > + 264 slot_run (in mod_http2.so) + 189 [0x104f99a8d] > h2_workers.c:260 > + 264 _pthread_cond_wait (in libsystem_pthread.dylib) + 1298 > [0x7fff20311e49] > + 264 __psynch_cvwait (in libsystem_kernel.dylib) + 10 > [0x7fff202decde] > 264 Thread_36766551 > + 264 thread_start (in libsystem_pthread.dylib) + 15 [0x7fff2030d443] > + 264 _pthread_start (in libsystem_pthread.dylib) + 224 > [0x7fff203118fc] > + 264 dummy_worker (in libapr-2.0.dylib) + 30 [0x1048b33ee] > thread.c:148 > + 264 slot_run (in mod_http2.so) + 189 [0x104f99a8d] > h2_workers.c:260 > + 264 _pthread_cond_wait (in libsystem_pthread.dylib) + 1298 > [0x7fff20311e49] > + 264 __psynch_cvwait (in libsystem_kernel.dylib) + 10 > [0x7fff202decde] The above mod_http2 threads are not accounted for by the MPM I suppose (they should be stopped by the future child_exit hook, but still the MPM will only care about its own worker threads, i.e. those running the process_connection hook). > 264 Thread_36766565 > + 264 thread_start (in libsystem_pthread.dylib) + 15 [0x7fff2030d443] > + 264 _pthread_start (in libsystem_pthread.dylib) + 224 > [0x7fff203118fc] > + 264 dummy_worker (in libapr-2.0.dylib) + 30 [0x1048b33ee] > thread.c:148 > + 264 worker_thread (in mod_mpm_event.so) + 380 [0x10502cf2c] > event.c:2340 > + 264 ap_queue_pop_something (in httpd) + 142 [0x10473cf3e] > mpm_fdqueue.c:447 > + 264 _pthread_cond_wait (in libsystem_pthread.dylib) + 1298 > [0x7fff20311e49] > + 264 __psynch_cvwait (in libsystem_kernel.dylib) + 10 > [0x7fff202decde] This worker didn't get the signal to stop from ap_queue_term().. > 264 Thread_36766579 > 264 thread_start (in libsystem_pthread.dylib) + 15 [0x7fff2030d443] > 264 _pthread_start (in libsystem_pthread.dylib) + 224 > [0x7fff203118fc] > 264 dummy_worker (in libapr-2.0.dylib) + 30 [0x1048b33ee] > thread.c:148 > 264 listener_thread (in mod_mpm_event.so) + 1129 [0x10502e069] > event.c:1940 > 264 impl_pollset_poll (in libapr-2.0.dylib) + 114 > [0x1048aec82] kqueue.c:272 > 264 kevent (in libsystem_kernel.dylib) + 10 [0x7fff202e0c4a] .. and that's because the listener didn't exit either, hmm. Your system sets listener_is_wakeable (AH02471), right? Does wakeup_listener() get called for the graceful restart that hangs? Cheers; Yann.
