Hi Mark, let's continue this debugging on dev@ if you don't mind.

On Wed, Jan 31, 2018 at 10:15 PM, <[email protected]> wrote:
> https://bz.apache.org/bugzilla/show_bug.cgi?id=62044
>
> --- Comment #32 from [email protected] ---
> so sig_coredump is being triggered by an unknown signal, multiple times a day.
> It's not a segfault, nothing in /var/log/messages. That results in a bunch of
> undeleted shared memory segments and probably some that will no longer be in
> the global list, but still present in the kernel.

In 2.4.29, i.e. without patch [1], sig_coredump might be triggered by
any signal received by httpd during a restart, and the signal handler
itself crashes (double fault), so the process is forcibly SIGKILLed
(presumably, hence no trace in /var/log/messages...).
This was reported and discussed in [2], and seems to correspond quite
well to what you observe in your tests.

Moreover, if the parent process crashes, nothing will delete the
IPC-SysV SHMs (hence the leak in the system), while child processes
may remain attached to them, which prevents a new parent process from
starting (until the children stop or are forcibly killed)...

When this happens, you should see non-root processes attached to
PPID 1 (e.g. with "ps -ef"); "-f /path/to/httpd.conf" in the command
line might help distinguish the different httpd instances when
monitoring processes.

If this is the case, you should probably try patch [1].
If not, I can't explain why a process with a different PID appears in
the httpd logs after the SIGHUP; it must have been started
(automatically?) after the previous one crashed.
Here the generation number can't help, since a new process always
starts at generation #0.

Regards,
Yann.

[1] https://svn.apache.org/repos/asf/httpd/httpd/patches/2.4.x/stop_signals-PR61558.patch
[2] https://bz.apache.org/bugzilla/show_bug.cgi?id=61558
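
PS: in case it helps with the monitoring, the "non-root processes attached to PPID 1" check could be scripted roughly as below. This is only a sketch under assumptions: the function names (find_orphaned_httpd, run_ps) are made up for illustration, the binary is assumed to be called "httpd", and the input is assumed to be Linux procps output from "ps -eo pid,ppid,user,args". A root-owned httpd with PPID 1 is the normal daemonized parent, so only non-root ones are reported; if httpd is deliberately started as a non-root user, this will give false positives.

```python
import subprocess


def find_orphaned_httpd(ps_output, name="httpd"):
    """Return (pid, user, args) for non-root `name` processes whose PPID is 1.

    `ps_output` is the text printed by `ps -eo pid,ppid,user,args`
    (one header line, then one process per line).
    """
    orphans = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 3)  # pid, ppid, user, rest of command line
        if len(fields) < 4:
            continue
        pid, ppid, user, args = fields
        # Compare only the executable's base name, e.g. /usr/sbin/httpd -> httpd
        exe = args.split()[0].rsplit("/", 1)[-1]
        if ppid == "1" and user != "root" and exe == name:
            orphans.append((int(pid), user, args.rstrip()))
    return orphans


def run_ps():
    """Run ps and return its output (assumes a procps-style ps is installed)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,user,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling find_orphaned_httpd(run_ps()) shortly after a restart lists the suspicious children; the "-f /path/to/httpd.conf" part of each returned command line tells you which instance they belonged to.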
