Atanas Bakalov wrote:

I have the following problem.
I have compiled httpd-2.0.48 with the worker MPM.
Red Hat 9, kernel 2.4.20-28.9.
When I start Apache with "apachectl start", one process is created, which creates two children.

okay so far


One of these children creates all my threads, while the other doesn't create anything.

one of these child processes is a worker MPM child process which handles client connections, and it creates worker threads as specified by the ThreadsPerChild directive in httpd.conf


the other child process is probably the single-threaded mod_cgid daemon process

okay so far

So the problem is that if for some reason my module crashes, the whole process
that created all those threads is killed, and a new child is created.
That child also creates all the threads again perfectly.

yes, that's how it works; the entire child process exits when your module crashes


okay so far

I can see from log lines in my module that I enter the initialize_child
function each time the process that owns all the threads is killed and re-created.
ap_hook_child_init(initialize_child, NULL, NULL, APR_HOOK_MIDDLE);

okay so far


But here is the main problem. Once that child has been killed, I'm not able to connect to
Apache on 443 anymore. However, when I type "fuser -n tcp 443" I can see
that the new child process has successfully bound that port.

actually, the child process doesn't bind to the port; instead it inherits the listening socket from the parent


The problem can be reproduced if I just "kill -9 1234", where 1234 is the pid of the
process that owns the threads.

if you built with NPTL active (i.e., if you didn't set LD_ASSUME_KERNEL=2.2.5 or whatever), I think APR's default cross-process mutex mechanism is a pthread mutex, and when a child process holding a pthread mutex crashes, the mutex ownership is lost, so the replacement child process will not be able to obtain the accept mutex and your server hangs


put "AcceptMutex sysvsem" in httpd.conf, restart, and do the test again

(it's quite fun to build on RH9/FC1 without LD_ASSUME_KERNEL=2.2.5 and then try to run the build with LD_ASSUME_KERNEL=2.2.5; it won't start because the default mutex mechanism is cross-process pthread and there is no such thing with LD_ASSUME_KERNEL=2.2.5)

<disclaimer>
It has been a while since I played on RH9. I'm assuming that it worked the same way there that it does with FC1. You can verify my assessment with httpd -V. If it displays "-D APR_USE_PROC_PTHREAD_SERIALIZE" then the mutex type is what is biting you.
</disclaimer>
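
a minimal sketch of that check, in case it helps; it assumes httpd is on your PATH, and the function name check_mutex is made up for illustration (it is not part of httpd or APR). it only looks for the define mentioned above in whatever text it is given on stdin:

```shell
# check_mutex: hypothetical helper, reads `httpd -V` output on stdin
# and reports whether the build defaults to the cross-process pthread mutex
check_mutex() {
  if grep -q 'APR_USE_PROC_PTHREAD_SERIALIZE'; then
    echo "pthread"
  else
    echo "other"
  fi
}

# usage (assumes httpd is installed and on PATH):
#   httpd -V | check_mutex
```

if it reports "pthread", try the AcceptMutex sysvsem setting and repeat the kill -9 test.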

