Naik, Roshan wrote:
In my humble opinion it is much simpler (and cleaner) to fix this problem in one place...Apache, rather than in each module that may face this problem.
I am not sure if it is really simpler (see below) but admittedly it would make things easier for module authors.
So from the Perl script's side I see only two approaches:
1. Guidelines for script programmers that clearly state that you have to call a special
Perl function for doing a fork inside mod_perl, because otherwise you will shoot yourself in the foot.
This is unfortunately not feasible (even though possible). The reason
is that there are so many Perl modules out there that are
widely used by programmers. You can't know which function call might end up calling fork(). For example, would you ever imagine that Perl's
Syslog module would need to fork? Well, it does... and I was really surprised too. That's actually where I noticed this whole problem.
Not only does it call fork(), but it also seems to be unpredictable as
to when it calls fork(). It seems to do so if we pound on it too much. Pursuing this option requires too much knowledge on the part of programmers about the internals of the various modules that they will be using.
Ok, I do understand your pain and problems better now and I understand the problems you see with my proposal. So let me try to understand your idea of a solution inside Apache a little bit better.
If I look at your patch at the end of the mail you would like to exit the process once the handler has been run. I see the following problems with that approach:
1. exit(0): This does exactly what you do not like: A function deeper down in the call stack terminates the process. Furthermore, shared resources may not get freed correctly. But this may be fixable by doing it in the same way Apache processes handle SIGTERM, SIGHUP and similar signals. So I do not see a really big problem here.
2. As fork is called inside the module code Apache gets caught by this unprepared. So there may be some locks or similar things on shared resources that should be freed before a fork, especially if the fork is used to create an additional process for handling a time consuming operation in the background.
3. On some Unix OSes a fork does not copy all threads of the original process, for performance reasons. So the forked process is not an exact copy of the original process. This may lead to problems with multithreaded MPMs.
I think 2. and 3. can only be solved by having an appropriate wrapper around fork inside of Apache that does the needed preparations. But this would require that the "unknowing" module is made to call this wrapper when it calls fork. On Unix systems I think this is only possible by some ugly and maybe not portable dynamic linker voodoo. I have no idea how this can be achieved on non-Unix platforms.
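To make the wrapper idea concrete, here is a minimal sketch in plain POSIX C. The names register_prefork_hook(), prepared_fork(), release_shared_locks() and demo_prepared_fork() are hypothetical and are not part of the Apache API; the hook only counts its invocations where a real one would release locks. Note that this does not address the actual difficulty described above: a module that calls fork() directly never goes through the wrapper.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_PREFORK_HOOKS 8

typedef void (*prefork_hook_fn)(void);

/* Hypothetical registry of preparations to run before every fork. */
static prefork_hook_fn prefork_hooks[MAX_PREFORK_HOOKS];
static size_t prefork_hook_count;

/* Register a preparation step; returns 0 on success, -1 on error. */
int register_prefork_hook(prefork_hook_fn fn)
{
    if (fn == NULL || prefork_hook_count >= MAX_PREFORK_HOOKS)
        return -1;
    prefork_hooks[prefork_hook_count++] = fn;
    return 0;
}

/* Hypothetical wrapper: run all preparations, then fork. */
pid_t prepared_fork(void)
{
    for (size_t i = 0; i < prefork_hook_count; i++) {
        assert(prefork_hooks[i] != NULL);
        prefork_hooks[i]();
    }
    return fork();
}

/* Stand-in for "free locks on shared resources before the fork". */
static int locks_released;

void release_shared_locks(void)
{
    locks_released++;
}

/* Fork through the wrapper and reap the child; 0 on success, -1 on error. */
int demo_prepared_fork(void)
{
    pid_t pid = prepared_fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0); /* child: leave without running the parent's cleanup */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Code that knows about the wrapper would call prepared_fork() wherever it would otherwise call fork(); everything else bypasses it, which is exactly the gap that would need the linker tricks.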
[...]
better support from Apache API functions should be part of the further discussion.
As far as I remember, possible approaches like a special mod_fork have already been discussed by Jeff and Paul in this thread.
Yes ... and I don't fully understand the scope or the reasoning behind those. I still think handling it at Apache level is better
As far as I understand their ideas, they also try to carefully prepare a fork operation. I think their approach is derived from the one used by mod_cgi, which also forks processes but, in a big difference to your problem, also executes a new program by calling an exec-like call.
than handling it at content handler. That should also simplify the task of writing modules. As a nice side effect a rogue
(or naïve) module will not be able to hang Apache.
I think there are enough other ways left for a rogue module to achieve this :-).
[...]
I don't think it is a good idea for a function deeper down in the call stack to try to clean up resources allocated by
functions higher up in the call stack. mod_perl can (and should have to) only clean up resources that it has allocated.
For now most of these solutions seem to be aspirational, with nothing
concrete likely to materialize soon. For now I would like to propose solution 2 (in my original email) as a patch. The idea is to invoke exit(0) (if we are in the forked worker) just after ap_run_handler is invoked by ap_invoke_handler....
AP_CORE_DECLARE(int) ap_invoke_handler(request_rec *r)
{
    // ...snip....
    result = ap_run_handler(r);
    if ( I am a forked worker ) {
        exit(0); // terminate at the earliest possible stage after request was processed
    }
    // ...snip....
}
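The "I am a forked worker" condition in the patch is left as pseudocode. One way such a check could be made, purely as a sketch and not part of the proposed patch, is to record the worker's pid once at startup and compare it with getpid() after the handler returns. All names below (remember_worker_pid(), is_forked_copy(), run_forking_handler()) are hypothetical and not Apache functions.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical: pid of the real worker, recorded once at startup. */
static pid_t worker_pid;

/* Call once in the worker before it starts handling requests. */
void remember_worker_pid(void)
{
    worker_pid = getpid();
}

/* Returns 1 if the current process is a copy created by a fork()
 * that happened after remember_worker_pid(), 0 otherwise. */
int is_forked_copy(void)
{
    assert(worker_pid != 0); /* remember_worker_pid() must run first */
    return getpid() != worker_pid;
}

/* Stand-in for a handler that forks behind the caller's back.
 * The child performs the check and reports the result through its
 * exit status (42 = detected as forked copy); the parent returns
 * that status, or -1 on error. */
int run_forking_handler(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(is_forked_copy() ? 42 : 0);
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The comparison only tells the two processes apart; it says nothing about whether exiting at that point is safe, which is the concern raised in points 1 to 3.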
This solution is a band aid to fix the current shortcoming in design. We
can get rid of this when (if ever) a full fledged solution is worked out. I feel it is useful to have this band-aid to stop the bleeding till we take it to the hospital.
Sorry, for sounding too sarcastic, simply could not resist: I am not quite sure if the patient will reach the hospital alive with this band aid ;-).
Regards
Rüdiger
