On Sep 18 2003 Stas Bekman wrote:

Apache::Test kills httpd with 'kill TERM $pid' once all the tests have
completed. Everything is fine unless one of the cleanup handlers is still
running. In that case apr_pool_clear segfaults with the following trace:


#0  allocator_free (allocator=0x885c110, node=0x0) at apr_pools.c:361
361             next = node->next;
(gdb) where
#0  allocator_free (allocator=0x885c110, node=0x0) at apr_pools.c:361
#1  0x402675f4 in apr_pool_clear (pool=0x8aa86a0) at apr_pools.c:738
#2  0x080c1549 in child_main (child_num_arg=142983440) at prefork.c:613
#3  0x080c1821 in make_child (s=0x811ca60, slot=0) at prefork.c:788
#4  0x080c189e in startup_children (number_to_start=1) at prefork.c:806
#5  0x080c1f0d in ap_mpm_run (_pconf=0x8117a78, plog=0x815db90, s=0x0)
    at prefork.c:1022
#6  0x080c712a in main (argc=7, argv=0xbffff3a4) at main.c:660
#7  0x4034dc57 in __libc_start_main () from /lib/i686/libc.so.6

First of all, it shouldn't segfault.

Second, shouldn't it wait for the cleanup handler to finish? What if the
cleanup handler is doing some critical job? I noticed this problem with
Apache::Test, but it doesn't go away when you run the normal server...

Joe has followed up on this via http://nagoya.apache.org/bugzilla/show_bug.cgi?id=23238

------- Additional Comments From [EMAIL PROTECTED]  2004-01-25 17:56 -------
The problem is simply that the SIGTERM handler invokes apr_pool_clear via
child_main_exit, and apr_pool_clear is not async-signal-safe by any means.  So
if the child was already doing some pool operation when the signal handler was
invoked, all bets are off.

(1.3 ignored alarms during pool operations to avoid this)

-----------------------------------

So how do we solve this problem? Running cleanups can be crucial for some web services, and not being able to guarantee that they run to completion is a serious problem.

Shouldn't apr_pool_clear ignore SIGTERM while it runs and restore the handler at the end? I suppose that would lose the signal if it arrives in the middle of the clear. So maybe use a handler that remembers that the signal was sent and re-raises it once the clear is done?
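
One way to get the "remember and re-deliver" behaviour without writing it by hand is to block the signal rather than ignore it: POSIX keeps a blocked signal pending and delivers it when the mask is restored. The following is only a minimal sketch of that idea in plain POSIX C, not what httpd or APR actually do. The names critical_section, run_with_sigterm_blocked and install_term_handler are hypothetical, and critical_section merely stands in for a non-reentrant operation such as a pool clear.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler. A volatile sig_atomic_t can be written safely
 * from a signal handler. */
static volatile sig_atomic_t term_seen = 0;

static void on_term(int signo)
{
    (void)signo;
    term_seen = 1;
}

/* Hypothetical stand-in for a non-reentrant operation (assumption:
 * something like a pool clear). It raises SIGTERM halfway through to
 * simulate the signal arriving in the middle of the operation. */
static int critical_steps_done = 0;

static void critical_section(void)
{
    critical_steps_done++;
    raise(SIGTERM);          /* stays pending while SIGTERM is blocked */
    critical_steps_done++;
}

/* Runs fn with SIGTERM blocked. Returns 1 if a SIGTERM arrived during
 * fn and was deferred, 0 otherwise. */
static int run_with_sigterm_blocked(void (*fn)(void))
{
    sigset_t block, saved, pending;
    int rc, deferred;

    sigemptyset(&block);
    sigaddset(&block, SIGTERM);

    rc = sigprocmask(SIG_BLOCK, &block, &saved);
    assert(rc == 0);

    fn();

    rc = sigpending(&pending);
    assert(rc == 0);
    deferred = sigismember(&pending, SIGTERM);

    /* Restoring the old mask delivers the pending SIGTERM, if any,
     * so the handler runs only after fn has finished. */
    rc = sigprocmask(SIG_SETMASK, &saved, NULL);
    assert(rc == 0);
    (void)rc;

    return deferred;
}

/* Installs on_term as the SIGTERM handler; returns 0 on success. */
static int install_term_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_term;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}
```

A caller would run install_term_handler() once and then wrap each non-reentrant region in run_with_sigterm_blocked(). In a threaded MPM pthread_sigmask would be needed instead of sigprocmask, and the two extra system calls per region have a cost, so this is a sketch of the technique rather than a drop-in fix.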

But it's more than that. What if apr_pool_clear hasn't even started when SIGTERM arrives? In that case we deterministically lose all cleanup functionality.
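
For the case where the signal arrives before any cleanup has started, a common pattern is to have the handler do nothing except set a flag, and to let the main loop notice the flag and run the cleanups from normal (non-signal) context. Here is a rough sketch of that pattern under stated assumptions: serve_until_shutdown, run_cleanups and install_shutdown_handler are hypothetical names, the loop body stands in for handling one request, and the raise() call only simulates an external 'kill TERM $pid'.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>

/* The handler only records the request; all real work happens later,
 * outside signal context. */
static volatile sig_atomic_t shutdown_requested = 0;

static void request_shutdown(int signo)
{
    (void)signo;
    shutdown_requested = 1;
}

/* Hypothetical cleanup hook; counts how many times it has run. */
static int cleanups_run = 0;

static void run_cleanups(void)
{
    cleanups_run++;
}

/* Handles up to max_requests "requests" and stops early once SIGTERM
 * has been seen. term_at simulates a SIGTERM arriving right after that
 * request number (0 means never). Cleanups always run before return.
 * Returns the number of requests handled. */
static int serve_until_shutdown(int max_requests, int term_at)
{
    int served = 0;

    assert(max_requests >= 0);

    while (!shutdown_requested && served < max_requests) {
        served++;                 /* stand-in for handling one request */
        if (served == term_at) {
            raise(SIGTERM);       /* simulated external kill */
        }
    }

    /* Normal context: non-reentrant calls are safe here. */
    run_cleanups();
    return served;
}

/* Installs request_shutdown for SIGTERM; returns 0 on success. */
static int install_shutdown_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = request_shutdown;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}
```

The trade-off is latency: the process exits only when the loop next checks the flag, so a request that blocks for a long time delays shutdown, and a real server would also need a way to interrupt a blocking accept or read.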

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


