prefork mpm crashes on SIGINT [possible patch?]

2013-05-07 Thread Jan Kaluža

Hi,

to reproduce it, I start httpd-2.4.4 like this:

/usr/sbin/httpd -k start -DFOREGROUND

and in another terminal I run ab like this:

ab -n 15 -c 4 http://127.0.0.1/index.php

Index.php is just dummy script with phpinfo.

When ab is benchmarking, I kill httpd using ctrl+c. In 4/10 tries it 
crashes with following backtrace [1].
If gdb does not lie, there is race condition in signal handling. I'm not 
sure if anything changed in this code between 2.2 and 2.4, but it looks 
like 2.2 was handling this situation somehow better. I think this 
situation can lead to various problems like deadlocks.


I've been able to workaround this using the attached patch, but I don't 
think this is the best way how to fix this problem. It would be better 
to find out why there are more signals called in a row as a result of 
single SIGINT signal sent to main process.


I will try to do that myself, but I was thinking there could be people 
more familiar with this part of code and with httpd's children lifecycle 
reading this list.


[1]

#0  0x7fe74b7d2ba5 in raise () from /lib64/libc.so.6
#1  0x7fe74b7d4358 in abort () from /lib64/libc.so.6
#2  0x7fe74b81259b in __libc_message () from /lib64/libc.so.6
#3  0x7fe74b819a8e in _int_free () from /lib64/libc.so.6
#4  0x7fe74bf8e85c in allocator_free (node=0x7fe74bb4e798 
main_arena+88, allocator=0x7fe74e7ab670) at memory/unix/apr_pools.c:430

#5  apr_pool_destroy (pool=0x7fe74e7ab768) at memory/unix/apr_pools.c:856
#6  0x7fe74308c1be in clean_child_exit (code=code@entry=0) at 
prefork.c:218

#7  0x7fe74308c9cb in just_die (sig=optimized out) at prefork.c:344
#8  signal handler called
#9  0x7fe74d06d541 in _dl_fini () from /lib64/ld-linux-x86-64.so.2
#10 0x7fe74b7d5df1 in __run_exit_handlers () from /lib64/libc.so.6
#11 0x7fe74b7d5e75 in exit () from /lib64/libc.so.6
#12 0x7fe74308c1db in clean_child_exit (code=code@entry=0) at 
prefork.c:227
#13 0x7fe74308c5f5 in child_main 
(child_num_arg=child_num_arg@entry=2) at prefork.c:725
#14 0x7fe74308c94c in make_child (s=0x7fe74e5a5368, 
slot=slot@entry=2) at prefork.c:800
#15 0x7fe74308c9a6 in startup_children (number_to_start=3) at 
prefork.c:818
#16 0x7fe74308d856 in prefork_run (_pconf=optimized out, 
plog=0x7fe74e5d74f8, s=0x7fe74e5a5368) at prefork.c:976

#17 0x7fe74d2a5d4e in ap_run_mpm ()
#18 0x7fe74d29f3fa in main ()

Thanks for help,
Jan Kaluza
diff --git a/server/mpm/prefork/prefork.c b/server/mpm/prefork/prefork.c
index ae0fd37..cc7632d 100644
--- a/server/mpm/prefork/prefork.c
+++ b/server/mpm/prefork/prefork.c
@@ -212,6 +212,10 @@ static void prefork_note_child_started(int slot, pid_t pid)
 static void clean_child_exit(int code) __attribute__ ((noreturn));
 static void clean_child_exit(int code)
 {
+if (mpm_state == AP_MPMQ_STOPPING) {
+exit(code);
+}
+
 mpm_state = AP_MPMQ_STOPPING;
 
 if (pchild) {
@@ -341,6 +345,8 @@ static const char *prefork_get_name(void)
 
 static void just_die(int sig)
 {
+apr_signal(SIGHUP, 0);
+apr_signal(SIGTERM, 0);
 clean_child_exit(0);
 }
 


Re: mod_proxy seg faulting ?

2013-05-07 Thread Thomas Eckert
 However, looking at your patch, having to lock the mutex for
 ap_proxy_get_worker() looks wrong. I think it should be passed r-pool
 instead of conf-pool.

I checked how ap_proxy_get_worker() is used in other places and also what
is done with the pool inside and you are right. It really shouldn't be in a
persistent
pool. Corrected patch is attached.

 Hmm. I thought there was some introductory documentation, but I didn't
 find much. There is a short note at the end of
 http://apr.apache.org/docs/apr/1.4/apr__pools_8h.html . Basically, only
 the functions in http://apr.apache.org/docs/apr/1.4/group__apr__pools.html
 that are explicitly marked as thread safe may safely operate on a pool
 from different threads without locking.

Which is basically saying only the create-sub-pool functions ;-)


fix_mod_proxy_thread_unsafety.patch
Description: Binary data


Improve mod_proxy's error marking of workers

2013-05-07 Thread Thomas Eckert
Attached patch contains a directive to improve the error marking of
workers. Basically, some errors will cause a worker to be marked as in
error while others don't. I can't see a reason for this so I added a
directive to have all errors mark the error correctly - especially useful
for automated systems using mod_proxy to check if the system is healthy.


improve_worker_error_marking.patch
Description: Binary data


Thinking about adding a link to modules.a.o on our web site.

2013-05-07 Thread Daniel Gruno
Hi all,
I did some talking with Jim and Rich (or was it Rainer, I forget) during
ApacheCon, in which we agreed that we need to plug our modules directory
some more. I totally forgot all about this, but since it's never too
late to get something like this done, I am now contemplating adding a
link from our web site, httpd.apache.org, to modules.apache.org in the
navigation bar on the left side.

I don't see this as a big issue that needs a vote, so I'll let it simmer
for a while (standard 72 hours) and assume lazy consensus if I hear no
objections from you lot.

With regards,
Daniel.


Re: mod_proxy seg faulting ?

2013-05-07 Thread Jim Jagielski
+1.

PS: The ap_proxy_define_balancer() stuff is there for future enhancements
(creating balancers on the fly, instead of just workers).

On May 7, 2013, at 4:07 AM, Thomas Eckert thomas.r.w.eck...@gmail.com wrote:

  However, looking at your patch, having to lock the mutex for
  ap_proxy_get_worker() looks wrong. I think it should be passed r-pool
  instead of conf-pool.
 
 I checked how ap_proxy_get_worker() is used in other places and also what
 is done with the pool inside and you are right. It really shouldn't be in a 
 persistent
 pool. Corrected patch is attached.
 
  Hmm. I thought there was some introductory documentation, but I didn't
  find much. There is a short note at the end of
  http://apr.apache.org/docs/apr/1.4/apr__pools_8h.html . Basically, only
  the functions in http://apr.apache.org/docs/apr/1.4/group__apr__pools.html
  that are explicitly marked as thread safe may safely operate on a pool
  from different threads without locking.
 
 Which is basically saying only the create-sub-pool functions ;-)
 fix_mod_proxy_thread_unsafety.patch



Re: Improve mod_proxy's error marking of workers

2013-05-07 Thread Eric Covener
On Tue, May 7, 2013 at 6:24 AM, Thomas Eckert
thomas.r.w.eck...@gmail.com wrote:
 Attached patch contains a directive to improve the error marking of workers.
 Basically, some errors will cause a worker to be marked as in error while
 others don't. I can't see a reason for this so I added a directive to have
 all errors mark the error correctly - especially useful for automated
 systems using mod_proxy to check if the system is healthy.

I think you need to screen out 4xx at least to prevent client errors
from marking down your backends.


Re: Thinking about adding a link to modules.a.o on our web site.

2013-05-07 Thread Rich Bowen
Links don't imply endorsement (although maybe we need to say that explicitly), 
so I think that links are a very good thing when they help real people solve 
real problems.

So, +1

On May 7, 2013, at 6:41 AM, Daniel Gruno wrote:

 Hi all,
 I did some talking with Jim and Rich (or was it Rainer, I forget) during
 ApacheCon, in which we agreed that we need to plug our modules directory
 some more. I totally forgot all about this, but since it's never too
 late to get something like this done, I am now contemplating adding a
 link from our web site, httpd.apache.org, to modules.apache.org in the
 navigation bar on the left side.
 
 I don't see this as a big issue that needs a vote, so I'll let it simmer
 for a while (standard 72 hours) and assume lazy consensus if I hear no
 objections from you lot.
 
 With regards,
 Daniel.

-- 
Rich Bowen
rbo...@rcbowen.com
Shosholoza




Re: Thinking about adding a link to modules.a.o on our web site.

2013-05-07 Thread Gregg Smith

On 5/7/2013 3:41 AM, Daniel Gruno wrote:

Hi all,
I did some talking with Jim and Rich (or was it Rainer, I forget) during
ApacheCon, in which we agreed that we need to plug our modules directory
some more. I totally forgot all about this, but since it's never too
late to get something like this done, I am now contemplating adding a
link from our web site, httpd.apache.org, to modules.apache.org in the
navigation bar on the left side.
Isn't this already there. I see Modules in the Subprojects menu that 
leads to a page where modules.a.o is first thing mentioned.


Gregg



Re: Improve mod_proxy's error marking of workers

2013-05-07 Thread Jim Jagielski
Agreed... An all or nothing setting will likely create more
trouble than not.

On May 7, 2013, at 8:08 AM, Eric Covener cove...@gmail.com wrote:

 On Tue, May 7, 2013 at 6:24 AM, Thomas Eckert
 thomas.r.w.eck...@gmail.com wrote:
 Attached patch contains a directive to improve the error marking of workers.
 Basically, some errors will cause a worker to be marked as in error while
 others don't. I can't see a reason for this so I added a directive to have
 all errors mark the error correctly - especially useful for automated
 systems using mod_proxy to check if the system is healthy.
 
 I think you need to screen out 4xx at least to prevent client errors
 from marking down your backends.
 



Re: Improve mod_proxy's error marking of workers

2013-05-07 Thread Daniel Ruggeri
On 5/7/2013 2:00 PM, Jim Jagielski wrote:
 Agreed... An all or nothing setting will likely create more
 trouble than not.

 On May 7, 2013, at 8:08 AM, Eric Covener cove...@gmail.com wrote:

 On Tue, May 7, 2013 at 6:24 AM, Thomas Eckert
 thomas.r.w.eck...@gmail.com wrote:
 Attached patch contains a directive to improve the error marking of workers.
 Basically, some errors will cause a worker to be marked as in error while
 others don't. I can't see a reason for this so I added a directive to have
 all errors mark the error correctly - especially useful for automated
 systems using mod_proxy to check if the system is healthy.
 I think you need to screen out 4xx at least to prevent client errors
 from marking down your backends.


There is also failonstatus which, granted, could probably be enhanced to
accept ranges instead of just a list to accommodate this need.

--
Daniel Ruggeri