https://issues.apache.org/bugzilla/show_bug.cgi?id=39079
Patrick Higgins <[email protected]> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEEDINFO |NEW --- Comment #29 from Patrick Higgins <[email protected]> 2009-10-06 16:38:42 PDT --- I'm also seeing a problem that I believe is the same as this one. I think this may be related to bug 44402, but I am not sure and have not been able to apply the patch yet because of fear of losing support from our enterprise distribution vendor. The later comments by Henrik K and Huo Mingyu about the problem persisting in 2.2.11 make me think it may be separate from 44402, but since they did not post additional information, I am not sure that they are really seeing this problem or perhaps a different one. I am running the Red Hat httpd-2.2.3-22.el5 version on two servers that jointly serve around 8 million hits per day. I see around 15 segfaults per day, and so far I've seen two processes enter and endless loop in the two weeks I have been using the worker MPM. I am using the following modules: LoadModule alias_module modules/mod_alias.so LoadModule authz_host_module modules/mod_authz_host.so LoadModule deflate_module modules/mod_deflate.so LoadModule expires_module modules/mod_expires.so LoadModule headers_module modules/mod_headers.so LoadModule jk_module modules/mod_jk.so LoadModule log_config_module modules/mod_log_config.so LoadModule mime_module modules/mod_mime.so LoadModule proxy_http_module modules/mod_proxy_http.so LoadModule proxy_module modules/mod_proxy.so LoadModule rewrite_module modules/mod_rewrite.so LoadModule security2_module modules/mod_security2.so LoadModule setenvif_module modules/mod_setenvif.so LoadModule ssl_module modules/mod_ssl.so LoadModule status_module modules/mod_status.so LoadModule unique_id_module modules/mod_unique_id.so I have not used the log_forensic_module to get better reporting on the segfaults, but the two hung processes (three looping threads total) have all been serving requests in a space that is reverse proxied through mod_proxy_http with rewrite rules. The unusual modules of note that I am loading are mod_security2 and mod_jk. Both of those were compiled in-house, and are versions 2.5.9 and 1.2.28, respectively. I have not captured any core files yet, but have taken stack traces from the two processes in the endless loop. I used "gstack <pid>" to get the traces. I took three with a pause of a few seconds between each. The complete stack traces follow (there really were only 3 and 4 threads total), along with a unified diff showing what changed between the three traces I took: Week 2 trace: Thread 3 (Thread 0x428fc940 (LWP 6559)): #0 0x00002ae823ee85f2 in select () from /lib64/libc.so.6 #1 0x00002ae8237f46d5 in apr_sleep () from /usr/lib64/libapr-1.so.0 #2 0x00002ae8278dcd6f in jk_watchdog_func () #3 0x00002ae823a02367 in start_thread () from /lib64/libpthread.so.0 #4 0x00002ae823eef0ad in clone () from /lib64/libc.so.6 Thread 2 (Thread 0x4fb11940 (LWP 6581)): #0 0x00002ae821f1da9f in ap_core_input_filter () from /proc/6558/exe #1 0x00002ae821f2d90a in ap_http_filter () from /proc/6558/exe #2 0x00002ae827f4300a in ?? () #3 0x00002ae827f44f16 in ?? () #4 0x00002ae827d31cb1 in proxy_run_scheme_handler () #5 0x00002ae827d357f3 in ?? () #6 0x00002ae821f1e38a in ap_run_handler () from /proc/6558/exe #7 0x00002ae821f21802 in ap_invoke_handler () from /proc/6558/exe #8 0x00002ae821f2bfc8 in ap_process_request () from /proc/6558/exe #9 0x00002ae821f29200 in ?? () from /proc/6558/exe #10 0x00002ae821f255f2 in ap_run_process_connection () from /proc/6558/exe #11 0x00002ae821f30a37 in ?? () from /proc/6558/exe #12 0x00002ae823a02367 in start_thread () from /lib64/libpthread.so.0 #13 0x00002ae823eef0ad in clone () from /lib64/libc.so.6 Thread 1 (Thread 0x2ae826528560 (LWP 6558)): #0 0x00002ae823a03655 in pthread_join () from /lib64/libpthread.so.0 #1 0x00002ae8237f3d85 in apr_thread_join () from /usr/lib64/libapr-1.so.0 #2 0x00002ae821f301b0 in ?? () from /proc/6558/exe #3 0x00002ae821f30d79 in ?? () from /proc/6558/exe #4 0x00002ae821f30ef5 in ?? () from /proc/6558/exe #5 0x00002ae821f3173b in ap_mpm_run () from /proc/6558/exe #6 0x00002ae821f0b7e0 in main () from /proc/6558/exe Week 2 diff: --- 6558.1 2009-10-06 14:37:51.000000000 -0600 +++ 6558.2 2009-10-06 14:37:51.000000000 -0600 @@ -5,7 +5,7 @@ #3 0x00002ae823a02367 in start_thread () from /lib64/libpthread.so.0 #4 0x00002ae823eef0ad in clone () from /lib64/libc.so.6 Thread 2 (Thread 0x4fb11940 (LWP 6581)): -#0 0x00002ae821f1da9f in ap_core_input_filter () from /proc/6558/exe +#0 0x00002ae821f1daab in ap_core_input_filter () from /proc/6558/exe #1 0x00002ae821f2d90a in ap_http_filter () from /proc/6558/exe #2 0x00002ae827f4300a in ?? () #3 0x00002ae827f44f16 in ?? () The third trace was identical to the second. Week 1 trace: Thread 4 (Thread 0x4142a940 (LWP 7213)): #0 0x00002b593d3ea5f2 in select () from /lib64/libc.so.6 #1 0x00002b593ccf66d5 in apr_sleep () from /usr/lib64/libapr-1.so.0 #2 0x00002b5940dded6f in jk_watchdog_func () #3 0x00002b593cf04367 in start_thread () from /lib64/libpthread.so.0 #4 0x00002b593d3f10ad in clone () from /lib64/libc.so.6 Thread 3 (Thread 0x4282c940 (LWP 7215)): #0 0x00002b593cceca80 in apr_allocator_free () from /usr/lib64/libapr-1.so.0 #1 0x00002b593bf29c57 in ?? () from /usr/lib64/libaprutil-1.so.0 #2 0x00002b593bf2a7c0 in apr_brigade_cleanup () #3 0x00002b593b41f307 in ap_core_output_filter () from /proc/7212/exe #4 0x00002b593b42df0d in ap_process_request () from /proc/7212/exe #5 0x00002b593b42b200 in ?? () from /proc/7212/exe #6 0x00002b593b4275f2 in ap_run_process_connection () from /proc/7212/exe #7 0x00002b593b432a37 in ?? () from /proc/7212/exe #8 0x00002b593cf04367 in start_thread () from /lib64/libpthread.so.0 #9 0x00002b593d3f10ad in clone () from /lib64/libc.so.6 Thread 2 (Thread 0x43c2e940 (LWP 7217)): #0 0x00002b593bf2a92e in apr_brigade_length () #1 0x00002b593b42f77e in ap_http_filter () from /proc/7212/exe #2 0x00002b594144500a in ?? () #3 0x00002b5941446f16 in ?? () #4 0x00002b5941233cb1 in proxy_run_scheme_handler () #5 0x00002b59412377f3 in ?? () #6 0x00002b593b42038a in ap_run_handler () from /proc/7212/exe #7 0x00002b593b423802 in ap_invoke_handler () from /proc/7212/exe #8 0x00002b593b42dfc8 in ap_process_request () from /proc/7212/exe #9 0x00002b593b42b200 in ?? () from /proc/7212/exe #10 0x00002b593b4275f2 in ap_run_process_connection () from /proc/7212/exe #11 0x00002b593b432a37 in ?? () from /proc/7212/exe #12 0x00002b593cf04367 in start_thread () from /lib64/libpthread.so.0 #13 0x00002b593d3f10ad in clone () from /lib64/libc.so.6 Thread 1 (Thread 0x2b593fa2a560 (LWP 7212)): #0 0x00002b593cf05655 in pthread_join () from /lib64/libpthread.so.0 #1 0x00002b593ccf5d85 in apr_thread_join () from /usr/lib64/libapr-1.so.0 #2 0x00002b593b4321b0 in ?? () from /proc/7212/exe #3 0x00002b593b432d79 in ?? () from /proc/7212/exe #4 0x00002b593b432ef5 in ?? () from /proc/7212/exe #5 0x00002b593b43373b in ap_mpm_run () from /proc/7212/exe #6 0x00002b593b40d7e0 in main () from /proc/7212/exe Week 2 diff 1: --- 7212.1 2009-09-24 14:22:42.000000000 -0600 +++ 7212.2 2009-09-24 14:25:13.000000000 -0600 @@ -5,7 +5,7 @@ #3 0x00002b593cf04367 in start_thread () from /lib64/libpthread.so.0 #4 0x00002b593d3f10ad in clone () from /lib64/libc.so.6 Thread 3 (Thread 0x4282c940 (LWP 7215)): -#0 0x00002b593cceca80 in apr_allocator_free () from /usr/lib64/libapr-1.so.0 +#0 0x00002b593cceca84 in apr_allocator_free () from /usr/lib64/libapr-1.so.0 #1 0x00002b593bf29c57 in ?? () from /usr/lib64/libaprutil-1.so.0 #2 0x00002b593bf2a7c0 in apr_brigade_cleanup () #3 0x00002b593b41f307 in ap_core_output_filter () from /proc/7212/exe @@ -16,7 +16,7 @@ #8 0x00002b593cf04367 in start_thread () from /lib64/libpthread.so.0 #9 0x00002b593d3f10ad in clone () from /lib64/libc.so.6 Thread 2 (Thread 0x43c2e940 (LWP 7217)): -#0 0x00002b593bf2a92e in apr_brigade_length () +#0 0x00002b593bf2a935 in apr_brigade_length () #1 0x00002b593b42f77e in ap_http_filter () from /proc/7212/exe #2 0x00002b594144500a in ?? () #3 0x00002b5941446f16 in ?? () Week 2 diff 2: --- 7212.1 2009-09-24 14:22:42.000000000 -0600 +++ 7212.3 2009-09-24 14:25:22.000000000 -0600 @@ -5,7 +5,7 @@ #3 0x00002b593cf04367 in start_thread () from /lib64/libpthread.so.0 #4 0x00002b593d3f10ad in clone () from /lib64/libc.so.6 Thread 3 (Thread 0x4282c940 (LWP 7215)): -#0 0x00002b593cceca80 in apr_allocator_free () from /usr/lib64/libapr-1.so.0 +#0 0x00002b593cceca8c in apr_allocator_free () from /usr/lib64/libapr-1.so.0 #1 0x00002b593bf29c57 in ?? () from /usr/lib64/libaprutil-1.so.0 #2 0x00002b593bf2a7c0 in apr_brigade_cleanup () #3 0x00002b593b41f307 in ap_core_output_filter () from /proc/7212/exe @@ -16,7 +16,7 @@ #8 0x00002b593cf04367 in start_thread () from /lib64/libpthread.so.0 #9 0x00002b593d3f10ad in clone () from /lib64/libc.so.6 Thread 2 (Thread 0x43c2e940 (LWP 7217)): -#0 0x00002b593bf2a92e in apr_brigade_length () +#0 0x00002b593bf2a924 in apr_brigade_length () #1 0x00002b593b42f77e in ap_http_filter () from /proc/7212/exe #2 0x00002b594144500a in ?? () #3 0x00002b5941446f16 in ?? () >From a quick look at the code, it looks like the bucket brigade functions and/or macros are looping endlessly. I don't understand that data structure so I don't know how this might happen. I can try to inspect the data structure the next time we get a process in an endless loop to see what it looks like. I was hoping that someone might already know what state the data structure would need to be in to cause an endless loop, and solving the problem would be a matter of determining how it's getting into that state. I can gather more information as needed, but making changes to these servers can be slow and requires fighting bureaucracy. It unlikely that I would be able to test anything other than small patches to the Red Hat httpd. -- Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
