Hello,

I've been trying to track down a couple of stability problems in 2.0's
mod_proxy.  Both problems have to do with a misbehaving downstream server.
I have seen these issues in the public 2.0.28 release as well as the
latest CVS extract.  I've traced them as far as I can, but now I need
some help.

The configuration:

    The build is on Red Hat Linux 7.2 with a 2.4.2 kernel and the default
    prefork MPM.  Proxying is triggered via a rewrite rule.


1) Runaway process.

When the downstream server kills the connection to apache without sending
any output, apache goes into a runaway loop that consumes all CPU.  I've
been able to narrow the loop down to ap_proxy_string_read in proxy_util.c:

    /* loop through each brigade */
    while (!found) {

        /* get brigade from network one line at a time */
        if (APR_SUCCESS != (rv = ap_get_brigade(c->input_filters, bb,
                                                AP_MODE_BLOCKING,
                                                &readbytes))) {
            return rv;
        }

        /* loop through each bucket */
        while (!found && !APR_BRIGADE_EMPTY(bb)) {
            e = APR_BRIGADE_FIRST(bb);
            .... more stuff ...
        }
    }

This loop never ends: apache never sends out anything, nor does it
log any kind of error.  The inner while condition is never true, so found
is never set (found only gets set when an LF is seen, and that only
happens in the inner while loop).

The gist of this piece of code seems to be to look for an LF-terminated
string for as long as it can pull data.  The problem is that it never
gets any data at all.  Since the socket was shut down by the other side
without any data ever being sent, it seems that either ap_get_brigade should
fail or a check of APR_BRIGADE_EMPTY should break the outer loop.  Neither
seems to be happening, though, and my understanding of filters and brigades
isn't up to the task of figuring this out.
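
To make the behavior I would expect concrete, here is a minimal standalone
sketch of the loop with a guard for "the read succeeded but returned nothing".
It deliberately does not use the real filter or brigade API.  The names
mock_conn, mock_get_chunk and read_line_guarded are hypothetical and made up
for illustration, and the assumption that a read at end of stream reports
success with zero bytes is only my guess at what is happening in the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a connection: a byte buffer plus a cursor. */
typedef struct {
    const char *data;   /* bytes the peer sent before closing */
    size_t len;         /* total number of bytes available */
    size_t pos;         /* index of the next unread byte */
} mock_conn;

/* Hypothetical stand-in for the brigade read.  Copies up to cap bytes
 * into chunk and stores the count in *got.  ASSUMPTION: like the behavior
 * described above, it returns success (0) at end of stream, with *got == 0. */
static int mock_get_chunk(mock_conn *c, char *chunk, size_t cap, size_t *got)
{
    size_t n = c->len - c->pos;
    if (n > cap) {
        n = cap;
    }
    memcpy(chunk, c->data + c->pos, n);
    c->pos += n;
    *got = n;
    return 0;
}

/* Reads one LF-terminated line into buf (NUL-terminated, LF dropped,
 * silently truncated to bufsize - 1 bytes).  Returns 0 when a complete
 * line was found and -1 when the stream ended before an LF was seen. */
static int read_line_guarded(mock_conn *c, char *buf, size_t bufsize)
{
    size_t used = 0;
    int found = 0;

    assert(c != NULL && buf != NULL && bufsize > 0);

    while (!found) {
        char chunk[8];
        size_t got = 0;
        size_t i;
        int rv = mock_get_chunk(c, chunk, sizeof chunk, &got);

        if (rv != 0) {
            return rv;
        }

        /* The guard: a successful read that yields no data means the
         * peer is gone, so give up instead of spinning forever. */
        if (got == 0) {
            buf[used] = '\0';
            return -1;
        }

        /* Scan the chunk for the LF, copying everything before it. */
        for (i = 0; i < got && !found; i++) {
            if (chunk[i] == '\n') {
                found = 1;
            } else if (used + 1 < bufsize) {
                buf[used++] = chunk[i];
            }
        }

        /* Give back any bytes after the LF so the next call sees them. */
        c->pos -= got - i;
    }

    buf[used] = '\0';
    return 0;
}
```

With a mock_conn that holds no data at all, read_line_guarded returns -1 on the
first pass instead of looping.  Whether the right fix in the real code is a
check like this in the caller, or an error return from the read itself, is
exactly what I can't judge.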

Can anyone point me in the right direction here?


2) Segfault on bogus header data.

If the downstream server sends a malformed header line (LF-terminated or not),
apache will serve a 500 error and log an error.  It will also segfault and
dump core.

Here is the stacktrace:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1024 (LWP 6003)]
0x4031ed32 in __libc_free (mem=0xbfffb300) at malloc.c:3043
3043    malloc.c: No such file or directory.
    in malloc.c
(gdb) where
#0  0x4031ed32 in __libc_free (mem=0xbfffb300) at malloc.c:3043
#1  0x4001c59f in heap_destroy (data=0x81b23f0) at apr_buckets_heap.c:76
#2  0x4001cec3 in apr_brigade_cleanup (data=0x81a8f00) at apr_brigade.c:86
#3  0x4001cf0c in apr_brigade_destroy (b=0x81a8f00) at apr_brigade.c:97
#4  0x080b6a5a in core_output_filter (f=0x81a7e68, b=0x81a8f00) at core.c:3346
#5  0x080af8a8 in ap_pass_brigade (next=0x81a7e68, bb=0x81a7ec8) at util_filter.c:388
#6  0x0808b5a2 in ap_http_header_filter (f=0x81adc88, b=0x81a7ec8) at http_protocol.c:1241
#7  0x080af8a8 in ap_pass_brigade (next=0x81adc88, bb=0x81a7ec8) at util_filter.c:388
#8  0x080b16e5 in ap_content_length_filter (f=0x81adc70, b=0x81a7ec8) at protocol.c:966
#9  0x080af8a8 in ap_pass_brigade (next=0x81adc70, bb=0x81a7ec8) at util_filter.c:388
#10 0x0808cf17 in ap_byterange_filter (f=0x81adc58, bb=0x81a7ec8) at http_protocol.c:2371
#11 0x080af8a8 in ap_pass_brigade (next=0x81adc58, bb=0x81a7ec8) at util_filter.c:388
#12 0x08078112 in ap_proxy_http_process_response (p=0x81a7af8, r=0x81ac370, p_conn=0x81a7f18, origin=0x81a80d8, backend=0x81a7f30, conf=0x819c368, bb=0x81a7ec8, server_portstr=0xbfffd360 "") at proxy_http.c:858
#13 0x0807849b in ap_proxy_http_handler (r=0x81ac370, conf=0x819c368, url=0x81a7ff0 "/", proxyname=0x0, proxyport=0) at proxy_http.c:999
#14 0x08070ee4 in proxy_run_scheme_handler (r=0x81ac370, conf=0x819c368, url=0x81adc3e "http://127.0.0.1:6666/", proxyhost=0x0, proxyport=0) at mod_proxy.c:985
#15 0x0807011f in proxy_handler (r=0x81ac370) at mod_proxy.c:456
#16 0x080a4a78 in ap_run_handler (r=0x81ac370) at config.c:185
#17 0x080a4fca in ap_invoke_handler (r=0x81ac370) at config.c:360
#18 0x0808d9e2 in ap_process_request (r=0x81ac370) at http_request.c:292
#19 0x08089b19 in ap_process_http_connection (c=0x81a7c10) at http_core.c:280
#20 0x080ade9c in ap_run_process_connection (c=0x81a7c10) at connection.c:84
#21 0x080ae165 in ap_process_connection (c=0x81a7c10) at connection.c:229
#22 0x080a3707 in child_main (child_num_arg=0) at prefork.c:706
#23 0x080a37b9 in make_child (s=0x81a13b0, slot=0) at prefork.c:742
#24 0x080a38c7 in startup_children (number_to_start=5) at prefork.c:819
#25 0x080a3c2b in ap_mpm_run (_pconf=0x80eb848, plog=0x812b948, s=0x81a13b0) at prefork.c:1018
#26 0x080a8edd in main (argc=2, argv=0xbffff73c) at main.c:461
#27 0x402bb177 in __libc_start_main (main=0x80a8898 <main>, argc=2, ubp_av=0xbffff73c, init=0x806309c <_init>, fini=0x80c08a0 <_fini>, rtld_fini=0x4000e184 <_dl_fini>, stack_end=0xbffff72c) at ../sysdeps/generic/libc-start.c:129


Any pointers on chasing these down would be much appreciated!

thanks,

-adam

-- 

        "I believe in Kadath in the cold waste, and Ultima Thule. But you
         cannot prove to me that Harvard Law School actually exists."
                        - Theodora Goss

        "I'm not like that, I have a cat, I don't need you.. My cat, and
         about 18 lines of bourne shell code replace you in life."
                        - anonymous


Adam Sussman    