Hi Willy,
I can reproduce this on every request that contains a header that matches the
regex of the rspidel rule.
This is the output of the debug logging from the patch you sent:
0: cur_next=+17 used=16 buf.p=0x7fd1c18eb414 buf.size=16384 buf.p=+0 buf.o=0 buf.i=7450
1: old_idx=0 cur_idx=1 ptr=0x7fd1c18eb425 end=+27 next=+29 buf_end=+7433
1: old_idx=1 cur_idx=3 ptr=0x7fd1c18eb442 end=+38 next=+40 buf_end=+7404
1: old_idx=3 cur_idx=4 ptr=0x7fd1c18eb46a end=+55 next=+57 buf_end=+7364
2: old_idx=3 cur_idx=4 next_idx=5 used=16 delta=-57 ptr=0x7fd1c18eb46a next=+0 buf_end=+7307
1: old_idx=3 cur_idx=5 ptr=0x7fd1c18eb46a end=+27 next=+29 buf_end=+7307
1: old_idx=5 cur_idx=6 ptr=0x7fd1c18eb487 end=+23 next=+25 buf_end=+7278
1: old_idx=6 cur_idx=7 ptr=0x7fd1c18eb4a0 end=+23 next=+25 buf_end=+7253
1: old_idx=7 cur_idx=8 ptr=0x7fd1c18eb4b9 end=+16 next=+18 buf_end=+7228
1: old_idx=8 cur_idx=9 ptr=0x7fd1c18eb4cb end=+39 next=+41 buf_end=+7210
1: old_idx=9 cur_idx=10 ptr=0x7fd1c18eb4f4 end=+136 next=+138 buf_end=+7169
1: old_idx=10 cur_idx=11 ptr=0x7fd1c18eb57e end=+57 next=+59 buf_end=+7031
1: old_idx=11 cur_idx=12 ptr=0x7fd1c18eb5b9 end=+119 next=+121 buf_end=+6972
1: old_idx=12 cur_idx=13 ptr=0x7fd1c18eb632 end=+28 next=+30 buf_end=+6851
1: old_idx=13 cur_idx=15 ptr=0x7fd1c18eb650 end=+24 next=+26 buf_end=+6821
1: old_idx=15 cur_idx=16 ptr=0x7fd1c18eb66a end=+26 next=+28 buf_end=+6795
1: old_idx=16 cur_idx=17 ptr=0x7fd1c18eb686 end=+22 next=+24 buf_end=+6767
And this is the output of "bt full" from gdb against a core dump from the
patched build:
#0 0x000000010cbdef40 in conn_free (conn=0x7fd1c1451d80) at connection.h:520
520 pool_free2(pool2_connection, conn);
(gdb) bt full
#0 0x000000010cbdef40 in conn_free (conn=0x7fd1c1451d80) at connection.h:520
No locals.
#1 0x000000010cbcf132 in si_release_endpoint (si=0x7fd1c1451b08) at
stream_interface.h:126
conn = (struct connection *) 0x7fd1c1451d80
appctx = (struct appctx *) 0x10cbc1113
#2 0x000000010cbceb02 in http_end_txn_clean_session (s=0x7fd1c1451880) at
src/proto_http.c:4377
No locals.
#3 0x000000010cbcfe48 in http_resync_states (s=0x7fd1c1451880) at
src/proto_http.c:4766
txn = (struct http_txn *) 0x7fd1c14518c8
old_req_state = 33
old_res_state = 33
#4 0x000000010cbd611e in http_response_forward_body (s=0x7fd1c1451880,
res=0x7fd1c144fce0, an_bit=1048576) at src/proto_http.c:6082
tmpbuf = (struct buffer *) 0x7fd1c18eb400
txn = (struct http_txn *) 0x7fd1c14518c8
msg = (struct http_msg *) 0x7fd1c14518d8
bytes = 32721
compressing = 0
consumed_data = 6741
ret = 33
#5 0x000000010cc0c30e in process_session (t=0x7fd1c144f940) at
src/session.c:2012
max_loops = 199
ana_list = 1048576
ana_back = 1048576
flags = 2
srv = (struct server *) 0x7fd1c1864200
s = (struct session *) 0x7fd1c1451880
rqf_last = 8421376
rpf_last = 0
rq_prod_last = 7
rq_cons_last = 7
rp_cons_last = 7
rp_prod_last = 7
req_ana_back = 8192
#6 0x000000010cb85403 in process_runnable_tasks (next=0x7fff5308f82c) at
src/task.c:238
t = (struct task *) 0x7fd1c144f940
eb = (struct eb32_node *) 0x0
max_processed = 0
expire = 1303700686
#7 0x000000010cb7483e in run_poll_loop () at src/haproxy.c:1278
next = 1303700686
#8 0x000000010cb754c1 in main (argc=6, argv=0x7fff5308fa50) at
src/haproxy.c:1609
err = 0
retry = 200
limit = {
rlim_cur = 2062,
rlim_max = 2062
}
errmsg =
"\000?\bS?\000\000?\023wl?\000\0008?\bS?\000\0008?\bS?\000\000\000\000?\f\001\000\000\000\r\000\000\000\f\000\000\000@\awl?\000\000\000\000?\f\000\000\000\000?\005wl?\000\000\025\000\000\000?\000\000
?\bS?", '\0' <repeats 13 times>
pidfd = -1
Current language: auto; currently minimal
Here is the config I'm using, once again:
global
daemon
quiet
maxconn 1024
pidfile haproxy.pid
log 127.0.0.1 local0
log 127.0.0.1 local1 notice
defaults
log global
balance roundrobin
mode http
http-check send-state
retries 3
timeout connect 6000
timeout client 1020000
timeout server 1020000
timeout http-request 6000
option abortonclose
option forwardfor except 127.0.0.1
option http-pretend-keepalive
option http-server-close
option httplog
option log-health-checks
option log-separate-errors
option redispatch
option tcpka
errorfile 200 errorfiles/200.http
errorfile 400 errorfiles/400.http
errorfile 403 errorfiles/403.http
errorfile 408 errorfiles/408.http
errorfile 500 errorfiles/500.http
errorfile 502 errorfiles/502.http
errorfile 503 errorfiles/503.http
listen stats :7000
mode http
stats uri /
frontend external
bind :80
maxconn 1024
rspidel ^X-Frame-Options:.*
default_backend migw
backend migw
option httpchk GET /online
server migw :8081 check port 48080
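For reference, the workaround Kevin describes in the quoted message below
(renaming the header with rsprep instead of deleting it with rspidel) could be
sketched against the frontend above roughly as follows. This is an untested
sketch, and the replacement name X-Disabled-Frame-Options is only a
placeholder chosen for illustration:

```
frontend external
    bind :80
    maxconn 1024
    # Original rule that matches on every crashing request:
    #   rspidel ^X-Frame-Options:.*
    # Hypothetical replacement: rename the header rather than delete it.
    rsprep ^X-Frame-Options:(.*) X-Disabled-Frame-Options:\1
    default_backend migw
```

Note that with this approach the header still reaches the client under the
new name, so it only helps if an unrecognised header is acceptable there.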
On 1 Jan 2014, at 09:28, Willy Tarreau wrote:
> Hi guys,
>
> On Tue, Dec 31, 2013 at 10:17:30AM -0600, Kevin wrote:
>> I think that is the same bug I ran into.
>>
>> I was unable to get the debug tools working sufficiently to track it down
>> myself, and Willy was unable to use my debug file to find it either.
>>
>> He did point me toward a clever workaround of changing the header instead of
>> deleting it and that seems to work. In my case changing Content-Length to
>> Xontent-Length.
>>
>>>> rsprep ^Content-Length:(.*) Xontent-Length:\1 if is_304
>>
>>
>> The other thing that worked for me was using the built in regular
>> expressions library instead of PCRE.
>
> I'm really starting to think this could be a PCRE bug on OSX. The
> only two reports of this crash are users of PCRE on OSX, and looking
> at the code, I can't imagine any reason for this to happen. The same
> code is executed for header renaming and removal. The only difference
> is that when the header is removed, regexec() is called again with a
> pointer to the same location (but different content). So maybe there
> is an improperly initialized pointer somewhere in PCRE which randomly
> makes it fail when multiple contents are passed in turn ? I don't really
> know.
>
> Another strange point is that if a memory corruption happened because
> of the header removal (eg: wrong length calculation), it should be
> independent of the regex lib used (since it does not use anything
> returned by regexec). And the fact that changing to the builtin regex
> fixes the issue tends to fuel the theory of a PCRE bug.
>
> If you can easily reproduce it without too much traffic, you can
> apply the attached patch and report the output.
>
> Thanks,
> Willy
>
> <debug-rspidel.diff>