Hi Willy,

Op 12-10-2015 om 7:28 schreef Willy Tarreau:
Hi Pieter,

On Mon, Oct 12, 2015 at 01:22:48AM +0200, PiBa-NL wrote:
#1  0x0000000000417388 in buffer_slow_realign (buf=0x7d3c90) at
src/buffer.c:166
           block1 = -3306
           block2 = 0
I'm puzzled by this above, no block should have a negative size.

#2  0x0000000000480c42 in http_wait_for_request (s=0x80247d600,
req=0x80247d610, an_bit=4)
       at src/proto_http.c:2686
           cur_idx = -6336
           sess = (struct session *) 0x80241e400
           txn = (struct http_txn *) 0x802bb2140
           msg = (struct http_msg *) 0x802bb21a0
           ctx = {line = 0x2711079 <Address 0x2711079 out of
bounds>, idx = 3, val = 0, vlen = 7, tws = 0, del = 33, prev = 0}
And this above, similarly cur_idx shouldn't be negative.

Seems that buffer_slow_realign() isn't used regularly during normal
haproxy operation, and it crashes first time that specific function
gets called.
Reproduction is pretty consistent with chrome browser refreshing stats
every second.
Then starting: wrk -c 200 -t 2 -d 10 http://127.0.0.1:801/
I tried adding some Alert(); items in the code to see what parameters
are set at what step, but am not understanding the exact internals of
that code..

This negative bh=-7800 is not supposed to be there i think? It is from
one of the dprintf statements, how are those supposed generate output?..
[891069718] http_wait_for_request: stream=0x80247d600 b=0x80247d610,
exp(r,w)=0,0 bf=00c08200 bh=-7800 analysers=34

Anything else i can check or provide to help get this fixed?

Best regards,
PiBa-NL
Just a little 'bump' to this issue..

Anyone know when/how this buffer_slow_realign() is suppose to work?
Yes, it's supposed to be used only when a request or response is wrapped
in the request or response buffer. It uses memcpy(), hence the "slow"
aspect of the realign.

I suspect it either contains a bug, or is called with bogus parameters..
It's very sensitive to the consistency of the buffer being realigned. So
errors such as buf->i + buf->o > buf->size, or buf->p > buf->data + buf->size,
or buf->p < buf->data etc... can lead to crashes. But these must never happen
at all otherwise it proves that there's a bug somewhere else.

Here since block1 is -3306 and block2 = 0, I suspect that they were assigned
at line 159 from buf->i, which definitely means that the buffer was already
corrupted.

How can we/i determine which it is?
The difficulty consists in finding what can lead to a corrupted buffer :-/
In the past we had such issues when trying to forward more data than was
available in the buffer, due to option send-name-header. I wouldn't be
surprized that it can happen here on corner cases when building a message
from Lua if the various message pointers are not all perfectly correct.

Even though with a small change in the config (adding a backend) i cant
reproduce it, that doesnt mean there isn't a problem with the fuction..
As the whole function doesn't seem to get called in that circumstance..
It could be related to an uninitialized variable somewhere as well. You
can try to start haproxy with "-dM" to see if it makes the issues 100%
-dM doesnt seem to make much difference if any i this case..
reproducible or not. This poisons all buffers (fills them with a constant
byte 0x50 after malloc) so that we don't rely on an uninitialized zero byte
somewhere.

Regards,
Willy

Been running some more tests with the information that req->buf->i should be >= 0.

What i find is that after 1 request i already see rqh=-103 , it seems like the initial request size which in this case is also is 103 bytes is subtracted twice? It does not immediately crash, but if this is already a sign of 'corruption' then the cause should be a little more easy to find..

@Willy can you confirm this indicates the problem could be in progress of heading to a crash? Even though in the last line it restores to 0..

See its full output below replaced the DPRINTF already in code of stream.c with Alert..

Thanks in advance,
PiBa-NL

root@OPNsense:/usr/ports/net/haproxy-devel # haproxy -f /var/haproxy.config -d
[ALERT] 277/063055 (61489) : SSLv3 support requested but unavailable.
Available polling systems :
     kqueue : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result FAILED
Total: 3 (2 usable), will use kqueue.
Using kqueue() as the polling mechanism.
00000000:http_frt.accept(0006)=0008 from [127.0.0.1:62358]
[ALERT] 277/063058 (61489) : [910446771] process_stream:1655: task=0x80244b9b0 s=0x80247d600, sfl=0x00000080, rq=0x80247d610, rp=0x80247d650, exp(r,w)=0,0 rqf=00d08000 rpf=80000000 rqh=0 rqt=0 rph=0 rpt=0 cs=7 ss=0, cet=0x0 set=0x0 retr=0 [ALERT] 277/063058 (61489) : [910446771] http_wait_for_request: stream=0x80247d600 b=0x80247d610, exp(r,w)=0,0 bf=00d08000 bh=0 analysers=34 [ALERT] 277/063058 (61489) : [910446772] process_stream:1655: task=0x80244b9b0 s=0x80247d600, sfl=0x00000080, rq=0x80247d610, rp=0x80247d650, exp(r,w)=0,0 rqf=00000002 rpf=80000000 rqh=103 rqt=0 rph=0 rpt=0 cs=7 ss=0, cet=0x0 set=0x0 retr=0 [ALERT] 277/063058 (61489) : [910446772] http_wait_for_request: stream=0x80247d600 b=0x80247d610, exp(r,w)=0,0 bf=00808002 bh=103 analysers=34
00000000:http_frt.clireq[0008:ffffffff]: GET / HTTP/1.1
00000000:http_frt.clihdr[0008:ffffffff]: Host: 127.0.0.1:801
00000000:http_frt.clihdr[0008:ffffffff]: Accept: */*
00000000:http_frt.clihdr[0008:ffffffff]: User-Agent: fetch libfetch/2.0
00000000:http_frt.clihdr[0008:ffffffff]: Connection: close
[ALERT] 277/063058 (61489) : [910446772] process_switching_rules: stream=0x80247d600 b=0x80247d610, exp(r,w)=0,0 bf=00808002 bh=103 analysers=00 [ALERT] 277/063058 (61489) : [910446772] sess_prepare_conn_req: sess=0x80247d600 rq=0x80247d610, rp=0x80247d650, exp(r,w)=0,0 rqf=00808002 rpf=80000000 rqh=0 rqt=103 rph=0 rpt=0 cs=7 ss=1 [ALERT] 277/063058 (61489) : [910446772] process_stream:1655: task=0x80244b9b0 s=0x80247d600, sfl=0x0000048a, rq=0x80247d610, rp=0x80247d650, exp(r,w)=910456772,0 rqf=00808200 rpf=80000023 rqh=0 rqt=0 rph=97 rpt=0 cs=7 ss=7, cet=0x0 set=0x0 retr=0
00000000:http_frt.srvrep[0008:ffffffff]: HTTP/1.1 200 OK
00000000:http_frt.srvhdr[0008:ffffffff]: content-length: 13
00000000:http_frt.srvhdr[0008:ffffffff]: content-type: text/plain
00000000:http_frt.srvhdr[0008:ffffffff]: Connection: close
[ALERT] 277/063058 (61489) : [910446773] process_stream:1655: task=0x80244b9b0 s=0x80247d600, sfl=0x0000048a, rq=0x80247d610, rp=0x80247d650, exp(r,w)=910456773,0 rqf=00808200 rpf=8004a221 rqh=-103 rqt=0 rph=0 rpt=0 cs=7 ss=7, cet=0x0 set=0x0 retr=0 [ALERT] 277/063058 (61489) : [910446774] process_stream:1655: task=0x80244b9b0 s=0x80247d600, sfl=0x0000048a, rq=0x80247d610, rp=0x80247d650, exp(r,w)=0,0 rqf=0080c021 rpf=8004a020 rqh=-103 rqt=0 rph=0 rpt=0 cs=8 ss=7, cet=0x0 set=0x0 retr=0 [ALERT] 277/063058 (61489) : [910446774] process_stream:1655: task=0x80244b9b0 s=0x80247d600, sfl=0x0004348a, rq=0x80247d610, rp=0x80247d650, exp(r,w)=0,910456774 rqf=0000b061 rpf=8004a020 rqh=0 rqt=0 rph=0 rpt=0 cs=9 ss=8, cet=0x0 set=0x0 retr=0
00000000:http_frt.srvcls[0008:ffffffff]
00000000:http_frt.clicls[0008:ffffffff]
00000000:http_frt.closed[0008:ffffffff]


Reply via email to