Hi Willy,

Op 12-10-2015 om 23:06 schreef Willy Tarreau:
Hi Pieter,

On Mon, Oct 12, 2015 at 10:29:05PM +0200, PiBa-NL wrote:
Been running some more tests with the information that req->buf->i
should be >= 0.

What i find is that after 1 request i already see rqh=-103 , it seems
like the initial request size which in this case is also is 103 bytes is
subtracted twice? It does not immediately crash, but if this is already
a sign of 'corruption' then the cause should be a little more easy to
find..
Oh yes definitely, good catch!

@Willy can you confirm this indicates the problem could be in progress
of heading to a crash? Even though in the last line it restores to 0..
Absolutely. Everytime we're trying to track such a painful bug, I end
up looking for initial symptoms, like this one. In general the first
corruption is minor and undetected but once it's seeded its disease,
the problem will definitely occur.

See its full output below replaced the DPRINTF already in code of
stream.c with Alert..

Thanks in advance,
PiBa-NL

root@OPNsense:/usr/ports/net/haproxy-devel # haproxy -f
/var/haproxy.config -d
[ALERT] 277/063055 (61489) : SSLv3 support requested but unavailable.
Available polling systems :
      kqueue : pref=300,  test result OK
        poll : pref=200,  test result OK
      select : pref=150,  test result FAILED
Total: 3 (2 usable), will use kqueue.
Using kqueue() as the polling mechanism.
00000000:http_frt.accept(0006)=0008 from [127.0.0.1:62358]
[ALERT] 277/063058 (61489) : [910446771] process_stream:1655:
task=0x80244b9b0 s=0x80247d600, sfl=0x00000080, rq=0x80247d610,
rp=0x80247d650, exp(r,w)=0,0 rqf=00d08000 rpf=80000000 rqh=0 rqt=0 rph=0
rpt=0 cs=7 ss=0, cet=0x0 set=0x0 retr=0
[ALERT] 277/063058 (61489) : [910446771] http_wait_for_request:
stream=0x80247d600 b=0x80247d610, exp(r,w)=0,0 bf=00d08000 bh=0
analysers=34
[ALERT] 277/063058 (61489) : [910446772] process_stream:1655:
task=0x80244b9b0 s=0x80247d600, sfl=0x00000080, rq=0x80247d610,
rp=0x80247d650, exp(r,w)=0,0 rqf=00000002 rpf=80000000 rqh=103 rqt=0
rph=0 rpt=0 cs=7 ss=0, cet=0x0 set=0x0 retr=0
[ALERT] 277/063058 (61489) : [910446772] http_wait_for_request:
stream=0x80247d600 b=0x80247d610, exp(r,w)=0,0 bf=00808002 bh=103
analysers=34
00000000:http_frt.clireq[0008:ffffffff]: GET / HTTP/1.1
00000000:http_frt.clihdr[0008:ffffffff]: Host: 127.0.0.1:801
00000000:http_frt.clihdr[0008:ffffffff]: Accept: */*
00000000:http_frt.clihdr[0008:ffffffff]: User-Agent: fetch libfetch/2.0
00000000:http_frt.clihdr[0008:ffffffff]: Connection: close
[ALERT] 277/063058 (61489) : [910446772] process_switching_rules:
stream=0x80247d600 b=0x80247d610, exp(r,w)=0,0 bf=00808002 bh=103
analysers=00
[ALERT] 277/063058 (61489) : [910446772] sess_prepare_conn_req:
sess=0x80247d600 rq=0x80247d610, rp=0x80247d650, exp(r,w)=0,0
rqf=00808002 rpf=80000000 rqh=0 rqt=103 rph=0 rpt=0 cs=7 ss=1
[ALERT] 277/063058 (61489) : [910446772] process_stream:1655:
task=0x80244b9b0 s=0x80247d600, sfl=0x0000048a, rq=0x80247d610,
rp=0x80247d650, exp(r,w)=910456772,0 rqf=00808200 rpf=80000023 rqh=0
rqt=0 rph=97 rpt=0 cs=7 ss=7, cet=0x0 set=0x0 retr=0
00000000:http_frt.srvrep[0008:ffffffff]: HTTP/1.1 200 OK
00000000:http_frt.srvhdr[0008:ffffffff]: content-length: 13
00000000:http_frt.srvhdr[0008:ffffffff]: content-type: text/plain
00000000:http_frt.srvhdr[0008:ffffffff]: Connection: close
[ALERT] 277/063058 (61489) : [910446773] process_stream:1655:
task=0x80244b9b0 s=0x80247d600, sfl=0x0000048a, rq=0x80247d610,
rp=0x80247d650, exp(r,w)=910456773,0 rqf=00808200 rpf=8004a221 rqh=-103
rqt=0 rph=0 rpt=0 cs=7 ss=7, cet=0x0 set=0x0 retr=0
Yep so here rqh is in fact req->buf->i and as you noticed it's been
decremented a second time.

I'm seeing this which I find suspicious in hlua.c :

   5909
   5910                  /* skip the requests bytes. */
   5911                  bo_skip(si_oc(si), strm->txn->req.eoh + 2);

First I don't understand why "eoh+2", I suspect that it's for the CRLF
in which case it's wrong since it can be a lone LF. Second, I'm not
seeing sov being reset afterwards. Could you please just add this
after this line :

        strm->txn->req.next -= strm->txn->req.sov;
        strm->txn->req.sov = 0;
This did not seem to resolve the issue.

That's equivalent to what we're doing when dealing with a redirect (http.c:4258
if you're curious) since we also have to "eat" the request. There may be a few
other corner cases, the use-service mechanism is fairly new and puts its feet
in a place where things used to work just because they were trained to... But
it's a terribly powerful thing to have so we must fix it even if it needs a
few -stable cycles.

Thanks!
Willy

If you have any other idea where it might go wrong please let me know :)
Ill try and dig a little further tomorrow evening.

Regards,
PiBa-NL

Reply via email to