Re: SOLVED! (Was: 400 error on cookie string)
On Thu, Jan 05, 2017 at 10:24:30PM +0300, Aleksey Gordeev wrote: > Hi, I'm "cas". My Name is Aleksey Gordeev. I was using my company email. > Please set me as a reporter. Perfect, thanks for clearing this up. You did a great job and I'm often embarrassed not to name good reporters (and many want to remain discrete). I'm doing this right now. Cheers, Willy
Re: SOLVED! (Was: 400 error on cookie string)
Hi Willy, all, On 17-01-05 20:17:56, Willy Tarreau wrote: > "cas", if you want to be credited as a reporter of the issue, you > need to raise your hand very quickly now, because once the patch is > merged it will be too late. His name is Aleksey Gordeev, see 9060941483541...@web3g.yandex.ru, written on 2017/01/04. Cheers, Georg signature.asc Description: Digital signature
Re: SOLVED! (Was: 400 error on cookie string)
Hi, I'm "cas". My Name is Aleksey Gordeev. I was using my company email. Please set me as a reporter. 2017-01-05 22:17 GMT+03:00 Willy Tarreau: > Small update on this, Axel Reinhold faced an apparently different > issue on an SVN server until we noticed the requests were sent in > small chunks cut before the CRLF and experiencing the same problem. > He could confirm the patch fixes the problem for him as well, so > I'm going to merge the patch now. > > "cas", if you want to be credited as a reporter of the issue, you > need to raise your hand very quickly now, because once the patch is > merged it will be too late. > > Best regards, > Willy >
Re: SOLVED! (Was: 400 error on cookie string)
Small update on this, Axel Reinhold faced an apparently different issue on an SVN server until we noticed the requests were sent in small chunks cut before the CRLF and experiencing the same problem. He could confirm the patch fixes the problem for him as well, so I'm going to merge the patch now. "cas", if you want to be credited as a reporter of the issue, you need to raise your hand very quickly now, because once the patch is merged it will be too late. Best regards, Willy
Re: 400 error on cookie string
Get one strange error that I never seen before Total events captured on [05/Jan/2017:09:13:56.054] : 12 [05/Jan/2017:09:09:06.594] frontend http-in (#3): invalid request backend (#-1), server (#-1), event #11 src 70.192.67.217:3224, session #243701, session flags 0x0080 HTTP msg state 26, msg flags 0x, tx flags 0x HTTP chunk len 0 bytes, HTTP body len 0 bytes buffer flags 0x00808002, out 0 bytes, total 924 bytes pending 924 bytes, wrapping at 32768, error at position 902: 0 7B%22distinct_id%22%3A%20%222044eec9-0217-e611-80c1-4c72b97cbdb1%22%2C 00070+ %22%24initial_referrer%22%3A%20%22%24direct%22%2C%22%24initial_referri 00140+ ng_domain%22%3A%20%22%24direct%22%2C%22__mps%22%3A%20%7B%7D%2C%22__mps 00210+ o%22%3A%20%7B%7D%2C%22__mpa%22%3A%20%7B%7D%2C%22__mpu%22%3A%20%7B%7D%2 00280+ C%22__mpap%22%3A%20%5B%5D%2C%22tref1%22%3A%20%22xmonetize%22%2C%22tref 00350+ 2%22%3A%20%22emailsmaster%22%2C%22tref3%22%3A%20%22Daily%20Quiz%20%7C% 00420+ 20Trivia%20Clickers%200-14d%20%7C%208d%2B%20%7C%20Email1%20Theme%20%7C 00490+ %20ByActivityTime%20%7C%20Categories%20Since%2012%2F28%2F2016%22%2C%22 00560+ tref4%22%3A%20%22NA%22%2C%22emaildomain%22%3A%20%22gmail.com%22%2C%22l 00630+ anding%22%3A%20%22default%22%2C%22regdate%22%3A%20%222016-05-10%22%2C% 00700+ 22tref12%22%3A%20%22facebook-ad%22%2C%22tref13%22%3A%20%22TriviaQuesti 00770+ ons-2%22%2C%22tref14%22%3A%20%22US6%25VisitorsConv%22%2C%22tref15%22%3 00840+ A%20%22series%22%2C%22%24search_engine%22%3A%20%22google%22%7D; mp_mix 00910+ panel__c=1\r\n 00922 \r\n 05.01.2017, 09:25, "c...@xmonetize.net": Yes, please apply Willy's patch to the 1.7.1 release and tell us what happens. Everything looks good. No errors. About 1.6.11. It was in the first days of December. May be I've mixed up something.I can test again tomorrow if you want It was 1.6.10. I think I've made some mistakes in paths so maybe I've started as a service 1.7 and haproxy -vv 1.6. I'm not pro in nix, sorry for confusing you
Re: 400 error on cookie string
Yes, please apply Willy's patch to the 1.7.1 release and tell us what happens. Everything looks good. No errors. About 1.6.11. It was in the first days of December. May be I've mixed up something.I can test again tomorrow if you want It was 1.6.10. I think I've made some mistakes in paths so maybe I've started as a service 1.7 and haproxy -vv 1.6. I'm not pro in nix, sorry for confusing you
Re: 400 error on cookie string
Hi Willy, Am 03.01.2017 um 21:10 schrieb Willy Tarreau: It was not dropped since the server SACKed it (but until it's within the window the stack is free to change its mind). In fact following a TCP stream in wireshark never gives any useful information. You *always* need the absolute sequence numbers and ack numbers on each and every packet and you cannot even trust the payload reported on a packet because payload spanning over multiple packets is generally reported at the end. That's why in the end, the much dumber tcpdump is always much more reliable. So here's what we have : Thanks for taking the time to explain, it makes complete sense now. Sounds like the MSS "clamp" works just in one direction and it the other direction real PathMTU comes into play (router sends back "Fragmentation needed"). Usually MSS clamping will apply the MSS clamp in both directions (SYN and SYN/ACK). Anyway, the TCP is fine, that's the important thing. About 1.6.11. It was in the first days of December. May be I've mixed up something. I can test again tomorrow if you want Yes, please apply Willy's patch to the 1.7.1 release and tell us what happens. Regards, Lukas
SOLVED! (Was: 400 error on cookie string)
I found it and fixed it! It was me again who added a bug in 1.7 with the optimizations for large headers and large requests. If certain conditions are met, we could read the \r from previous data (as we guessed) and complain that the next byte was not an LF once the remaining part arrived. I'm intending to merge the attached patch. "cas", I'm willing to add you as the reporter here since you provided lots of very valuable information, but for this it would be nice if you had a name :-) I'll backport it to 1.7. Unfortunately there's no easy workaround on an existing configuration, so I'll produce 1.7.2 soon I guess. I'll try to add the remaining missing information to make dumps more accurate regarding the expected state (this would definitely had helped here). Cheers, Willy >From 5afd6967423caa2d55d4c5a5ae0e296b35e2292d Mon Sep 17 00:00:00 2001 From: Willy TarreauDate: Wed, 4 Jan 2017 14:44:46 +0100 Subject: BUG/MAJOR: http: fix risk of getting invalid reports of bad requests X-Bogosity: Ham, tests=bogofilter, spamicity=0.00, version=1.2.4 Commits 5f10ea3 ("OPTIM: http: improve parsing performance of long URIs") and 0431f9d ("OPTIM: http: improve parsing performance of long header lines") introduced a bug in the HTTP parser : when a partial request is read, the first part ends up on a 8-bytes boundary (or 4-byte on 32-bit machines), the end lies in the header field value part, and the buffer used to contain a CR character exactly after the last block, then the parser could be confused and read this CR character as being part of the current request, then switch to a new state waiting for an LF character. Then when the next part of the request appeared, it would read the character following what was erroneously mistaken for a CR, see that it is not an LF and fail on a bad request. The reason is that there's no control of and of parsing just after breaking out of the loop. One way to reproduce it is with this config : global stats socket /tmp/sock1 mode 666 level admin stats timeout 1d frontend px bind :8001 mode http timeout client 10s redirect location / And sending requests this way : $ tcploop 8001 C P S:"$(dd if=/dev/zero bs=16384 count=1 2>/dev/null | tr '\000' '\r')" $ tcploop 8001 C P S:"$(dd if=/dev/zero bs=16384 count=1 2>/dev/null | tr '\000' '\r')" $ tcploop 8001 C P \ S:"GET / HTTP/1.0\r\nX-padding: 0123456789.123456789.123456789.123456789.123456789.123456789.1234567" P \ S:"89.123456789\r\n\r\n" P Then a "show errors" on the socket will report : $ echo "show errors" | socat - /tmp/sock1 Total events captured on [04/Jan/2017:15:09:15.755] : 32 [04/Jan/2017:15:09:13.050] frontend px (#2): invalid request backend (#-1), server (#-1), event #31 src 127.0.0.1:59716, session #91, session flags 0x0080 HTTP msg state 17, msg flags 0x, tx flags 0x HTTP chunk len 0 bytes, HTTP body len 0 bytes buffer flags 0x00808002, out 0 bytes, total 111 bytes pending 111 bytes, wrapping at 16384, error at position 107: 0 GET / HTTP/1.0\r\n 00016 X-padding: 0123456789.123456789.123456789.123456789.123456789.12345678 00086+ 9.123456789.123456789\r\n 00109 \r\n This fix must be backported to 1.7. --- src/proto_http.c | 8 1 file changed, 8 insertions(+) diff --git a/src/proto_http.c b/src/proto_http.c index aa8d997..d894ef6 100644 --- a/src/proto_http.c +++ b/src/proto_http.c @@ -1557,6 +1557,10 @@ const char *http_parse_reqline(struct http_msg *msg, ptr += sizeof(int); } #endif + if (ptr >= end) { + state = HTTP_MSG_RQURI; + goto http_msg_ood; + } http_msg_rquri2: if (likely((unsigned char)(*ptr - 33) <= 93)) /* 33 to 126 included */ EAT_AND_JUMP_OR_RETURN(http_msg_rquri2, HTTP_MSG_RQURI); @@ -1983,6 +1987,10 @@ void http_msg_analyzer(struct http_msg *msg, struct hdr_idx *idx) ptr += sizeof(int); } #endif + if (ptr >= end) { + state = HTTP_MSG_HDR_VAL; + goto http_msg_ood; + } http_msg_hdr_val2: if (likely(!HTTP_IS_CRLF(*ptr))) EAT_AND_JUMP_OR_RETURN(http_msg_hdr_val2, HTTP_MSG_HDR_VAL); -- 1.7.12.1
Re: 400 error on cookie string
Hi guys, On Tue, Jan 03, 2017 at 09:10:21PM +0100, Willy Tarreau wrote: > However when looking at the error capture, the request is correct. But as > you can see, it's reported as wrong from offset 1453, hence one byte past > the end of the first segment. Thus I suspect that we have something wrong > here when receiving an incomplete request in this particular state. Maybe > we read a value too far, find a \0 and abort, and by the time we get to > the error dump the rest of the request comes in and appears in the capture. So I tried to inject these data into haproxy 1.7.1 and no luck, it never fails. I even tried to start it with -dM0 to ensure the buffer was filled with forbidden chars and it never triggers, as can be seen below. I'll see what to change in the code to get a copy of the parser's state in the dump, hoping that will sched a bit more light on the problem. Willy -- 11:38:56.314991 accept4(5, {sa_family=AF_INET, sin_port=htons(60288), sin_addr=inet_addr("127.0.0.1")}, [16], SOCK_NONBLOCK) = 6 11:38:56.315073 setsockopt(6, SOL_TCP, TCP_NODELAY, [1], 4) = 0 11:38:56.315139 accept4(5, 0x7ffd8be8fc40, [128], SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable) 11:38:56.315214 recvfrom(6, 0xaa3214, 15360, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable) 11:38:56.315277 epoll_ctl(3, EPOLL_CTL_ADD, 6, {EPOLLIN|0x2000, {u32=6, u64=6}}) = 0 11:38:56.315327 gettimeofday({1483526336, 315340}, NULL) = 0 11:38:56.315371 epoll_wait(3, {{EPOLLIN, {u32=6, u64=6}}}, 200, 1000) = 1 11:38:56.415320 gettimeofday({1483526336, 415353}, NULL) = 0 11:38:56.415407 recvfrom(6, "GET /?fm=1_url=is-there-caffeine-in-chocolate_type=question=1e4aa05d-f9a2-46f8-8d4f-a59bf9cdb3ad=43049_rd=_s=sr7imz238=sr7imz238=3048187=xmonetize="..., 15360, 0, NULL, NULL) = 1452 11:38:56.415667 setsockopt(6, SOL_TCP, TCP_QUICKACK, [1], 4) = 0 11:38:56.415778 gettimeofday({1483526336, 415800}, NULL) = 0 11:38:56.415840 epoll_wait(3, {{EPOLLIN, {u32=6, u64=6}}}, 200, 1000) = 1 11:38:57.415765 gettimeofday({1483526337, 415806}, NULL) = 0 11:38:57.415857 recvfrom(6, "0%22Daily%20Quiz%20%7C%20Tests%20Clickers%200-14d%20Any%2024k%2Fd%20%7C%2011d%2B%20%7C%20Theme%20%7C%201%20PM%20LiveIntent%22%2C%22tref4%22%3A%20%22NA%22%2C%22emaildomain%22%3A%20%22gmail.com%22%2C%22"..., 13908, 0, NULL, NULL) = 1453 11:38:57.415988 setsockopt(6, SOL_TCP, TCP_QUICKACK, [1], 4) = 0 11:38:57.416072 gettimeofday({1483526337, 416089}, NULL) = 0 11:38:57.416126 epoll_wait(3, {}, 200, 1000) = 0 11:38:58.417279 gettimeofday({1483526338, 417344}, NULL) = 0 --
Re: 400 error on cookie string
On Thu, Dec 29, 2016 at 08:31:51AM +0100, Willy Tarreau wrote: > I'm puzzled. I'm going to check the error path in the code as it's something > I've never ever seen yet! Thus I suspect a regression in 1.7 compared to 1.6. One of the report mentions 1.6.10 there, it's even more troubling. I'm noting that all captures have txflags=0x8000 which means "not first request". That's something important to take into consideration. Given that the report indicates that it fails with large requests, we can easily imagine a scenario where some requests are pipelined and wrap at the end of the buffer causing a realign to be performed. Thus maybe ther problem lies in this area. Willy
Re: 400 error on cookie string
Hello, Am 28.12.2016 um 11:38 schrieb c...@xmonetize.net: Tcpdump shows normal http requests(attached) Ok, I've looked through those traces and a lot of those HTTP 400 errors come from Browser pre-connect feature - which is not a problem. However, the TCP session 410 (tcp.stream eq 410) with port 49372 (tcp.port == 49372) shows a problem that I cannot explain. It looks like haproxy is jumping the gun here, replying with "400 Bad request" before the request is even complete. Are you sure that haproxy was configured with "timeout http-request 1m" when this trace was taken (December 22th)? Thanks, Lukas