Re: Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)
Hi Cyril,

On Tue, Jan 14, 2014 at 08:23:00AM +0100, Willy Tarreau wrote:

Hey, excellent catch! You're absolutely right. I'm totally ashamed for not having found it while reading the code. I was searching for a place where a wrong computation could lead to something larger than the buffer, and forgot to check for multiple reads of the buffer's size :-)

Now thinking about it a little bit more, I think we have an API problem in fact. The raw_sock_to_buf() function says:

/* Receive up to <count> bytes from connection <conn>'s socket and store them
 * into buffer <buf>. The caller must ensure that <count> is always smaller
 * than the buffer's size.
 */

But as you found, this is misleading as it doesn't work that well, since the caller needs to take care of not asking for too much data. So I'm thinking about changing the API instead, so that the caller doesn't have to care about this and only the read functions do. Anyway, they already care about free space wrapping at the end of the buffer. So I'd rather fix raw_sock_to_buf() and ssl_sock_to_buf() with a patch like this one, and simplify the logic at some call places. It would make the code much more robust and protect us against such bugs in the future.

Could you please give it a try in your environment?

Thanks,
Willy

diff --git a/src/raw_sock.c b/src/raw_sock.c
index 4dc1c7a..2e3a0cb 100644
--- a/src/raw_sock.c
+++ b/src/raw_sock.c
@@ -226,8 +226,7 @@ int raw_sock_from_pipe(struct connection *conn, struct pipe *pipe)
 
 /* Receive up to <count> bytes from connection <conn>'s socket and store them
- * into buffer <buf>. The caller must ensure that <count> is always smaller
- * than the buffer's size. Only one call to recv() is performed, unless the
+ * into buffer <buf>. Only one call to recv() is performed, unless the
  * buffer wraps, in which case a second call may be performed. The connection's
  * flags are updated with whatever special event is detected (error, read0,
  * empty). The caller is responsible for taking care of those events and
@@ -239,7 +238,7 @@ int raw_sock_from_pipe(struct connection *conn, struct pipe *pipe)
 static int raw_sock_to_buf(struct connection *conn, struct buffer *buf, int count)
 {
 	int ret, done = 0;
-	int try = count;
+	int try;
 
 	if (!(conn->flags & CO_FL_CTRL_READY))
 		return 0;
@@ -258,24 +257,27 @@ static int raw_sock_to_buf(struct connection *conn, struct buffer *buf, int coun
 		}
 	}
 
-	/* compute the maximum block size we can read at once. */
-	if (buffer_empty(buf)) {
-		/* let's realign the buffer to optimize I/O */
+	/* let's realign the buffer to optimize I/O */
+	if (buffer_empty(buf))
 		buf->p = buf->data;
-	}
-	else if (buf->data + buf->o < buf->p &&
-		 buf->p + buf->i < buf->data + buf->size) {
-		/* remaining space wraps at the end, with a moving limit */
-		if (try > buf->data + buf->size - (buf->p + buf->i))
-			try = buf->data + buf->size - (buf->p + buf->i);
-	}
 
 	/* read the largest possible block. For this, we perform only one call
 	 * to recv() unless the buffer wraps and we exactly fill the first hunk,
 	 * in which case we accept to do it once again. A new attempt is made on
 	 * EINTR too.
	 */
-	while (try) {
+	while (count > 0) {
+		/* first check if we have some room after p+i */
+		try = buf->data + buf->size - (buf->p + buf->i);
+		/* otherwise continue between data and p-o */
+		if (try <= 0) {
+			try = buf->p - (buf->data + buf->o);
+			if (try <= 0)
+				break;
+		}
+		if (try > count)
+			try = count;
+
 		ret = recv(conn->t.sock.fd, bi_end(buf), try, 0);
 		if (ret > 0) {
@@ -291,7 +293,6 @@ static int raw_sock_to_buf(struct connection *conn, struct buffer *buf, int coun
 			break;
 		}
 		count -= ret;
-		try = count;
 	}
 	else if (ret == 0) {
 		goto read0;
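[Editor's note] The size arithmetic in the patched read loop can be modelled in isolation. The sketch below is not HAProxy code: it replaces the pointer fields of struct buffer with plain offsets (struct buf_model and read_chunk are made-up names) and only mirrors the computation, assuming the layout the patch assumes (an input area of i bytes starting at offset p, o output bytes kept just before it, total storage of size bytes):

```c
/* Simplified model of the buffer fields used by the patch, with
 * offsets instead of pointers. All names here are illustrative. */
struct buf_model {
    long size;  /* total storage size */
    long p;     /* offset where the input area starts */
    long i;     /* input bytes (the area may wrap past size) */
    long o;     /* output bytes held just before offset p */
};

/* Compute how many bytes a single recv() may safely request, capped
 * by count, mirroring the patched loop: first the contiguous room
 * after p+i, otherwise the wrapped room before p-o; 0 = buffer full. */
static long read_chunk(const struct buf_model *b, long count)
{
    long try = b->size - (b->p + b->i);  /* room after the input area */

    if (try <= 0) {
        try = b->p - b->o;               /* wrapped room before p-o */
        if (try <= 0)
            return 0;                    /* no room at all: stop reading */
    }
    if (try > count)
        try = count;                     /* never exceed what was asked */
    return try;
}
```

Whatever count the caller passes, each recv() is now bounded by the buffer's actual free space, which is exactly the property the old API comment wrongly delegated to the caller.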
Re: Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)
OK here's a proposed fix which addresses the API issue for both raw_sock and ssl_sock. Steve, it would be nice if you could give it a try just to confirm I didn't miss anything.

Thanks,
Willy

From 3e499a6da1ca070f23083c874aa48895f00d0d6f Mon Sep 17 00:00:00 2001
From: Willy Tarreau w...@1wt.eu
Date: Tue, 14 Jan 2014 11:31:27 +0100
Subject: BUG/MAJOR: connection: fix mismatch between rcv_buf's API and usage
MIME-Version: 1.0
Content-Type: text/plain; charset=latin1
Content-Transfer-Encoding: 8bit

Steve Ruiz reported some reproducible crashes with HTTP health checks on a certain page returning a huge length. The traces he provided clearly showed that the recv() call was performed twice for a total size exceeding the buffer's length.

Cyril Bonté tracked down the problem to be caused by the full buffer size being passed to rcv_buf() in event_srv_chk_r() instead of passing just the remaining amount of space. Indeed, this change happened during the connection rework in 1.5-dev13 with the following commit:

    f150317 MAJOR: checks: completely use the connection transport layer

But one of the problems is also that the comments at the top of the rcv_buf() functions suggest that the caller only has to ensure the requested size doesn't overflow the buffer's size. Also, these functions already have to care about the buffer's size to handle wrapping free space when there are pending data in the buffer.

So let's change the API instead to more closely match what could be expected from these functions:

  - the caller asks for the maximum amount of bytes it wants to read. This means that only the caller is responsible for enforcing the reserve if it wants to (eg: checks don't).

  - the rcv_buf() functions fix their computations to always consider this size as a max, and always perform validity checks based on the buffer's free space.
As a result, the code is simplified and reduced, and made more robust for callers, which now just have to care about whether they want the buffer to be filled or not.

Since the bug was introduced in 1.5-dev13, no backport to stable versions is needed.
---
 src/checks.c   |  2 +-
 src/raw_sock.c | 31 ++++++++++++++---------------
 src/ssl_sock.c | 29 +++++++++++++--------------
 3 files changed, 32 insertions(+), 30 deletions(-)

diff --git a/src/checks.c b/src/checks.c
index 3237304..2274136 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -2065,7 +2065,7 @@ static void tcpcheck_main(struct connection *conn)
 		goto out_end_tcpcheck;
 
 	if ((conn->flags & CO_FL_WAIT_RD) ||
-	    conn->xprt->rcv_buf(conn, check->bi, buffer_total_space(check->bi)) <= 0) {
+	    conn->xprt->rcv_buf(conn, check->bi, check->bi->size) <= 0) {
 		if (conn->flags & (CO_FL_ERROR | CO_FL_SOCK_RD_SH | CO_FL_DATA_RD_SH)) {
 			done = 1;
 			if ((conn->flags & CO_FL_ERROR) && !check->bi->i) {

diff --git a/src/raw_sock.c b/src/raw_sock.c
index 4dc1c7a..2e3a0cb 100644
--- a/src/raw_sock.c
+++ b/src/raw_sock.c
@@ -226,8 +226,7 @@ int raw_sock_from_pipe(struct connection *conn, struct pipe *pipe)
 
 /* Receive up to <count> bytes from connection <conn>'s socket and store them
- * into buffer <buf>. The caller must ensure that <count> is always smaller
- * than the buffer's size. Only one call to recv() is performed, unless the
+ * into buffer <buf>. Only one call to recv() is performed, unless the
  * buffer wraps, in which case a second call may be performed. The connection's
  * flags are updated with whatever special event is detected (error, read0,
  * empty). The caller is responsible for taking care of those events and
@@ -239,7 +238,7 @@ int raw_sock_from_pipe(struct connection *conn, struct pipe *pipe)
 static int raw_sock_to_buf(struct connection *conn, struct buffer *buf, int count)
 {
 	int ret, done = 0;
-	int try = count;
+	int try;
 
 	if (!(conn->flags & CO_FL_CTRL_READY))
 		return 0;
@@ -258,24 +257,27 @@ static int raw_sock_to_buf(struct connection *conn, struct buffer *buf, int coun
 		}
 	}
 
-	/* compute the maximum block size we can read at once. */
-	if (buffer_empty(buf)) {
-		/* let's realign the buffer to optimize I/O */
+	/* let's realign the buffer to optimize I/O */
+	if (buffer_empty(buf))
 		buf->p = buf->data;
-	}
-	else if (buf->data + buf->o < buf->p &&
-		 buf->p + buf->i < buf->data + buf->size) {
-		/* remaining space wraps at the end, with a moving limit */
-		if (try > buf->data + buf->size - (buf->p + buf->i))
-			try = buf->data + buf->size - (buf->p + buf->i);
-	}
 
 	/* read the largest possible block. For this, we perform only one call
 	 * to recv() unless the buffer wraps and we exactly fill the first hunk,
	 * in which
Just a simple thought on health checks after a soft reload of HAProxy....
Just a simple thought on health checks after a soft reload of HAProxy. If, for example, you had several backend servers, one of which had crashed, and you then make a configuration change to HAProxy and soft reload (for instance, adding a new backend server), all the servers are instantly brought up and available for traffic (including the crashed one). So traffic will possibly be sent to a broken server... Obviously it's only a small problem, as it is fixed as soon as the health check actually runs. But I was just wondering: is there a way of saying "don't bring up a server until it passes a health check"?

--
Regards,
Malcolm Turnbull.
Loadbalancer.org Ltd.
Phone: +44 (0)870 443 8779
http://www.loadbalancer.org/
Re: route !HTTP connections to tcp backend instead of dropping in HTTP mode
Lukasz Michalski lm@... writes:

I set up two haproxy instances - one in tcp mode for protocol detection, and the second one for routing http requests to application servers. Works like a charm on my development machine. Thanks again! Łukasz

Dear Lukasz,

Could you maybe post the config files of your two haproxy instances?

Thanks!
Error 400 - Error 502
Hi, we are using HAProxy 1.5-dev21 as loadbalancer frontend and some NGINX 1.4 servers as backend. Everything is working perfectly. Great piece of software!

One question: if some client sends a request like:

GET something.someone

this request is forwarded to the backends even though it is not a valid request. The backend (NGINX) replies with an error page 400 (Bad Request). This is an invalid response to HAProxy, which will deliver a 502 (Bad Gateway) error to the client.

Response to client:

2014-01-14T15:17:04+01:00 somehostname haproxy[9]: xxx.1xx.2xx.xx:50294 [14/Jan/2014:15:17:04.648] application8089 application8089/somehostname.some.domain 0/0/0/-1/1 502 370 - - PH-- 475/441/8/0/0 0/0 "GET usacording.com HTTP/1.0"

HAProxy error:

[14/Jan/2014:16:52:07.357] backend application8089 (#11) : invalid response
  frontend application8089 (#11), server somehostname.some.domain (#2), event #2211
  src xxx.1xx.2xx.xx:55112, session #2141081, session flags 0x048e
  HTTP msg state 26, msg flags 0x, tx flags 0x0830
  HTTP chunk len 0 bytes, HTTP body len 0 bytes
  buffer flags 0x8023, out 0 bytes, total 166 bytes
  pending 166 bytes, wrapping at 16384, error at position 0:

  00000  <html>\r\n
  00008  <head><title>400 Bad Request</title></head>\r\n
  00053  <body bgcolor="white">\r\n
  00077  <center><h1>400 Bad Request</h1></center>\r\n
  00120  <hr><center>nginx</center>\r\n
  00148  </body>\r\n
  00157  </html>\r\n

Is it possible to return the original error (400) to the client?

Regards,
Florian

--
EveryWare AG
Florian Engelmann
Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich
tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: mailto:florian.engelm...@everyware.ch
web: http://www.everyware.ch
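[Editor's note] HAProxy does have a backend option that relaxes its HTTP response parser, "option accept-invalid-http-response". Whether it applies here depends on what nginx actually put on the wire, so the fragment below is only a hedged sketch (the server address is invented; the backend and server names are taken from the log above), not a confirmed fix:

```
backend application8089
    mode http
    # Let responses that fail HAProxy's strict HTTP checks pass through
    # to the client instead of being replaced by a 502. This disables a
    # protective check, so use with care.
    option accept-invalid-http-response
    server somehostname.some.domain 192.0.2.10:80 check
```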
Re: Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)
Hi again Willy,

On 14/01/2014 12:22, Willy Tarreau wrote:

OK here's a proposed fix which addresses the API issue for both raw_sock and ssl_sock. Steve, it would be nice if you could give it a try just to confirm I didn't miss anything.

OK, from my side, now that I'm on the laptop where I can reproduce the segfault, I confirm it doesn't crash anymore once the patch is applied (which was predictable from the quick test I made this afternoon). Let's see if it's OK for Steve too ;-)

--
Cyril Bonté
Re: Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)
Willy, have you validated this version in our lab as well?

Baptiste

On 14 Jan 2014 19:21, Cyril Bonté cyril.bo...@free.fr wrote:

Hi again Willy,

On 14/01/2014 12:22, Willy Tarreau wrote:

OK here's a proposed fix which addresses the API issue for both raw_sock and ssl_sock. Steve, it would be nice if you could give it a try just to confirm I didn't miss anything.

OK, from my side, now that I'm on the laptop where I can reproduce the segfault, I confirm it doesn't crash anymore once the patch is applied (which was predictable from the quick test I made this afternoon). Let's see if it's OK for Steve too ;-)

--
Cyril Bonté
Re: Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)
Patched and confirmed in our environment that this is now working / seems to have fixed the issue. Thanks! Steve Ruiz On Tue, Jan 14, 2014 at 3:22 AM, Willy Tarreau w...@1wt.eu wrote: OK here's a proposed fix which addresses the API issue for both raw_sock and ssl_sock. Steve, it would be nice if you could give it a try just to confirm I didn't miss anything. Thanks, Willy -- CONFIDENTIALITY NOTICE: The information contained in this electronic transmission may be confidential. If you are not an intended recipient, be aware that any disclosure, copying, distribution or use of the information contained in this transmission is prohibited and may be unlawful. If you have received this transmission in error, please notify us by email reply and then erase it from your computer system.
Multiple monitors?
I searched for "monitor" in the archives and read back about a year, and didn't see this asked, so apologies if it's something that has already been asked and answered.

In my application it would be useful to have multiple monitors, because I'd like to use HAProxy as a poor man's network monitor, since it already knows the state of each of its backends. So, for example, if I had three backends -- let's call them static, app, and service -- I'd like to be able to define:

/monitor/alive
/monitor/healthy

Those would be true if, in the first case, at least one instance of each backend was OK, and in the second case, if all instances of all backends were OK. Then I'd also like:

/monitor/static/alive
/monitor/static/healthy

and so on for app and service. *THEN* I could simply iterate through my list of monitor endpoints and quickly build a graph of the system's overall health.

As it stands, though, you get one monitor URI per frontend. It looks like it would take some surgery to turn that from a simple string into a hash associating names with particular monitors, and then a separate pointer per name referring to an ACL for the monitor fail condition. So I'm wondering the following:

1) Is this a lot harder than it looks? Obviously there's a bit more computation in "is this URI a monitor-uri hash key? If so, look up the associated ACL" than the strcmp it currently is. That doesn't feel like it should be a huge performance penalty, but, well, a C hash is a lot heavier than a strcmp. Even so, strmap (http://pokristensson.com/strmap.html) (which requires both keys and values to be strings, so it's not a totally generic hash implementation) would be perfectly OK for this purpose, and it's LGPL, so it should be safe to integrate.

2) Am I thinking about this the wrong way? That is, is the thing I want to do just silly, and should I be going about it some other way?

3) Am I really the first person to make this feature request? It seems like an obvious thing to want to do.
I seem to recall at some point finding some discussion that you could do this by chaining haproxies, but that seems much heavier-weight than just implementing:

monitor-hash app-healthy /monitor/app/healthy
monitor-fail-hash app-healthy nbserv(app) lt 4

Thanks,
Adam
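[Editor's note] The lookup step described in point 1) can be sketched without any hash at all; for a handful of monitor URIs, a linear strcmp() scan already costs about what a small hash would. Everything below is hypothetical illustration (monitor_entry, monitor_lookup and the health predicates are made-up names, not HAProxy internals):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table mapping monitor URIs to health predicates,
 * replacing the single monitor-uri strcmp. */
typedef int (*health_fn)(void);

struct monitor_entry {
    const char *uri;
    health_fn   check;
};

static int app_alive(void)   { return 1; }  /* stand-in: at least one app server up */
static int app_healthy(void) { return 0; }  /* stand-in: all app servers up */

static const struct monitor_entry monitors[] = {
    { "/monitor/app/alive",   app_alive },
    { "/monitor/app/healthy", app_healthy },
};

/* Return the monitor result for a URI: 1 = OK, 0 = fail,
 * -1 = not a monitor URI (fall through to normal request processing). */
static int monitor_lookup(const char *uri)
{
    size_t n;

    for (n = 0; n < sizeof(monitors) / sizeof(monitors[0]); n++)
        if (strcmp(uri, monitors[n].uri) == 0)
            return monitors[n].check();
    return -1;
}
```

A real implementation would evaluate the registered ACL instead of calling a fixed function pointer, but the control flow (match the URI first, only then evaluate the fail condition) would look the same.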
Re: Bug report for latest dev release, 1.5.21, segfault when using http expect string x and large 404 page (includes GDB output)
On Tue, Jan 14, 2014 at 12:25:37PM -0800, Steve Ruiz wrote:

Patched and confirmed in our environment that this is now working / seems to have fixed the issue. Thanks!

Great, many thanks to you both, guys. We've got rid of another pretty old bug; these are the ones that make me the happiest once fixed! I'm currently unpacking my laptop to push the fix so that it appears in today's snapshot.

Excellent work!
Willy
Re: Thousands of FIN_WAIT_2 CLOSED ESTABLISHED in haproxy1.5-dev21-6b07bf7
Hi! Thanks for your reply! We finally found out that the directive in our haproxy.conf

    tcp-request inspect-delay 30s

made this error happen. I think this is because of the global setting in our defaults:

    timeout client 60s

But the "tcp-request inspect-delay 30s" in our frontend didn't cover the timeout. The documentation (http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#tcp-request inspect-delay) says:

"Note that the client timeout must cover at least the inspection delay, otherwise it will expire first"

After we changed "tcp-request inspect-delay 30s" to "tcp-request inspect-delay 60s", it works like a charm!
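[Editor's note] The constraint quoted from the documentation can be written down as a minimal config sketch (the section names and bind port are invented; the two timeout lines are the ones that matter):

```
defaults
    mode tcp
    # the client timeout must cover at least the inspection delay
    # below, otherwise it expires first
    timeout client 60s

frontend ft_inspect
    bind :8080
    # wait up to 60s for the content inspection rules to conclude;
    # keep this no larger than "timeout client"
    tcp-request inspect-delay 60s
    default_backend bk_app
```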
Compile Error in abf08d (2014/01/15)
Hi! In today's repository (abf08d), a compile error and a warning occurred, as below.

My OS environment: CentOS 6.4 x86_64

In file included from src/listener.c:18:
include/common/accept4.h:61: error: static declaration of 'accept4' follows non-static declaration
/usr/include/sys/socket.h:222: note: previous declaration of 'accept4' was here
make: *** [src/listener.o] Error 1

src/ssl_sock.c: In function 'ssl_sock_to_buf':
src/ssl_sock.c:1356: warning: 'try' may be used uninitialized in this function
Forward request with the URL path
Hello,

Is it possible to forward an incoming request to the backend while retaining the URL path, in http mode? Using ACLs I was able to categorize the incoming requests into different rulesets, but when using a specific backend for a certain URL path, I could not figure out how to send the request to the underlying server with the URL path. I figured out how to redirect a request, but that is not what I wanted. So, if HAProxy receives a request on, say, http://192.168.5.10:80/url/path, I want the request to be sent to the underlying server as http://133.168.5.20:80/url/path. Does anyone know how to do this? Any help is much appreciated!

PS: Question on ServerFault: http://serverfault.com/questions/566862/haproxy-url-pattern-forwarding

Thanks,
Rakesh
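[Editor's note] HAProxy in http mode forwards the request line, including the path, to the chosen server unchanged unless you explicitly rewrite it, so routing by path is enough. A minimal sketch using the addresses from the question (the frontend, backend and ACL names are made up):

```
frontend ft_main
    bind 192.168.5.10:80
    # route requests whose path starts with /url/path
    acl is_url_path path_beg /url/path
    use_backend bk_app if is_url_path

backend bk_app
    # no rewrite needed: the original path is passed through as-is,
    # so GET /url/path arrives at this server as GET /url/path
    server app1 133.168.5.20:80
```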
Re: Thousands of FIN_WAIT_2 CLOSED ESTABLISHED in haproxy1.5-dev21-6b07bf7
2014/1/15 Ge Jin altman87...@gmail.com:

Hi! Thanks for your reply! We finally found out that the directive in our haproxy.conf "tcp-request inspect-delay 30s" made this error happen. I think this is because of the global setting in our defaults: "timeout client 60s". But the "tcp-request inspect-delay 30s" in our frontend didn't cover the timeout.

Thanks. And what could be the low-level explanation for this?