Re: appsession does not work in Haproxy 1.4.9
Hi Hogan,

I have good news. I found two bugs that could cause what you describe under out-of-memory conditions, and I could trigger them with your config by limiting my amount of memory.

One of them is triggered by the cookie capture. If it failed to allocate memory to store the cookie, it would emit an alert but still use it. That was the case only for the response cookies; the request cookies had the correct fix. Not sure how that slipped through the fixes, but it's fixed now.

The other one was caused by a risk of freeing a non-allocated pool in the appsession code in case of memory shortage: some fields were released while not yet allocated, also resulting in a segfault during the call to the libc's free().

However, I found a third bug with my glibc: the call to regexec() spins forever when it fails to allocate memory. It is very possible that with your libc, instead of looping it crashes. And the only way to have this code called with your config is precisely by adding the reqirep rule:

    0x6f755d7d in mmap () from /lib/libc.so.6
    (gdb) bt
    #0  0x6f755d7d in mmap () from /lib/libc.so.6
    #1  0x6f703bb4 in sYSMALLOc () from /lib/libc.so.6
    #2  0x6f702835 in _int_realloc () from /lib/libc.so.6
    #3  0x6f701100 in realloc () from /lib/libc.so.6
    #4  0x6f731d6f in re_string_realloc_buffers () from /lib/libc.so.6
    #5  0x6f73dfb3 in extend_buffers () from /lib/libc.so.6
    #6  0x6f73b222 in transit_state () from /lib/libc.so.6
    #7  0x6f739a6f in check_matching () from /lib/libc.so.6
    #8  0x6f7392a5 in re_search_internal () from /lib/libc.so.6
    #9  0x6f73874d in regexec@@GLIBC_2.3.4 () from /lib/libc.so.6
    #10 0x0806b8da in apply_filter_to_req_line (t=0x0, req=0x8ba5da8, exp=0x8b07b08) at src/proto_http.c:5534
    #11 0x0806badc in apply_filters_to_request (s=0x8ba5650, req=0x8ba5da8, px=0x8afec80) at src/proto_http.c:5649
    #12 0x080682fe in http_process_req_common (s=0x8ba5650, req=0x8ba5da8, an_bit=-12, px=0x8afec80) at src/proto_http.c:3013
    #13 0x0807e776 in process_session (t=0x8ba5a08) at src/session.c:1068
    #14 0x0804ec8d in process_runnable_tasks (next=0x77bcd29c) at src/task.c:234
    #15 0x0804b005 in run_poll_loop () at src/haproxy.c:974
    #16 0x0804b389 in main (argc=6, argv=0x77bcd334) at src/haproxy.c:1255

I'm used to building with libpcre, which is much faster than libc's regex and which I have never seen fail.

So here is what I'm suggesting for your case:

1) I'm certain that you're running out of memory, maybe because of too many learned cookies, or maybe because your machine is under-sized for the amount of traffic that passes through it. However, since you're saying it takes one hour to die, I suspect it's a combination of both: you need a lot of memory, and the appsession table that slowly piles up progressively reduces the amount of memory available until the process dies. So you must find out why you have so little memory (maybe a ulimit -m or something like that).

2) Rebuild with support for libpcre, which handles out-of-memory conditions much better than libc. An out-of-memory condition should in theory never happen, but since it does in your case, let's be careful.

3) Apply the two patches below to your current version to fix the two bugs:

http://git.1wt.eu/web?p=haproxy-1.4.git;a=commitdiff_plain;h=62e360
http://git.1wt.eu/web?p=haproxy-1.4.git;a=commitdiff_plain;h=75eae4

Best regards,
Willy
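[Editor's note on point 2: rebuilding against libpcre only requires the USE_PCRE make flag; a minimal sketch, with linux26 shown as an example TARGET (adjust for your platform):

    $ make clean
    $ make TARGET=linux26 USE_PCRE=1
    $ ./haproxy -vv    # the build options should now list USE_PCRE=1

The -vv output is the quickest way to confirm which regex library the binary was actually built with.]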
Re: appsession does not work in Haproxy 1.4.9
Hi Willy,

I've been following this with some interest (I'm due to roll out 1.4.9 on a number of load balancers soon), and just wondered if these patches are recommended for application even if you do build with libpcre (as I also do), or whether it should be safe to deploy 1.4.9 as is?

Cheers
Chris

-----Original Message-----
From: Willy Tarreau w...@1wt.eu
To: Hogan Yu hogan...@icebreakersoftware.com
Cc: Cyril Bonté cyril.bo...@free.fr, haproxy@formilux.org
Subject: Re: appsession does not work in Haproxy 1.4.9
Date: Fri, 19 Nov 2010 12:02:30 +0100

[quoted message snipped; see Willy's mail above]
haproxy gives 502 on links with utf-8 chars?!
Hi.

I have haproxy doing load balancing between two Apache servers which have mod_jk. The application is on a JBoss application server. The problem I have noticed is that if a link has some UTF-8 characters (Croatian language characters), then haproxy returns error 502. Here is an example from the log:

Nov 19 12:40:24 porat haproxy[28047]: aaa.bbb.ccc.ddd:port [19/Nov/2010:12:40:24.040] www www/backend-srv1 0/0/0/-1/135 502 1833 - - PHVN 1/1/1/0/0 0/0 "GET /pithos/rest/usern...@domain/files/folder%C4%8Di%C4%87/ HTTP/1.1"
Nov 19 12:40:34 porat haproxy[28047]: aaa.bbb.ccc.ddd:port [19/Nov/2010:12:40:34.710] www www/backend-srv1 0/0/0/-1/82 502 1061 - - PHVN 5/5/5/4/0 0/0 "GET /pithos/rest/usern...@domain/files/%C4%8D%C4%87%C5%A1%C4%91%C5%BE/ HTTP/1.1"

The problem only occurs for links with those specific characters. The interesting thing is that haproxy is the source of the errors, because when I fetch those same links directly from the backend servers, they work without problem... Here is a log line from the Apache backend:

aaa.bbb.ccc.ddd - - [19/Nov/2010:12:40:24 +0100] "GET /pithos/rest/usern...@domain/files/folder%C4%8Di%C4%87/ HTTP/1.1" 200 1185 "http://somethin.somedomain/pithos/A5707EF1550DF3AECFB3F1CB7B89E240.cache.html" "Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/7.0.536.2 Safari/534.10"

Any ideas? Is there any way I can debug this?

--
Jakov Sosic
Re: appsession does not work in Haproxy 1.4.9
Hi Chris,

On Fri, Nov 19, 2010 at 11:25:04AM +0000, Chris Sarginson wrote:
> I've been following this with some interest (I'm due to roll out 1.4.9
> on a number of load balancers soon), and just wondered if these patches
> are recommended for application even if you do build with libpcre (as I
> also do), or whether it should be safe to deploy 1.4.9 as is?

All the issues can only happen in an out-of-memory situation. This can typically happen on some under-sized VMs or situations like that. If you know you're running at moderate loads with large amounts of free memory, you don't need to worry. By the way, these issues have been there for more than 5 years and were encountered for the first time now. They are not regressions, so there is no reason to hurry. That said, if you are used to building your own packages, it would obviously be better to include the latest fixes.

The point about libpcre is that the libc's regex lib hangs on low-memory conditions while libpcre handles the situation fine. This is the third issue on the list, for which there is no patch since the bug is outside haproxy. It is possible that a recent libc would handle the situation better, though.

Cheers,
Willy
Re: haproxy gives 502 on links with utf-8 chars?!
Hi Jakov,

On Fri, Nov 19, 2010 at 01:06:39PM +0100, Jakov Sosic wrote:
> The problem I have noticed is that if a link has some UTF-8 characters
> (Croatian language characters), then haproxy returns error 502. [...]
> The problem only occurs for links with those specific characters. The
> interesting thing is that haproxy is the source of the errors, because
> when I fetch those same links directly from the backend servers, they
> work without problem...

The issue is with the response, not the request (flags PH). If you have enabled your stats socket, you can get the exact location of the error that way:

    # echo "show errors" | socat stdio unix-connect:/var/run/haproxy.sock

(or whatever the path to the socket is). This will be useful because it indicates which character in the response was not valid from an HTTP point of view.

Normally, if the error is not too serious, you can force haproxy to let it pass with this option in your backend:

    option accept-invalid-http-response

However, you should only do that once you've figured out what the error is and you need time to fix it, because unless there is a bug in haproxy, it generally indicates a wrong header in the response from the server.

Regards,
Willy
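[Editor's note: for "show errors" to work, the stats socket has to be declared in the global section first; a minimal sketch, with an arbitrary path:

    global
        stats socket /var/run/haproxy.sock

After a reload, the socat command above should start returning any captured parse errors.]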
Re: Support for SSL
On Wed, Nov 17, 2010 at 09:46:05AM -0500, John Marrett wrote:
> Bedis,
> > Cause using the cores to decrypt traffic would reduce drastically
> > overall performance. Well, this is what we saw on our HTTP cache
> > server (running CentOS) on 8-core hardware: when enabling SSL, the
> > performance was so bad that [...] So we kept our old Nortel VPN 3050
> > to handle the SSL traffic.
> I'm astonished to hear that you had these kinds of issues on modern
> hardware. We stopped using dedicated SSL hardware quite some time ago.

I'm not surprised at all. The issue generally lies in mixing high-latency processing (eg: SSL) with low-latency processing (eg: HTTP). When your CPUs are stuck for 200 microseconds processing an SSL connection, you can try to do whatever you want, all pending HTTP processing will be stuck that long, which will considerably limit the request rate. One solution sometimes is to dedicate some CPUs to slow processing and others to fast processing, but this is not always possible.

Cheers,
Willy
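[Editor's note: one way to implement that kind of CPU separation on Linux is pinning the SSL terminator and the proxy to disjoint CPU sets; a sketch, not from this thread, with paths and CPU numbers as placeholders:

    # slow, high-latency SSL work gets CPUs 0-5
    taskset -c 0-5 stunnel /etc/stunnel/stunnel.conf
    # fast, low-latency HTTP processing runs alone on CPUs 6-7
    taskset -c 6-7 haproxy -f /etc/haproxy/haproxy.cfg

This way an SSL handshake burst cannot stall the event loop handling plain HTTP.]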
Re: haproxy gives 502 on links with utf-8 chars?!
On 11/19/2010 01:47 PM, Willy Tarreau wrote:
> echo "show errors" | socat stdio unix-connect:/var/run/haproxy.sock

# echo "show errors" | socat stdio unix-connect:/var/run/haproxy.sock
[19/Nov/2010:15:01:56.646] backend www (#1) : invalid response
  src aaa.bbb.ccc.ddd, session #645, frontend www (#1), server backend-srv1 (#1)
  response length 857 bytes, error at position 268:

  00000  HTTP/1.1 200 OK\r\n
  00017  Date: Fri, 19 Nov 2010 14:01:56 GMT\r\n
  00054  Server: Apache/2.2.3 (CentOS)\r\n
  00085  X-Powered-By: Servlet 2.5; JBoss-5.0/JBossWeb-2.1\r\n
  00136  Expires: -1\r\n
  00149  X-GSS-Metadata: {"creationDate":1290002859579,"createdBy":"ngara...@sr
  00219+ ce.hr","modifiedBy":"usern...@domain","name":"a\r\x07\x11~","owner":"
  00282+ usern...@domain","modificationDate":1290002859579,"deleted":false}\r
  00350+ \n
  00351  Content-Length: 418\r\n
  00372  Connection: close\r\n
  00391  Content-Type: application/json;charset=UTF-8\r\n
  00437  \r\n
  00439  {"files":[],"creationDate":1290002859579,"createdBy":"usern...@domain"
  00509+ ,"modifiedBy":"usern...@domain","readForAll":false,"name":"\xC5\xA1
  00572+ \xC4\x8D\xC4\x87\xC4\x91\xC5\xBE","permissions":[{"modifyACL":true,"wr
  00618+ ite":true,"read":true,"user":"usern...@domain"}],"owner":"usern...@domain
  00688+ ce.hr","parent":{"name":"User User","uri":"http://server/p
  00758+ ithos/rest/usern...@domain/files/"},"folders":[],"modificationDate":1
  00828+ 290002859579,"deleted":false}

Hmmm, what to do with this output now? Where is the error? :)

--
Jakov Sosic
Re: haproxy gives 502 on links with utf-8 chars?!
Looks like the field X-GSS-Metadata: has UTF-8 encoded characters. I don't know if that's valid or not; I think not.

--
Germán Gutiérrez
OLX Operation Center - OLX Inc.
Buenos Aires - Argentina
Phone: 54.11.4775.6696  Mobile: 54.911.5669.6175
Skype: errare_est  Email: germ...@olx.com
Re: haproxy gives 502 on links with utf-8 chars?!
On 11/19/2010 03:07 PM, German Gutierrez :: OLX Operation Center wrote:
> Looks like the field X-GSS-Metadata: has UTF-8 encoded characters. I
> don't know if that's valid or not; I think not.

From wikipedia: http://en.wikipedia.org/wiki/List_of_HTTP_header_fields

    Accept-Charset    Character sets that are acceptable    Accept-Charset: utf-8

So I guess I need to somehow force the server to set this HTTP header option?

--
Jakov Sosic
Re: haproxy gives 502 on links with utf-8 chars?!
Accept-* headers talk about what the ends of the connection want in terms of page content. What is allowed in the headers themselves is a different part of the spec, not spec'd by the content of a header but by the spec itself:

    Many HTTP/1.1 header field values consist of words separated by LWS
    or special characters. These special characters MUST be in a quoted
    string to be used within a parameter value (as defined in section 3.6).

Unrecognized header fields [anything like X-*] are treated as entity-header fields, so X-GSS-Metadata is considered an entity-header AFAICT:

    The extension-header mechanism allows additional entity-header fields
    to be defined without changing the protocol, but these fields cannot
    be assumed to be recognizable by the recipient. Unrecognized header
    fields SHOULD be ignored by the recipient and MUST be forwarded by
    transparent proxies.

Section 7.2.1 talks about encoding the entity body but not entity headers. I didn't know about trailing headers (trailers) - Willy, is haproxy coded to watch for those?

As in the answer here, it looks like you can't do that:
http://stackoverflow.com/questions/1361604/how-to-encode-utf8-filename-for-http-headers-python-django

On 11/19/10 6:13 AM, Jakov Sosic wrote:
> So I guess I need to somehow force the server to set this HTTP header
> option?
Re: haproxy gives 502 on links with utf-8 chars?!
On Fri, Nov 19, 2010 at 03:05:17PM +0100, Jakov Sosic wrote:
> # echo "show errors" | socat stdio unix-connect:/var/run/haproxy.sock
> [19/Nov/2010:15:01:56.646] backend www (#1) : invalid response
>   src aaa.bbb.ccc.ddd, session #645, frontend www (#1), server backend-srv1 (#1)
>   response length 857 bytes, error at position 268:
> [full dump snipped]

Excellent, we have it now.

  00149  X-GSS-Metadata: {"creationDate":1290002859579,"createdBy":"ngara...@sr
  00219+ ce.hr","modifiedBy":"usern...@domain","name":"a\r\x07\x11~","owner":"
  00282+ usern...@domain","modificationDate":1290002859579,"deleted":false}\r
  00350+ \n

You see above, position 268? It's the \x07 just after the \r on the second line. The issue is not related to UTF-8 at all; those are simply forbidden characters, possibly resulting from corrupted memory. The \r prefixes an end of header and may only be followed by a \n. From RFC 2616:

    message-header = field-name ":" [ field-value ]
    field-name     = token
    field-value    = *( field-content | LWS )
    field-content  = <the OCTETs making up the field-value
                     and consisting of either *TEXT or combinations
                     of token, separators, and quoted-string>
    token          = 1*<any CHAR except CTLs or separators>
    quoted-string  = ( <"> *(qdtext | quoted-pair ) <"> )
    qdtext         = <any TEXT except <">>
    quoted-pair    = "\" CHAR
    TEXT           = <any OCTET except CTLs,
                     but including LWS>
    separators     = "(" | ")" | "<" | ">" | "@"
                   | "," | ";" | ":" | "\" | <">
                   | "/" | "[" | "]" | "?" | "="
                   | "{" | "}" | SP | HT
    CHAR           = <any US-ASCII character (octets 0 - 127)>
    CTL            = <any US-ASCII control character
                     (octets 0 - 31) and DEL (127)>

So as you can see, CTL characters cannot appear anywhere unescaped (the HTTPBIS spec refines that further by clearly insisting on the fact that those chars may not even be escaped). So clearly those 0x0D 0x07 0x11 characters at position 268 are forbidden here and break the parsing of the line.

What I suspect is that the characters were UTF-8 encoded in the database, but the application server stripped the 8th bit before putting them on the wire, which resulted in what you have. That's just a pure guess, of course. Another possibility is that those bytes represent an integer value that was accidentally output with a %c formatting instead of %d.

We can't even let that pass with "option accept-invalid-http-response", because the issue will be even worse for characters that are returned as 0x0D 0x0A, which will end the line and start a new header with the remaining data.
The only solution here is to try to see where it breaks in the application (maybe it's a memory corruption issue after all) and to fix it ASAP.

Hoping this helps,
Willy
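[Editor's note: the stray control bytes can also be confirmed straight from the backend, bypassing haproxy; a sketch with standard tools, where host and path are placeholders:

    printf 'GET /pithos/rest/user@domain/files/ HTTP/1.0\r\n\r\n' \
        | nc backend-srv1 80 | od -c | head -40

od -c renders non-printable bytes such as 0x07 (\a) explicitly, so a corrupted X-GSS-Metadata header is easy to spot in the raw response.]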
Re: conntrack
To just filter out conntrack for the proxy ports you can use:

Chain PREROUTING (policy ACCEPT 415M packets, 244G bytes)
 pkts bytes target   prot opt in  out  source     destination
 170M   22G NOTRACK  tcp  --  *   *    0.0.0.0/0  0.0.0.0/0    tcp dpt:80
 155M  172G NOTRACK  tcp  --  *   *    0.0.0.0/0  0.0.0.0/0    tcp spt:80

Chain OUTPUT (policy ACCEPT 428M packets, 255G bytes)
 pkts bytes target   prot opt in  out  source     destination
 120M  170G NOTRACK  tcp  --  *   *    0.0.0.0/0  0.0.0.0/0    tcp spt:80
 196M   25G NOTRACK  tcp  --  *   *    0.0.0.0/0  0.0.0.0/0    tcp dpt:80

In this case, both the front-end and backend servers are on port 80. This will drop the tracked entries to a low count:

[kbra...@lb01: ~] cat /proc/sys/net/netfilter/nf_conntrack_count
185

Keep in mind this means you are not using NAT (there is such a thing as stateless NAT, but I couldn't find a working version in recent kernels) and no stateful rules, as Willy pointed out.

-Kyle Brandt

On Tue, Nov 16, 2010 at 6:21 AM, Willy Tarreau w...@1wt.eu wrote:
On Tue, Nov 09, 2010 at 09:51:28AM -0500, Ariel wrote:
> I've seen a few times on this list people recommending to completely
> stay away from the conntrack functionality of iptables. I only use
> iptables to block all ports except for 80/443, and 22 only to specific
> locations, so I believe I am not doing any connection tracking, and I
> don't see why I would need to on a reverse-proxy whose only purpose is
> to push through as many connections as possible.

Exactly!

> So it's generally safe just to completely disable this module?

Yes. You should simply filter on the destination port then. However, keep in mind that without conntrack, you want to have very few rules.

> Is there a way with sysctl to remove it completely or do I have to
> rmmod it?

There's no sysctl; you can rmmod it, as well as blacklist it in your modprobe.conf. You should ensure that none of your rules makes any reference to it (eg: no -m state rule).

> Also, are there any cases where it might be advantageous to use
> conntrack in your iptables for an haproxy server?

If you need NAT, or if you have to implement many rules and only want to apply them on SYN packets, then you need conntrack. But in general, conntrack is used to bring the stateful mechanism that a proxy already provides, which is why they're rarely needed side by side.

Regards,
Willy
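[Editor's note: the NOTRACK counters above come from rules in the raw table; a sketch of the equivalent commands, assuming port 80 on both sides as in Kyle's setup:

    iptables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK
    iptables -t raw -A PREROUTING -p tcp --sport 80 -j NOTRACK
    iptables -t raw -A OUTPUT     -p tcp --sport 80 -j NOTRACK
    iptables -t raw -A OUTPUT     -p tcp --dport 80 -j NOTRACK

The raw table is evaluated before connection tracking, which is what lets NOTRACK prevent the conntrack entries from being created at all.]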
Re: (haproxy) How-TO get HAPROXY to balanace 2 SSL encypted Webservers ?
What if I don't need to encrypt the traffic between the haproxy front end and the 2 backend servers? Is there a way just to have HAProxy pass through any and all traffic and balance it, sort of like LVS works on layer 4? tia.

On Tue, 16 Nov 2010 06:18:03 -0500 Willy Tarreau w...@1wt.eu wrote:
On Mon, Nov 15, 2010 at 03:38:58PM -0500, t...@hush.com wrote:
> Thanks. Are there any config examples I can take a look at? Specifically
> having HAPROXY load balance 2 backend SSL encrypted tomcat servers. As
> per your message, I would not be able to use POUND.

If you need to re-encrypt the traffic between haproxy and tomcat, then you can't do that much easily. I've already done it with stunnel, but the overall chain gets quite complicated:

        client
          |
          | HTTPS/443
          v
    stunnel in server mode
          |
          | HTTP/localhost:8443
          v
        haproxy
          |
          | HTTP/localhost:8000+#server
          v
    stunnel in client mode
          |
          | HTTPS/server:443
          v
        server

> How can I configure HAPROXY to only balance the 2 servers' port 443 and
> apply stickiness to the source IPs?

You can do that in plain TCP mode, so there won't be any HTTP processing. Source IP stickiness can be configured using the stick-tables. An alternative generally is to simply perform a source IP hash. Version 1.5-dev3 makes it possible to use the SSL ID for stickiness, which is more reliable than the IP address but is limited in time by some browsers. A solution could be to mix IP hashing with SSL-ID stickiness in order to get the best of both worlds: as long as at least one of them remains, stickiness is maintained.

> Are there any examples I can look at?

There are a few in the doc, but really not that much. Look for "stick-table".

> How can I modify the below config to also pass through, balance and
> create the sticky sessions for SSL also? Currently our port 80 load
> balancing looks like this (entire config):
>
> global
>     log 127.0.0.1:514 local7   # only send important events
>     maxconn 4096
>     user haproxy
>     group haproxy
>     daemon
>
> defaults
>     log global
>     mode http
>     option httplog
>     option dontlognull
>     retries 3
>     option redispatch
>     maxconn 2000
>     contimeout 5000
>     clitimeout 5
>     srvtimeout 5
>     stats enable
>     stats uri /stats
>
> frontend http-in
>     bind *:80
>     acl is_ww2_test1_com hdr_end(host) -i ww2.test1.com
>     use_backend ww2_test1_com if is_ww2_test1_com
>
> backend ww2_test1_com
>     balance roundrobin
>     cookie SERVERID insert nocache indirect
>     option httpchk
>     option httpclose
>     option forwardfor
>     server Server1 10.10.10.11:80 cookie Server1
>     server Server2 10.10.10.12:80 cookie Server2

For port 443, it would approximately look like this (untested):

    frontend https-in
        mode tcp
        bind :443
        default_backend bk-https

    backend bk-https
        mode tcp
        balance src
        option ssl-hello-chk
        server Server1 10.10.10.11:443 check
        server Server2 10.10.10.12:443 check

But be careful: your servers will only log haproxy's IP address, and this can clearly become an issue.

Regards,
Willy
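[Editor's note: for readers wanting to try the stunnel chain above, a minimal sketch of the two stunnel ends; certificate paths, ports and section names are placeholders, and this is untested:

    ; stunnel in server mode: decrypt port 443, hand clear HTTP to haproxy
    [https-in]
    accept  = 443
    connect = 127.0.0.1:8443
    cert    = /etc/stunnel/server.pem

    ; stunnel in client mode: re-encrypt toward one backend server
    [https-out-1]
    client  = yes
    accept  = 127.0.0.1:8001
    connect = 10.10.10.11:443

stunnel's "client = yes" mode performs the re-encryption on the way out; one such section is needed per backend server, matching haproxy's localhost:8000+#server convention.]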
Re: Support for SSL
Here's an interesting blog post by a Google engineer about how they rolled out SSL for many of their services with very little additional CPU and network overhead. Specifically, he claims that "on our production frontend machines, SSL/TLS accounts for less than 1% of the CPU load, less than 10KB of memory per connection and less than 2% of network overhead."

http://www.imperialviolet.org/2010/06/25/overclocking-ssl.html

-Bryan

On Fri, Nov 19, 2010 at 4:54 AM, Willy Tarreau w...@1wt.eu wrote:
[quoted message snipped; see Willy's mail above]
Re: appsession does not work in Haproxy 1.4.9
Hi Willy,

Great thanks for your suggestions. I will try them now and give you feedback later.

Best regards,
Hogan

From hogan's iPhone

On 2010-11-19, at 19:02, Willy Tarreau w...@1wt.eu wrote:
[quoted message snipped; see Willy's mail above]
Re: Tarpit option to return content instead of reject request
Hi,

* Willy Tarreau w...@1wt.eu [2010-11-11 22:28+0100]
> On Thu, Nov 11, 2010 at 01:02:02PM -0800, Gerald Oskoboiny wrote:
> > Basically we are trying to rate-limit excessively frequent requests
> > for certain resources on our site (depending on the request URI), by
> > injecting an artificial delay. Is that straightforward to do somehow?
>
> You can make use of the tcp-request content rules for that. The basic
> idea is that you set up a request inspection delay and then you write
> a rule that validates some requests immediately and other ones only
> once the delay is past. That would approximately look like this:
>
>     tcp-request inspect-delay 1s
>     acl slow_url url_beg /photos /images /login.php
>     tcp-request content accept if HTTP !slow_url
>     tcp-request content accept if HTTP slow_url WAIT_END

Excellent, thanks a lot for the quick reply and helpful info. We switched www.w3.org to haproxy a couple weeks ago and it has been fantastic so far, though we are still improving our config.

> Another very common way of regulating accesses to some expensive URLs
> is to use two distinct backends, one with normal numbers of concurrent
> connections, and another one with very low numbers (eg: 1 or 2). By
> directing the expensive URLs to the second backend, you serialize them.
> This is particularly efficient when those requests involve a high
> processing time on the server (eg: a search engine).

Yes, I considered doing something like that if no other options were available. (But the one suggested above works great.)

> Also, version 1.5 provides much more flexibility for that, depending on
> what the initial cause for your wish to slow down some requests is,
> because you can slow down the IPs that are accessing some URLs too
> much, if that is of any help.

We have a number of ongoing issues that we expect haproxy will help with, but in this case we are trying to deal with automated software that fetches the same resources from our site thousands of times a day. If we continue to serve those resources quickly it gives people little incentive to fix their apps to use a cache, so we're looking at imposing an artificial delay.

More details if you are interested:
http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic
(DTDs are now served from our site with a 15-sec delay)

--
Gerald Oskoboiny  http://www.w3.org/People/Gerald/
World Wide Web Consortium (W3C)  http://www.w3.org/
tel:+1-604-906-1232  mailto:ger...@w3.org
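[Editor's note: the two-backend serialization Willy describes could look roughly like this; a sketch where names, addresses and limits are placeholders:

    frontend www
        bind *:80
        acl expensive url_beg /search
        use_backend slow_backend if expensive
        default_backend normal_backend

    backend normal_backend
        server app1 10.0.0.1:80 maxconn 200

    backend slow_backend
        # only 2 concurrent connections: expensive requests queue up in
        # haproxy and reach the server serialized instead of piling on
        server app1 10.0.0.1:80 maxconn 2
]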
Re: appsession does not work in Haproxy 1.4.9
Hi Willy,

I applied all your suggestions:
1. compile with USE_PCRE=1
2. set ulimit -m unlimited
3. use the two patches

Here is my ulimit -a list:

# ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 139264
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65535
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Do you think I need to change the max locked memory to a bigger value?

FYI: here is the crash information from when I disable the reqirep command in the configuration file. The error code changed from 6 to 4:

haproxy[26980]: segfault at 20334000 rip 003ef9c7c3db rsp 7fff8bd01398 error 4
haproxy[1310]: segfault at 21151000 rip 003ef9c7c4a4 rsp 7fff9a1167b8 error 4
haproxy[19029]: segfault at 19f33000 rip 003ef9c7c4ef rsp 7fff8982e158 error 4
haproxy[25044]: segfault at 20737000 rip 003ef9c7c4df rsp 7fff73962638 error 4

I will monitor the haproxy status and give you feedback if it works well. :)

Thanks again for your help and patience.

Best Regards,
Hogan

On Sat, Nov 20, 2010 at 9:28 AM, hogan.yu hogan...@icebreakersoftware.com wrote:
[quoted messages snipped; see above]
Re: appsession does not work in Haproxy 1.4.9
Hi Hogan,

[I removed Cyril from the CC, he might get bored and he's on the list]

On Sat, Nov 20, 2010 at 10:46:53AM +0800, Hogan Yu wrote:
> I applied all your suggestions:
> 1. compile with USE_PCRE=1
> 2. set ulimit -m unlimited
> 3. use the two patches

Perfect.

> Here is my ulimit -a list:
> [output snipped]
> Do you think I need to change the max locked memory to a bigger value?

No, because we're not using any locked memory. All your settings look good.

> FYI: here is the crash information from when I disable the reqirep
> command in the configuration file. The error code changed from 6 to 4:
> haproxy[26980]: segfault at 20334000 rip 003ef9c7c3db rsp 7fff8bd01398 error 4
> [...]

OK, those might very well be the two issues that I fixed with the two patches then.

> I will monitor the haproxy status and give you feedback if it works
> well. :) Thanks again for your help and patience.

No, thanks to you first for your fast reports and for taking the risk of running buggy code!

Cheers,
Willy
Re: Tarpit option to return content instead of reject request
Hi Gerald,

On Fri, Nov 19, 2010 at 06:42:17PM -0800, Gerald Oskoboiny wrote:
> We have a number of ongoing issues that we expect haproxy will help with,

Feel free to expose your issues here; there are still a number of missing features, and sometimes a few lines of code may save a lot of time.

> but in this case we are trying to deal with automated software that
> fetches the same resources from our site thousands of times a day. If
> we continue to serve those resources quickly it gives people little
> incentive to fix their apps to use a cache, so we're looking at
> imposing an artificial delay. More details if you are interested:
> http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic
> (DTDs are now served from our site with a 15-sec delay)

OK, now I understand your need for a delay. In this case I too think it's the best thing to do, because it becomes noticeable on the client side. And indeed, this one below takes 15 seconds to respond:

http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd

:-)

Regards,
Willy
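[Editor's note: a 15-second DTD delay can be expressed with the same tcp-request mechanism Willy showed earlier in the thread; a sketch, not necessarily how w3.org actually implements it, with the ACL name and matching rule as assumptions:

    tcp-request inspect-delay 15s
    acl dtd_url url_end .dtd
    tcp-request content accept if HTTP !dtd_url
    tcp-request content accept if HTTP dtd_url WAIT_END

Non-DTD requests are accepted immediately; DTD fetches are only accepted once the 15s inspection delay has expired.]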