Re: Help with cleaning up our error log output and request error counts
Hi Brendon,

On Wed, Dec 26, 2012 at 03:09:37PM -0500, Brendon Colby wrote:

> Greetings! (Apparently GMail IPs are now listed in SORBS, so when I first sent this through the Gmail web interface I got a bounce; I had to use Thunderbird and IMAP.)
>
> We just replaced our old commercial load balancer (end of life) with haproxy running on an HP DL360p G8 server with a quad-core Xeon E5-2609 2.4GHz processor (soon to be two for redundancy). Our site is mainly media - SWF files of movies and games, mp4 movies, audio, art, etc. We are serving approximately 2,330 req/s right now, for example, at over 1Gbps of outgoing traffic (we average 600Mbps to 1Gbps+ daily). Everything has been running great for a couple of weeks, and now we're just tidying up a few things like logging and these request error counts we've been seeing. We have multiple sub-domains: one hosting css, one hosting images, etc.
>
> My goals have been to:
>
> 1. Figure out what's causing the request errors and see if we can tweak something on the server side to stop them
> 2. Reduce the error log output to a manageable level, so we can keep it enabled all the time and monitor it
>
> We haven't been able to keep full error logging enabled due to the volume of errors being logged (200-300 per second). When we first enabled error logging it was flooded with entries like these:
>
>   Dec 21 21:00:01 localhost haproxy[16034]: x.x.x.x:50714 [21/Dec/2012:20:59:56.140] http-in http-in/NOSRV -1/-1/-1/-1/12343 400 212 - - CR-- 1913/1903/0/0/0 0/0 "BADREQ"
>
> This isn't the exact error but close - a CR error with a seemingly random timeout. "show errors" reported nothing of value. Here's what our defaults looked like:
>
>   defaults
>     backlog 1
>     mode http
>     # This option helps provide better statistics / graphs.
>     option contstats
>     option splice-auto
>     timeout client 30s
>     timeout connect 10s
>     timeout server 30s
>
> I did some research and it seemed to me that these errors were simply the browser closing connections.
> For example, when I opened our site with Chrome, after about 12 seconds I would see several CR errors in the logs from our office IP. I figured these must be keep-alive connections (or something like that) that were just timing out on the client side (hence the CR, a client-side error). I put in a "timeout http-request" of five seconds and haproxy then started logging cR errors with a 5000ms timeout value (essentially the same log values as above, with cR instead of CR). What this told me was that instead of the client disconnecting (CR), haproxy was now proactively disconnecting (cR) and returning a 408 error, which made sense. Right now http-request is set to 15 seconds and we're still seeing hundreds of errors per second.
>
> Next, I read through almost the entire haproxy manual (very good docs!) and found this section under "timeout http-keep-alive":
>
>   There is also another difference between the two timeouts : when a connection expires during timeout http-keep-alive, no error is returned, the connection just closes. If the connection expires in http-request while waiting for a connection to complete, a HTTP 408 error is returned.
>
> I thought for sure using "timeout http-keep-alive 1" would stop the cR/408 errors, but it didn't. "option dontlognull" stops them from being logged, but I see we're still getting 200-300 request errors per second on the frontend. The docs say not to use "option dontlognull" because it could mask attacks and such. I agree with this and don't want to leave this in. What's puzzling is that these cR/408 errors seem to be coming from regular site traffic and browser behavior, not an attack. Unless I'm mistaken, the way we have haproxy configured right now, we shouldn't be seeing these cR/408 errors. I will post the relevant pieces of our configuration below. Please let me know if I'm missing something here, because at this point I'm stuck!

Thank you for the very well detailed analysis.
I believe that some browsers nowadays tend to proactively establish connections to visited websites, just in case they will need them later. Since Chrome does everything it can to reduce page load time, it very likely is something it's doing. This could explain exactly what you're observing: a new connection over which nothing is transferred, that is closed when the user leaves the site (hence the random delay).

If you can reproduce the behaviour with your browser, I think that dontlognull will be your only solution and that we'll have to update the doc to indicate that browsers have adopted such an internet-unfriendly behaviour that it's better to leave the option on.

What I don't like with proactively opened connections is that they're killing servers with 10-100 times the load they would have to sustain and that even small sites might experience issues with this. If you see 200 of them per second and they last 5s on average, it means you're constantly having 5000 idle connections just because of this. Many web servers can't handle this :-/
Re: Help with cleaning up our error log output and request error counts
We have the same exact problem, only that dontlognull is not working for us either, for some reason (we have 1-byte requests containing a single byte - NULL - maybe something added by a firewall or other device tunneling the traffic to our LB). We also concluded this is something the browsers are doing automatically (especially when option http-server-close is in use). If you find any other solution not involving the dontlognull option, we would love to hear about it.

On Wed, Dec 26, 2012 at 11:38 PM, Willy Tarreau w...@1wt.eu wrote:

> [...]
Re: Help with cleaning up our error log output and request error counts
On Dec 26, 2012, at 4:38 PM, Willy Tarreau w...@1wt.eu wrote:

> Hi Brendon,
>
> Thank you for the very well detailed analysis.
>
> I believe that some browsers nowadays tend to proactively establish connections to visited websites, just in case they will need them later. Since Chrome does everything it can to reduce page load time, it very likely is something it's doing. This could explain exactly what you're observing: a new connection over which nothing is transferred, that is closed when the user leaves the site (hence the random delay).

I was thinking that this is just standard browser behavior too. IE also does this - it just seems to open fewer connections. This is why I was confused and thought I was missing something, because it seems like normal browser behavior even though the docs indicated that it should only happen during an attack or some other anomalous event.

> If you can reproduce the behaviour with your browser, I think that dontlognull will be your only solution and that we'll have to update the doc to indicate that browsers have adopted such an internet-unfriendly behaviour that it's better to leave the option on. What I don't like with proactively opened connections is that they're killing servers with 10-100 times the load they would have to sustain and that even small sites might experience issues with this. If you see 200 of them per second and they last 5s on average, it means you're constantly having 5000 idle connections just because of this. Many web servers can't handle this :-/

I can see smaller sites having a hard time with this! Before we added timeout http-request we were seeing over 22K established connections to the haproxy server. That's why we have such a high maxconn on our frontend - we hit several limits once we went live (which reminds me to lower it) and had to keep increasing it. Once we added the timeout though, established connections plummeted to (as of now, for example) about 5K.
The total TCP connections did NOT go down, however, because most of them are now in TIME_WAIT (92K!). The only thing this seems to affect is our monitoring system, which uses netstat to get TCP stats. It sometimes takes almost a minute to run and uses 100% CPU, but otherwise doesn't seem to affect anything, so we've left it for now.

So if connections are terminated because of timeouts that I have explicitly set, is there any reason to log and count that as a request error? That to me seems like something that could be logged as info for troubleshooting and not counted as an error at all. Just a thought - it's nothing major.

With dontlognull we still see this type of error at a rate of several per second:

  Dec 26 18:38:17 localhost haproxy[32259]: n.n.n.n:33685 [26/Dec/2012:18:38:17.155] http-in webservers-ngfiles/ngwebNN 10/0/0/1/649 200 31671 - - CD-- 4166/4159/139/31/0 0/0 GET /images/14/o-eternal-o_the-vampire-dragon.jpg HTTP/1.1

This is logged when, for example, I am on the site watching a movie and I close the browser. To me, this is another event that could be logged at the info level, but I figure there are probably good reasons why it's logged and counted as an error. It is probably just the perfectionist in me that wants the error rate to be nearly 0. :)

> BTW, your config is really clean, I have nothing to suggest. I wouldn't be surprised if some people reuse it to build their own configs :-)

Thanks! My co-worker is responsible for most of the config, so I will be sure to pass this along to him!

> Best regards,
> Willy
Re: Help with cleaning up our error log output and request error counts
On Dec 26, 2012, at 5:57 PM, SBD sbd@gmail.com wrote:

> [...]

I remember seeing your e-mail in the archives. Did you get this from sending "show errors" to the socket?

  frontend ** (#1): invalid request
  src **, session #3468, backend NONE (#-1), server NONE (#-1)
  HTTP internal state 26, buffer flags 0x00909002, event #0
  request length 1 bytes, error at position 0:
  0  \x00

I don't see this type of error when I run "show errors", so I'm curious why you're seeing these and I'm not.

Brendon
Re: Help with cleaning up our error log output and request error counts
Yes, sometimes I get it and sometimes I don't, though. As I said, this probably has to do with some other device (hopefully). Another interesting thing is that we didn't have those kinds of requests all the time. It started soon after changing the configuration from a single "listen" to the frontend-backend config style (we have not tried to switch it back, though).

On Wed, Dec 26, 2012 at 4:30 PM, Brendon Colby bren...@newgrounds.com wrote:

> [...]
Re: Help with cleaning up our error log output and request error counts
On Wed, Dec 26, 2012 at 07:03:02PM -0500, Brendon Colby wrote:

> I was thinking that this is just standard browser behavior too. IE also does this - it just seems to open fewer connections. This is why I was confused and thought I was missing something, because it seems like normal browser behavior even though the docs indicated that it should only happen during an attack or some other anomalous event.

This stupid behaviour is something very recent. I did not know that IE was doing this too, but after all I'm not surprised, with the browser war...

> > [...]
>
> I can see smaller sites having a hard time with this! Before we added timeout http-request we were seeing over 22K established connections to the haproxy server. That's why we have such a high maxconn on our frontend - we hit several limits once we went live (which reminds me to lower it) and had to keep increasing it.

That's really disgusting. Products such as haproxy or nginx can easily deal with that many concurrent connections, but many other legacy servers cannot. Would you mind reporting this to the HTTP working group at the IETF? The HTTP/1.1 spec is currently being refined and is almost done, but we can still add information there. Good and bad practices can be updated with this experience. The address is ietf-http...@w3.org.
> Once we added the timeout though, established connections plummeted to (as of now, for example) about 5K. The total TCP connections did NOT go down, however, because most of them are now in TIME_WAIT (92K!).

TIME_WAIT sockets are harmless on the server side. You can easily reach millions without any issues.

> The only thing this seems to affect is our monitoring system which uses netstat to get TCP stats. It sometimes takes almost a minute to run and uses 100% CPU, but otherwise doesn't seem to affect anything so we've left it for now.

There are two commands that you must absolutely never use in a monitoring system:

  - netstat -a
  - ipcs -a

Both of them will saturate the system and considerably slow it down when something starts to go wrong. For the sockets you should use what's in /proc/net/sockstat. You have all the numbers you want. If you need more details, use "ss -a" instead of "netstat -a"; it uses the netlink interface and is several orders of magnitude faster.

> So if connections are terminated because of timeouts that I have explicitly set, is there any reason to log and count that as a request error? That to me seems like something that could be logged as info for troubleshooting and not counted as an error at all. Just a thought - it's nothing major.

This is a real error. It's not because some browsers decided to do stupid things that it's not an error. Attacks aside, the first reason for not getting a request is that it is blocked by too short an MTU in some VPNs. It's very important to know that a browser could not send a POST or a request with a large cookie due to a short MTU somewhere.
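Willy's /proc/net/sockstat suggestion can be sketched like this; the sample text below only imitates the file's format (the counter values are invented for illustration), and the same awk works on the real file:

```shell
# Sketch of the advice above: read the kernel's aggregate socket counters
# instead of enumerating every socket the way "netstat -a" does. The
# sample mimics /proc/net/sockstat; values are made up.
sample='sockets: used 97500
TCP: inuse 5012 orphan 12 tw 92110 alloc 5200 mem 4096
UDP: inuse 8 mem 2'

# Walk the name/value pairs on the TCP: line and print the TIME_WAIT count.
echo "$sample" | awk '/^TCP:/ { for (i = 2; i < NF; i += 2) if ($i == "tw") print $(i+1) }'
```

On a real system you would replace `echo "$sample"` with `cat /proc/net/sockstat`, or simply run `ss -s` for a summary.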
> With dontlognull we still see this type of error at a rate of several per second:
>
>   Dec 26 18:38:17 localhost haproxy[32259]: n.n.n.n:33685 [26/Dec/2012:18:38:17.155] http-in webservers-ngfiles/ngwebNN 10/0/0/1/649 200 31671 - - CD-- 4166/4159/139/31/0 0/0 GET /images/14/o-eternal-o_the-vampire-dragon.jpg HTTP/1.1

This one happens more frequently and is also an error, as the transfer was not complete as seen from haproxy.

> This is logged when, for example, I am on the site watching a movie, and I close the browser. To me, this is another event that could be logged at the info level, but I figure there are probably good reasons why it's logged and counted as an error. It is probably just the perfectionist in me that wants the error rate to be nearly 0. :)

In fact if you want to bring the error rate to zero by hiding all errors, you're cheating :-) It seems to me that you'd like not to log errors for what happens on the client side, am I wrong? This could probably make sense in some environments and we could probably think about adding an option to do that.

Regards,
Willy
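For monitoring which termination states dominate, the two flag fields of log lines like the ones quoted in this thread can be tallied; a sketch (the field position assumes the default HTTP log layout shown above, with no captured headers, which would shift it):

```shell
# Tally haproxy termination states (CR--, cR--, CD--, ...) from syslog
# lines in the default HTTP log format. With the layout shown in this
# thread the state is the 15th whitespace-separated field; captured
# headers or a custom log-format would shift it. Sample lines below are
# modeled on the ones quoted above, with illustrative IPs and paths.
loglines='Dec 26 18:38:17 localhost haproxy[32259]: 1.2.3.4:33685 [26/Dec/2012:18:38:17.155] http-in webservers/ngweb01 10/0/0/1/649 200 31671 - - CD-- 4166/4159/139/31/0 0/0 GET /images/a.jpg HTTP/1.1
Dec 26 18:38:18 localhost haproxy[32259]: 1.2.3.5:50714 [26/Dec/2012:18:38:18.140] http-in http-in/NOSRV -1/-1/-1/-1/12343 400 212 - - CR-- 1913/1903/0/0/0 0/0 BADREQ'

echo "$loglines" | awk '{ count[$15]++ } END { for (f in count) print f, count[f] }' | sort
```

Pointed at the real log file, the same awk gives a quick breakdown of how many errors are CR (client closed) versus cR (haproxy timed the request out).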
Re: help with option httpchk - http-check expect
On 22 November 2012 22:14, Owen Marinas omari...@woozworld.com wrote:

> option httpchk POST /db/data/ext/feed/graphdb/userFeed HTTP/1.1\r\nContent-Type: application/json\r\nContent-Length: 35\r\n{userId:8, offset:0, limit:1}\r\n

It might not be related to your original question, but I think you're missing an extra \r\n that should represent the blank line between the POST's headers and its body. Also, I think the length of 35 might be wrong, as it doesn't take into account the last \r\n (which might be superfluous anyway).

HTH,
Jonathan
--
Jonathan Matthews // Oxford, London, UK
http://www.jpluscplusm.com/contact.html
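Jonathan's length question can be checked with a one-liner. This assumes the JSON keys were originally double-quoted (the list archive appears to have stripped the quotes); under that assumption the body alone is exactly 35 bytes, and Content-Length describes only the body, not any trailing \r\n:

```shell
# Measure the byte count that Content-Length must describe: the body
# only. The double quotes around the JSON keys are an assumption; the
# archive seems to have stripped them from the original message.
body='{"userId":8, "offset":0, "limit":1}'
printf '%s' "$body" | wc -c
```

If the count matches the header, the trailing \r\n was indeed superfluous rather than miscounted.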
Re: help with option httpchk - http-check expect
Thx for the advice Jonathan. Willy's advice from an old post was to make it work with printf+nc in bash first. So I did. The issue is that after I added the lines to the backend (below), the server still reported UP even if the expected string is not there.

regards,
Owen

---
backend neo4j-stg
  option httpchk POST /db/data/ext/feed/graphdb/userFeed HTTP/1.1\r\nHost: 172.23.10.61:7474\r\nAccept: */*\r\nContent-Type: application/json\r\nContent-Length: 35\r\n\r\n{userId:8, offset:0, limit:1}
  http-check disable-on-404
  http-check expect string userId
  server neo4j-stg-01 172.23.10.61:7474
  server neo4j-stg-02 172.23.10.62:7474
  balance roundrobin

On 12-11-23 05:01 AM, Jonathan Matthews wrote:

> [...]
Re: help with option httpchk - http-check expect
On 23 November 2012 17:10, Owen Marinas omari...@woozworld.com wrote:

> Thx for the advice Jonathan. Willy's advice from an old post was to make it work with printf+nc in bash first. So I did.

I think your back-end may be being lenient, then :-)

> The issue is that after I added the lines to the backend (below), the server still reported UP even if the expected string is not there.

That's initially because your server lines lack the requisite "check" flag to enable health checks. The UP represents layer-3 connectivity to the host working. I can't speak as for any problems you see after you fix that - I've never seen POSTs and request bodies being used in a health check before!

HTH,
Jonathan
--
Jonathan Matthews // Oxford, London, UK
http://www.jpluscplusm.com/contact.html
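Applied to the backend quoted earlier in the thread, Jonathan's fix is just the `check` keyword on each server line (a sketch of those two lines only; all other settings stay as Owen posted them):

```
server neo4j-stg-01 172.23.10.61:7474 check
server neo4j-stg-02 172.23.10.62:7474 check
```

Without `check`, haproxy never runs the httpchk request at all, so the servers are reported UP as long as they are reachable.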
Re: help with option httpchk - http-check expect
I hate myself for this: my production LB is running haproxy-1.4.15-1 but the staging one haproxy-1.4.8-1. After upgrading, it's all working now - the POST and http-check expect work fine.

thx all,
Owen

On 12-11-23 12:50 PM, Jonathan Matthews wrote:

> [...]
Re: help with option httpchk - http-check expect
On Fri, Nov 23, 2012 at 01:28:38PM -0500, Owen Marinas wrote:

> I hate myself for this: my production LB is running haproxy-1.4.15-1 but the staging one haproxy-1.4.8-1. After upgrading, it's all working now - the POST and http-check expect work fine.

Good reason indeed. BTW, keep in mind that even 1.4.15 is quite outdated; there were something like 75 bugs fixed since!

Willy
Re: Help with Chrome preconnects
On Wed, Jul 11, 2012 at 07:03:52AM +0200, Baptiste wrote:

> Hey,
>
> Depending on which phase of the handshake Chrome maintains the connection opened at, you can give a try to HAProxy's content inspection:
>
>   listen https
>     mode tcp
>     balance roundrobin
>     acl clienthello req_ssl_hello_type 1
>     # use tcp content accepts to detect ssl client and server hello.
>     tcp-request inspect-delay 5s
>     tcp-request content accept if clienthello

You need to add this line here:

  tcp-request content reject

because the tcp-request rules default to accept when none match. Otherwise this should indeed work.

BTW, someone recently reported to me that Chrome sometimes failed on SNI. Now I understand: the guy was mixing SSL and SSH on the same port: if the client does not send anything then it's SSH. Chrome was getting the SSH page from time to time and decided that the server did not support TLS, so it then stopped using SNI. Now I understand why it was not sending any handshake: this is because of this new broken behaviour which saves one speculative roundtrip but which also consumes memory on servers for nothing...

Regards,
Willy
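Putting Willy's correction into Baptiste's snippet gives this complete sketch; connections that have not sent a ClientHello within the inspect delay are rejected instead of falling through to an implicit accept:

```
listen https
    mode tcp
    balance roundrobin
    acl clienthello req_ssl_hello_type 1
    tcp-request inspect-delay 5s
    tcp-request content accept if clienthello
    # Without this line, non-matching connections are accepted by default.
    tcp-request content reject
```

The effect is that an idle preconnect (no bytes sent) is dropped after 5 seconds without ever reaching a backend server.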
Re: Help with Chrome preconnects
Thanks for all the replies Willy, Baptiste and Lukas. Unfortunately we could not get the tcp content trick to work. Our guess is that a Chrome preconnect actually does an SSL handshake, and so HAProxy does not have a choice and has to engage an Apache worker. We used the following:

==
listen https 0.0.0.0:443
  mode tcp
  balance roundrobin
  acl clienthello req_ssl_hello_type 1
  # use tcp content accepts to detect ssl client and server hello.
  tcp-request inspect-delay 20s
  tcp-request content accept if clienthello
  tcp-request content reject
  server inst1 127.0.0.1:443 check inter 2000 fall 3
==

Lukas's suggestion was really helpful. I do not have any experience with stud/stunnel, so I will get to these in a week or so when I get some time. For now, we are using nginx for SSL termination and going the HAProxy route for all non-SSL. nginx will route from 443 to non-SSL Apache on 80. Hopefully we will replace it with HAProxy+stud some time later.

Thanks,
Vikram

On Thu, Jul 12, 2012 at 11:39 AM, Willy Tarreau w...@1wt.eu wrote:

> [...]
Re: Help with Chrome preconnects
Sorry for the confusing config that I gave. I actually meant:

==
listen https 0.0.0.0:443
  mode tcp
  balance roundrobin
  acl clienthello req_ssl_hello_type 1
  # use tcp content accepts to detect ssl client and server hello.
  tcp-request inspect-delay 20s
  tcp-request content accept if clienthello
  tcp-request content reject
  server inst1 127.0.0.1:444 check inter 2000 fall 3
==

And on 444 I have Apache running in ssl mode.

Thanks,
Vikram

On Fri, Jul 13, 2012 at 2:16 AM, Vikram Nayak vikram.naya...@gmail.com wrote:

> [...]
RE: Help with Chrome preconnects
I would suggest terminating SSL on the haproxy box (with stud in front of it), thus switching haproxy from tcp to http mode. That way, longrunning keepalive-enabled HTTPS sessions terminate there and apache only sees real non-SSL request without blocking any threads. If you would like to avoid terminating SSL on the proxy, you could enable a haproxy+stud on the apache box (apache listens on 80/8080, and stud with haproxy listens on 443). This way, you can still use a dedicated tcp reverse proxy withouth ssl encryption, doing all the ssl work in the backend and you avoid blocking apache threads because you have an event based proxy in front of it (even if it is on the same box). Date: Wed, 11 Jul 2012 07:15:19 +0530 Subject: Help with Chrome preconnects From: vikram.naya...@gmail.com To: haproxy@formilux.org hi, I am using HAProxy 1.4.x infront of Apache 2.2.x. For SSLs, I just do a tcp redirect from port 443. Like == listen ssl-relay 0.0.0.0:443http://0.0.0.0:443 mode tcp balance roundrobin server inst1 machinename:443 check inter 2000 fall 3 == Everything was running fine till Chrome introduced preconnects. I have logged a bug at http://code.google.com/p/chromium/issues/detail?id=87121 Its a fairly long thread but the gist is the following : Chrome does some speculative SSL connects to the backend and does not close the handshake. The problem for us now is that the request goes to an Apache process and that process gets blocked for the entire duration of the timeout! If in httpd.conf we have 60seconds as timeout, there are one or two Apache processes that will get blocked in Reading request state for 60seconds thinking that the chrome user will use the connection! As you can easily see, this is really a drain on the process pool and very soon it maxes out on child processes. Is there anyway HAProxy can help here? As in, is there anyway HAProxy does not open an apache connection till there is any activity on the connection? Please let me know. 
I guess most systems would have this problem, but for some reason I cannot find much about it on Google. Or if you can think of other ways of handling this, please let me know that too. Thanks, Vikram
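A minimal sketch of the first suggestion above (stud terminating SSL in front of haproxy running in http mode). All names, ports and paths here are illustrative, not taken from the original posts, and the stud command line is shown for orientation only:

```
# stud terminates SSL on :443 and forwards decrypted traffic to haproxy:
#   stud -f *,443 -b 127.0.0.1,8080 /etc/stud/cert.pem        (sketch)

frontend http-in
    # plain HTTP decrypted by stud arrives here
    bind 127.0.0.1:8080
    mode http
    option http-server-close    # keepalive ends at the proxy, not at Apache
    default_backend apache

backend apache
    mode http
    option forwardfor           # pass the client IP on to Apache
    server apache1 192.168.0.10:80 check
```

With this in place, idle preconnect sessions are held by the event-based proxy instead of tying up an Apache worker.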
RE: Help with ACL
Hi Baptiste, Thank you for pointing that out.. :) After your example I could see what had eluded me in the documentation. From section 7.7, "Using ACLs to form conditions":

acl url_static path_beg /static /images /img /css
acl url_static path_end .gif .png .jpg .css .js
acl host_www hdr_beg(host) -i www
acl host_static hdr_beg(host) -i img. video. download. ftp.
...
use_backend static if host_static or host_www url_static
^^^
Perhaps this line (also from section 7.7)

[!]acl1 [!]acl2 ... [!]acln { or [!]acl1 [!]acl2 ... [!]acln } ...

should have a few more examples showing "and", "or" and negation in use than just the one. Regards, Jens Dueholm Christensen

-----Original Message----- From: Baptiste [mailto:bed...@gmail.com] Sent: Thursday, March 22, 2012 6:02 AM To: Jens Dueholm Christensen (JEDC) Cc: haproxy@formilux.org Subject: Re: Help with ACL

Hi Jens, No need to apologize, you may have helped a few other people ;) You can also do this:

acl acl_myip src 1.1.1.1
acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
use_backend new_collectors if acl_myip acl_collector

== AND is implicit. Regards

On Wed, Mar 21, 2012 at 11:46 PM, Jens Dueholm Christensen (JEDC) jens.dueh...@r-m.com wrote: Oh.. It just hit me.. I could just do this:

acl acl_test src 1.1.1.1
acl acl_test path_beg -f /etc/haproxy/collector_patterns.lst
acl acl_test hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
use_backend new_collectors if acl_test

Sorry for bothering the mailing list about this - somehow I was focused on reusing my existing acl_collector and never thought about building a new ACL with the correct rules.. :) Regards, Jens Dueholm Christensen

From: Jens Dueholm Christensen (JEDC) [jens.dueh...@r-m.com] Sent: 21 March 2012 23:32 To: haproxy@formilux.org Subject: RE: Help with ACL

Hi Baptiste, I can see I forgot to add some more information to my previous mail.. Existing functionality (i.e. ACLs and sorting into backends) and traffic must not be changed. There is a lot of traffic to other parts of the system (i.e. to the admin or webservice backends) that comes from the same IP that I'm going to be testing from. Is it possible to bundle ACLs so that backend matching depends on more than one ACL being matched? I'm looking for something like this (here the acl_collector ACL is the same as in my config):

acl acl_myip src 1.1.1.1
acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
use_backend new_collectors if (acl_myip acl_collector)

Then only the traffic that would normally be matched by acl_collector would be sent to the new_collectors backend, and only if it was coming from 1.1.1.1. Regards, Jens Dueholm Christensen

From: Baptiste [bed...@gmail.com] Sent: 21 March 2012 22:02 To: Jens Dueholm Christensen (JEDC) Cc: haproxy@formilux.org Subject: Re: Help with ACL

Hi Jens, You can set up two ACLs, one with the IPs and one with your header, and use them on the use_backend line:

acl myip src 1.1.1.1 1.1.1.2
acl myheader hdr(MyHeader) keyword
use_backend acl_collector if myip || myheader

Note that the use_backend order matters. The first matching rule will be used, so it's up to you to set them in the best order for your needs. Regards

On Wed, Mar 21, 2012 at 9:52 PM, Jens Dueholm Christensen (JEDC) jens.dueh...@r-m.com wrote: Hi, I'm having trouble wrapping my head around what I believe is a really simple problem. I've got a working HAProxy setup with a few listeners, a few backends and some ACLs that direct traffic accordingly. Now I'm about to add a new backend for some function-testing in this setup, and I want to restrict what ends up there. This is a thinned-down version of my configuration (oh, global or default-level ACLs would be nice..):

---
global
    ...

defaults
    mode http
    balance roundrobin

listen in-DK
    bind 127.0.0.1:4431
    acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
    acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
    acl acl_webservice path_beg /services
    use_backend collectors if acl_collector
    use_backend webservice if acl_webservice
    default_backend admin

listen in-NO
    bind 127.0.0.1:4432
    acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
    acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
    acl acl_webservice path_beg /services
    use_backend collectors if acl_collector
    use_backend webservice if acl_webservice
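For reference, the combination this thread converges on can be written without touching the existing ACLs: two ACLs with different names are ANDed implicitly on the use_backend line, while the two acl_collector lines sharing one name are ORed. A sketch using the poster's own names (1.1.1.1 is his placeholder test IP):

```
acl acl_myip      src 1.1.1.1
acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst

# matches: src is 1.1.1.1 AND (path matches OR Referer matches)
use_backend new_collectors if acl_myip acl_collector
```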
Re: Help with ACL
Hi Jens, You can set up two ACLs, one with the IPs and one with your header, and use them on the use_backend line:

acl myip src 1.1.1.1 1.1.1.2
acl myheader hdr(MyHeader) keyword
use_backend acl_collector if myip || myheader

Note that the use_backend order matters. The first matching rule will be used, so it's up to you to set them in the best order for your needs. Regards

On Wed, Mar 21, 2012 at 9:52 PM, Jens Dueholm Christensen (JEDC) jens.dueh...@r-m.com wrote: Hi, I'm having trouble wrapping my head around what I believe is a really simple problem. I've got a working HAProxy setup with a few listeners, a few backends and some ACLs that direct traffic accordingly. Now I'm about to add a new backend for some function-testing in this setup, and I want to restrict what ends up there. This is a thinned-down version of my configuration (oh, global or default-level ACLs would be nice..):

---
global
    ...

defaults
    mode http
    balance roundrobin

listen in-DK
    bind 127.0.0.1:4431
    acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
    acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
    acl acl_webservice path_beg /services
    use_backend collectors if acl_collector
    use_backend webservice if acl_webservice
    default_backend admin

listen in-NO
    bind 127.0.0.1:4432
    acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
    acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
    acl acl_webservice path_beg /services
    use_backend collectors if acl_collector
    use_backend webservice if acl_webservice
    default_backend admin

backend admin
    server admin1 172.27.80.36:8080 id 1 maxconn 500 check observe layer7

backend webservice
    server webservice1 172.27.80.37:8080 id 2 maxconn 500 check observe layer7

backend collectors
    server collector1 172.27.80.38:8080 id 3 maxconn 1000 check observe layer7
    server collector1 172.27.80.39:8080 id 4 maxconn 1000 check observe layer7
---

The file /etc/haproxy/collector_patterns.lst contains these 3 lines:
---
/collect
/answer
/LinkCollector
---

This new backend I want for testing (let's call it new_collectors) should receive the traffic the existing ACL acl_collector directs to the backend collectors, but ONLY if that traffic comes from a certain IP or contains a certain HTTP header. How do I manage that? Regards, Jens Dueholm Christensen
RE: Help with ACL
Hi Baptiste, I can see I forgot to add some more information to my previous mail.. Existing functionality (i.e. ACLs and sorting into backends) and traffic must not be changed. There is a lot of traffic to other parts of the system (i.e. to the admin or webservice backends) that comes from the same IP that I'm going to be testing from. Is it possible to bundle ACLs so that backend matching depends on more than one ACL being matched? I'm looking for something like this (here the acl_collector ACL is the same as in my config):

acl acl_myip src 1.1.1.1
acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
use_backend new_collectors if (acl_myip acl_collector)

Then only the traffic that would normally be matched by acl_collector would be sent to the new_collectors backend, and only if it was coming from 1.1.1.1. Regards, Jens Dueholm Christensen

From: Baptiste [bed...@gmail.com] Sent: 21 March 2012 22:02 To: Jens Dueholm Christensen (JEDC) Cc: haproxy@formilux.org Subject: Re: Help with ACL

Hi Jens, You can set up two ACLs, one with the IPs and one with your header, and use them on the use_backend line:

acl myip src 1.1.1.1 1.1.1.2
acl myheader hdr(MyHeader) keyword
use_backend acl_collector if myip || myheader

Note that the use_backend order matters. The first matching rule will be used, so it's up to you to set them in the best order for your needs. Regards

On Wed, Mar 21, 2012 at 9:52 PM, Jens Dueholm Christensen (JEDC) jens.dueh...@r-m.com wrote: Hi, I'm having trouble wrapping my head around what I believe is a really simple problem. I've got a working HAProxy setup with a few listeners, a few backends and some ACLs that direct traffic accordingly. Now I'm about to add a new backend for some function-testing in this setup, and I want to restrict what ends up there. This is a thinned-down version of my configuration (oh, global or default-level ACLs would be nice..):

---
global
    ...

defaults
    mode http
    balance roundrobin

listen in-DK
    bind 127.0.0.1:4431
    acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
    acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
    acl acl_webservice path_beg /services
    use_backend collectors if acl_collector
    use_backend webservice if acl_webservice
    default_backend admin

listen in-NO
    bind 127.0.0.1:4432
    acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
    acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
    acl acl_webservice path_beg /services
    use_backend collectors if acl_collector
    use_backend webservice if acl_webservice
    default_backend admin

backend admin
    server admin1 172.27.80.36:8080 id 1 maxconn 500 check observe layer7

backend webservice
    server webservice1 172.27.80.37:8080 id 2 maxconn 500 check observe layer7

backend collectors
    server collector1 172.27.80.38:8080 id 3 maxconn 1000 check observe layer7
    server collector1 172.27.80.39:8080 id 4 maxconn 1000 check observe layer7
---

The file /etc/haproxy/collector_patterns.lst contains these 3 lines:
---
/collect
/answer
/LinkCollector
---

This new backend I want for testing (let's call it new_collectors) should receive the traffic the existing ACL acl_collector directs to the backend collectors, but ONLY if that traffic comes from a certain IP or contains a certain HTTP header. How do I manage that? Regards, Jens Dueholm Christensen
RE: Help with ACL
Oh.. It just hit me.. I could just do this:

acl acl_test src 1.1.1.1
acl acl_test path_beg -f /etc/haproxy/collector_patterns.lst
acl acl_test hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
use_backend new_collectors if acl_test

Sorry for bothering the mailing list about this - somehow I was focused on reusing my existing acl_collector and never thought about building a new ACL with the correct rules.. :) Regards, Jens Dueholm Christensen

From: Jens Dueholm Christensen (JEDC) [jens.dueh...@r-m.com] Sent: 21 March 2012 23:32 To: haproxy@formilux.org Subject: RE: Help with ACL

Hi Baptiste, I can see I forgot to add some more information to my previous mail.. Existing functionality (i.e. ACLs and sorting into backends) and traffic must not be changed. There is a lot of traffic to other parts of the system (i.e. to the admin or webservice backends) that comes from the same IP that I'm going to be testing from. Is it possible to bundle ACLs so that backend matching depends on more than one ACL being matched? I'm looking for something like this (here the acl_collector ACL is the same as in my config):

acl acl_myip src 1.1.1.1
acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
use_backend new_collectors if (acl_myip acl_collector)

Then only the traffic that would normally be matched by acl_collector would be sent to the new_collectors backend, and only if it was coming from 1.1.1.1. Regards, Jens Dueholm Christensen

From: Baptiste [bed...@gmail.com] Sent: 21 March 2012 22:02 To: Jens Dueholm Christensen (JEDC) Cc: haproxy@formilux.org Subject: Re: Help with ACL

Hi Jens, You can set up two ACLs, one with the IPs and one with your header, and use them on the use_backend line:

acl myip src 1.1.1.1 1.1.1.2
acl myheader hdr(MyHeader) keyword
use_backend acl_collector if myip || myheader

Note that the use_backend order matters. The first matching rule will be used, so it's up to you to set them in the best order for your needs. Regards

On Wed, Mar 21, 2012 at 9:52 PM, Jens Dueholm Christensen (JEDC) jens.dueh...@r-m.com wrote: Hi, I'm having trouble wrapping my head around what I believe is a really simple problem. I've got a working HAProxy setup with a few listeners, a few backends and some ACLs that direct traffic accordingly. Now I'm about to add a new backend for some function-testing in this setup, and I want to restrict what ends up there. This is a thinned-down version of my configuration (oh, global or default-level ACLs would be nice..):

---
global
    ...

defaults
    mode http
    balance roundrobin

listen in-DK
    bind 127.0.0.1:4431
    acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
    acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
    acl acl_webservice path_beg /services
    use_backend collectors if acl_collector
    use_backend webservice if acl_webservice
    default_backend admin

listen in-NO
    bind 127.0.0.1:4432
    acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
    acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
    acl acl_webservice path_beg /services
    use_backend collectors if acl_collector
    use_backend webservice if acl_webservice
    default_backend admin

backend admin
    server admin1 172.27.80.36:8080 id 1 maxconn 500 check observe layer7

backend webservice
    server webservice1 172.27.80.37:8080 id 2 maxconn 500 check observe layer7

backend collectors
    server collector1 172.27.80.38:8080 id 3 maxconn 1000 check observe layer7
    server collector1 172.27.80.39:8080 id 4 maxconn 1000 check observe layer7
---

The file /etc/haproxy/collector_patterns.lst contains these 3 lines:
---
/collect
/answer
/LinkCollector
---

This new backend I want for testing (let's call it new_collectors) should receive the traffic the existing ACL acl_collector directs to the backend collectors, but ONLY if that traffic comes from a certain IP or contains a certain HTTP header. How do I manage that? Regards, Jens Dueholm Christensen
Re: Help with ACL
Hi Jens, No need to apologize, you may have helped a few other people ;) You can also do this:

acl acl_myip src 1.1.1.1
acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
use_backend new_collectors if acl_myip acl_collector

== AND is implicit. Regards

On Wed, Mar 21, 2012 at 11:46 PM, Jens Dueholm Christensen (JEDC) jens.dueh...@r-m.com wrote: Oh.. It just hit me.. I could just do this:

acl acl_test src 1.1.1.1
acl acl_test path_beg -f /etc/haproxy/collector_patterns.lst
acl acl_test hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
use_backend new_collectors if acl_test

Sorry for bothering the mailing list about this - somehow I was focused on reusing my existing acl_collector and never thought about building a new ACL with the correct rules.. :) Regards, Jens Dueholm Christensen

From: Jens Dueholm Christensen (JEDC) [jens.dueh...@r-m.com] Sent: 21 March 2012 23:32 To: haproxy@formilux.org Subject: RE: Help with ACL

Hi Baptiste, I can see I forgot to add some more information to my previous mail.. Existing functionality (i.e. ACLs and sorting into backends) and traffic must not be changed. There is a lot of traffic to other parts of the system (i.e. to the admin or webservice backends) that comes from the same IP that I'm going to be testing from. Is it possible to bundle ACLs so that backend matching depends on more than one ACL being matched? I'm looking for something like this (here the acl_collector ACL is the same as in my config):

acl acl_myip src 1.1.1.1
acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
use_backend new_collectors if (acl_myip acl_collector)

Then only the traffic that would normally be matched by acl_collector would be sent to the new_collectors backend, and only if it was coming from 1.1.1.1. Regards, Jens Dueholm Christensen

From: Baptiste [bed...@gmail.com] Sent: 21 March 2012 22:02 To: Jens Dueholm Christensen (JEDC) Cc: haproxy@formilux.org Subject: Re: Help with ACL

Hi Jens, You can set up two ACLs, one with the IPs and one with your header, and use them on the use_backend line:

acl myip src 1.1.1.1 1.1.1.2
acl myheader hdr(MyHeader) keyword
use_backend acl_collector if myip || myheader

Note that the use_backend order matters. The first matching rule will be used, so it's up to you to set them in the best order for your needs. Regards

On Wed, Mar 21, 2012 at 9:52 PM, Jens Dueholm Christensen (JEDC) jens.dueh...@r-m.com wrote: Hi, I'm having trouble wrapping my head around what I believe is a really simple problem. I've got a working HAProxy setup with a few listeners, a few backends and some ACLs that direct traffic accordingly. Now I'm about to add a new backend for some function-testing in this setup, and I want to restrict what ends up there. This is a thinned-down version of my configuration (oh, global or default-level ACLs would be nice..):

---
global
    ...

defaults
    mode http
    balance roundrobin

listen in-DK
    bind 127.0.0.1:4431
    acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
    acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
    acl acl_webservice path_beg /services
    use_backend collectors if acl_collector
    use_backend webservice if acl_webservice
    default_backend admin

listen in-NO
    bind 127.0.0.1:4432
    acl acl_collector path_beg -f /etc/haproxy/collector_patterns.lst
    acl acl_collector hdr_sub(Referer) -f /etc/haproxy/collector_patterns.lst
    acl acl_webservice path_beg /services
    use_backend collectors if acl_collector
    use_backend webservice if acl_webservice
    default_backend admin

backend admin
    server admin1 172.27.80.36:8080 id 1 maxconn 500 check observe layer7

backend webservice
    server webservice1 172.27.80.37:8080 id 2 maxconn 500 check observe layer7

backend collectors
    server collector1 172.27.80.38:8080 id 3 maxconn 1000 check observe layer7
    server collector1 172.27.80.39:8080 id 4 maxconn 1000 check observe layer7
---

The file /etc/haproxy/collector_patterns.lst contains these 3 lines:
---
/collect
/answer
/LinkCollector
---

This new backend I want for testing (let's call it new_collectors) should receive the traffic the existing ACL acl_collector directs to the backend collectors, but ONLY if that traffic comes from a certain IP or contains a certain HTTP header. How do I manage that? Regards, Jens Dueholm Christensen
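Because the thread stresses that use_backend order matters (the first matching rule wins), the new test rule has to be placed before the existing collectors rule inside each listen section. A sketch against the poster's configuration:

```
listen in-DK
    bind 127.0.0.1:4431
    # ACL definitions unchanged, plus the new acl_myip
    # narrower test rule first: IP match AND collector match
    use_backend new_collectors if acl_myip acl_collector
    # existing rules unchanged below
    use_backend collectors if acl_collector
    use_backend webservice if acl_webservice
    default_backend admin
```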
Re: Help determining where the bottleneck is
Thanks for the response. The stats were lagging, actually; we determined that the bottleneck was before HAProxy (it ended up being the IPS in front of the network). However, our Linux guy suggested the following sysctl changes to improve throughput, which I will share here:

net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65023
net.ipv4.tcp_max_syn_backlog = 10
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_synack_retries = 2
net.core.somaxconn = 6
net.core.netdev_max_backlog = 1

On Sun, Jan 29, 2012 at 5:26 AM, Willy Tarreau w...@1wt.eu wrote: Hi Steve, On Tue, Jan 24, 2012 at 08:55:15AM -0800, Steve V wrote: Good morning, Much love for haproxy and many thanks to all who have worked on and contributed to it. We have been using it for several years without issue. However, we have been doing load testing lately and there appears to be a bottleneck. It may not even have to do with haproxy (I don't think it does), but I need to double-check anyway just to be thorough and cover all our bases. Hardware: a VM running on ESXi with 2 GB of RAM and 2 CPUs. Guest OS: CentOS 5. HAProxy version: 1.4.8 (however, we just upgraded to 1.4.19 last night). Problem: when second_proxy is getting hammered by a load test, site performance decreases to the point where the site is barely usable and the majority of pages time out. However, go to a different site that is in the same haproxy config, listening on http_proxy and going to the same backend server, and the site comes up fine and fast. It seems like something is being throttled or queued somewhere. It's possible that it could be an issue behind haproxy on the app servers, but I just want to make sure there is nothing I need to tweak in my config. Here is a snapshot of the haproxy stats page for the slow pool second_proxy: http://tinypic.com/r/15887qf/5

Did you tune any sysctls on your system? Your snapshot reports a peak of 1600 conns/second, but the default kernel settings (somaxconn 128 and tcp_max_syn_backlog 1024) make this hard to reach, so it's very possible that the socket queue is simply full. I'm used to setting both between 1 and 2 with good success. There is something you can try in order to detect whether haproxy still accepts connections fine: simply try to connect to the stats URL on the unresponsive port. If the stats display properly, then you're stuck on the servers. If the stats do not respond either, then the connection is not being accepted. Be careful: you have no maxconn setting in the defaults section, and by default a listen uses 2000. I see that your snapshot indicates this limit was not reached; still, I wanted to let you know it's going to be the next issue once this one is resolved.

here is my haproxy.cfg:

global
    maxconn 8096
    daemon
    nbproc 1
    stats socket /var/run/haproxy.stat

defaults
    clitimeout 60
    srvtimeout 60

Do you realize that this is 10 minutes (we're speaking HTTP here)? Regards, Willy
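The tunables Willy discusses can be made persistent in a sysctl drop-in file. The archived message lost the exact figures he recommends, so the values below are deliberately left as placeholders to be sized against the expected connection rate, not recommendations from the original mail:

```
# /etc/sysctl.d/99-haproxy.conf (sketch; pick values for your load)
# listen backlog -- raise well above the peak accept rate (default is 128)
net.core.somaxconn = <your value>
# half-open connection queue (default is 1024)
net.ipv4.tcp_max_syn_backlog = <your value>
```

Apply with `sysctl --system` (or `sysctl -p <file>`) and confirm with `sysctl net.core.somaxconn`.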
Re: Help determining where the bottleneck is
Hi Steve, On Tue, Jan 24, 2012 at 08:55:15AM -0800, Steve V wrote: Good morning, Much love for haproxy and many thanks to all who have worked on and contributed to it. We have been using it for several years without issue. However, we have been doing load testing lately and there appears to be a bottleneck. It may not even have to do with haproxy (I don't think it does), but I need to double-check anyway just to be thorough and cover all our bases. Hardware: a VM running on ESXi with 2 GB of RAM and 2 CPUs. Guest OS: CentOS 5. HAProxy version: 1.4.8 (however, we just upgraded to 1.4.19 last night). Problem: when second_proxy is getting hammered by a load test, site performance decreases to the point where the site is barely usable and the majority of pages time out. However, go to a different site that is in the same haproxy config, listening on http_proxy and going to the same backend server, and the site comes up fine and fast. It seems like something is being throttled or queued somewhere. It's possible that it could be an issue behind haproxy on the app servers, but I just want to make sure there is nothing I need to tweak in my config. Here is a snapshot of the haproxy stats page for the slow pool second_proxy: http://tinypic.com/r/15887qf/5

Did you tune any sysctls on your system? Your snapshot reports a peak of 1600 conns/second, but the default kernel settings (somaxconn 128 and tcp_max_syn_backlog 1024) make this hard to reach, so it's very possible that the socket queue is simply full. I'm used to setting both between 1 and 2 with good success. There is something you can try in order to detect whether haproxy still accepts connections fine: simply try to connect to the stats URL on the unresponsive port. If the stats display properly, then you're stuck on the servers. If the stats do not respond either, then the connection is not being accepted. Be careful: you have no maxconn setting in the defaults section, and by default a listen uses 2000. I see that your snapshot indicates this limit was not reached; still, I wanted to let you know it's going to be the next issue once this one is resolved.

here is my haproxy.cfg:

global
    maxconn 8096
    daemon
    nbproc 1
    stats socket /var/run/haproxy.stat

defaults
    clitimeout 60
    srvtimeout 60

Do you realize that this is 10 minutes (we're speaking HTTP here)? Regards, Willy
Re: Help determining where the bottleneck is
Hi Steve, Are you using vSphere 4 or above? Since you're using option httpclose, I recommend moving to the roundrobin load-balancing algorithm. With httpclose, HTTP connections to the servers may be very short, so leastconn is not appropriate there and rr will provide better balancing. Would it be possible to have a screenshot of the whole HAProxy process? I want to compare numbers from both proxies. Cheers
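Baptiste's suggestion amounts to a one-line change in each listen section. A sketch against the address and port used in the poster's configuration:

```
listen second_proxy 11.10.15.108:8181
    mode http
    option httpclose
    # with httpclose every request opens a fresh, short-lived server
    # connection, so connection counts hover near zero and leastconn
    # degenerates; plain round-robin spreads the load evenly
    balance roundrobin
```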
Re: Help determining where the bottleneck is
Good morning, Much love for haproxy and many thanks to all who have worked on and contributed to it. We have been using it for several years without issue. However, we have been doing load testing lately and there appears to be a bottleneck. It may not even have to do with haproxy (I don't think it does), but I need to double-check anyway just to be thorough and cover all our bases. Hardware: a VM running on ESXi with 2 GB of RAM and 2 CPUs. Guest OS: CentOS 5. HAProxy version: 1.4.8 (however, we just upgraded to 1.4.19 last night). Problem: when second_proxy is getting hammered by a load test, site performance decreases to the point where the site is barely usable and the majority of pages time out. However, go to a different site that is in the same haproxy config, listening on http_proxy and going to the same backend server, and the site comes up fine and fast. It seems like something is being throttled or queued somewhere. It's possible that it could be an issue behind haproxy on the app servers, but I just want to make sure there is nothing I need to tweak in my config.
Here is a snapshot of the haproxy stats page for the slow pool second_proxy: http://tinypic.com/r/15887qf/5

here is my haproxy.cfg:

global
    maxconn 8096
    daemon
    nbproc 1
    stats socket /var/run/haproxy.stat

defaults
    clitimeout 60
    srvtimeout 60
    contimeout 9000
    option httpclose
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice

listen http_proxy 11.10.15.108:80
    mode http
    acl acl_www18 url_sub www18dir   # these are here so we can test an individual server
    acl acl_www19 url_sub www19dir
    acl acl_wwwkj3 url_sub wwwkj3dir
    acl acl_wwwkj5 url_sub wwwkj5dir
    use_backend www18 if acl_www18
    use_backend www19 if acl_www19
    use_backend wwwkj3 if acl_wwwkj3
    use_backend wwwkj5 if acl_wwwkj5
    cookie EWNSERVERID insert
    balance leastconn
    option httpchk HEAD /check.txt HTTP/1.0
    option forwardfor
    server EWN-www18 11.10.15.18:80 cookie server18 weight 10 maxconn 2000 check port 8081
    server EWN-www19 11.10.15.19:80 cookie server19 weight 10 maxconn 2000 check port 8081
    server EWN-wwwkj3 11.10.15.191:80 cookie serverkj3 weight 10 maxconn 2000 check port 8081
    server EWN-wwwkj5 11.10.15.194:80 cookie serverkj5 weight 10 maxconn 2000 check port 8081
    stats enable
    stats auth (user):(pass)

listen second_proxy 11.10.15.108:8181
    mode http
    acl acl_www18 url_sub www18dir   # these are here so we can test an individual server
    acl acl_www19 url_sub www19dir
    acl acl_wwwkj3 url_sub wwwkj3dir
    acl acl_wwwkj4 url_sub wwwkj4dir
    acl acl_wwwkj5 url_sub wwwkj5dir
    acl acl_wwwkj6 url_sub wwwkj6dir
    acl acl_wwwkj7 url_sub wwwkj7dir
    acl acl_wwwkj8 url_sub wwwkj8dir
    use_backend www18 if acl_www18
    use_backend www19 if acl_www19
    use_backend wwwkj3 if acl_wwwkj3
    use_backend wwwkj4 if acl_wwwkj4
    use_backend wwwkj5 if acl_wwwkj5
    use_backend wwwkj6 if acl_wwwkj6
    use_backend wwwkj7 if acl_wwwkj7
    use_backend wwwkj8 if acl_wwwkj8
    cookie EWNSERVERID insert
    balance leastconn
    option httpchk HEAD /check.txt HTTP/1.0
    option forwardfor
    server EWN-www18 11.10.15.18:80 cookie server18 weight 10 maxconn 2000 check port 8081
    server EWN-www19 11.10.15.19:80 cookie server19 weight 10 maxconn 2000 check port 8081
    server EWN-wwwkj3 11.10.15.191:80 cookie serverkj3 weight 10 maxconn 2000 check port 8081
    server EWN-wwwkj4 11.10.15.196:80 cookie serverkj4 weight 10 maxconn 2000 check port 8081
    server EWN-wwwkj5 11.10.15.194:80 cookie serverkj5 weight 10 maxconn 2000 check port 8081
    server EWN-wwwkj6 11.10.15.197:80 cookie serverkj6 weight 10 maxconn 2000 check port 8081
    server EWN-wwwkj7 11.10.15.192:80 cookie serverkj7 weight 10 maxconn 2000 check port 8081
    server EWN-wwwkj8 11.10.15.193:80 cookie serverkj8 weight 10 maxconn 2000 check port 8081
    stats enable
    stats auth (user):(pass)

Backends then follow, but I have omitted them as they are only used when we are testing an individual server.
Re: Help with http ACL
Hi Sean, On Fri, Jan 06, 2012 at 02:16:44PM -0500, Sean Patronis wrote: Well, I think I figured it out, though I am not sure it is the most efficient way. First I created a match ACL in the frontend:

acl is_apps_match url_dir apps

then in the backend, I created a rewrite:

reqrep ^([^\ ]*)\ /apps/(.*) \1\ /\2

That is exactly the principle. However, you have to keep in mind that remapping URLs is almost always the wrong thing to do, because your application's URLs will not match the browser's URLs anymore, and while it can work fine at the beginning, in the long term you'll surely regret it. For instance, you'll have to rewrite the Location headers in your redirects too. If your application presents some absolute links, you'll have to change them. Once you do that, you can't test your application anymore by directly connecting to it with a browser; you'll have to connect through haproxy, which makes development more cumbersome. It's important never to forget this rule: rewrite rules always imply more rewrite rules, and you can never get out of that spiral. Is there a more efficient way? The most efficient way is simply not to transform them, and have your application server rely on the host and path since they're left untouched. However, concerning the frontend rules, you can factor them out:

acl redir-app1 url_beg /app1
acl redir-app2 url_beg /app2
acl redir-app3 url_beg /app3
use_backend apps_cluster if redir-app1
use_backend apps_cluster if redir-app2
use_backend apps_cluster if redir-app3

may become (1):

acl redir-app url_beg /app1
acl redir-app url_beg /app2
acl redir-app url_beg /app3
use_backend apps_cluster if redir-app

then (2):

acl redir-app url_beg /app1 /app2 /app3
use_backend apps_cluster if redir-app

and then (3):

use_backend apps_cluster if { url_beg /app1 /app2 /app3 }

The third form is only handy for short URLs when you have many different app clusters, because it makes the switching rules more readable.
In your situation, if you have only one app cluster, the second form is probably much better, because you can have for instance one acl line per hosted application with its various possible URLs on the same line. Regards, Willy
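Willy's warning that rewrite rules always imply more rewrite rules can be made concrete: once the request path is rewritten, redirects returned by the application point at the stripped path, so a matching response-side rewrite becomes necessary. A hedged sketch (the rsprep line is illustrative, not from the thread, and its regex would need testing against real Location headers):

```
backend apps_cluster
    # strip the /apps prefix on the way in (Sean's rule)
    reqrep ^([^\ ]*)\ /apps/(.*) \1\ /\2
    # ...and re-add it on redirects coming back out: the extra rule
    # the first rewrite drags in (illustrative only)
    rsprep ^Location:\ /(.*) Location:\ /apps/\1
```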
Re: Help with http ACL
Well, I think I figured it out, though I am not sure it is the most efficient way. First I created a match ACL in the frontend:

acl is_apps_match url_dir apps

then in the backend, I created a rewrite:

reqrep ^([^\ ]*)\ /apps/(.*) \1\ /\2

Is there a more efficient way? --Sean

On 01/06/2012 11:43 AM, Sean Patronis wrote: I would like to create an ACL for one of my balance nodes and would like to know the best way to implement it. Here is a current frontend I have configured with some sample ACLs. I would like to change these ACLs and condense them if possible:

frontend test.domain.com 10.0.0.10:80
    option http-server-close
    acl redir-app1 url_beg /app1
    acl redir-app2 url_beg /app2
    acl redir-app3 url_beg /app3
    use_backend apps_cluster if redir-app1
    use_backend apps_cluster if redir-app2
    use_backend apps_cluster if redir-app3
    default_backend default.domain.com

Currently, requests come into test.domain.com. If they are going to test.domain.com/app1 (or /app2, or /app3), they get redirected to our apps server cluster (which will be apps.domain.com/app1). All other requests just fall out to the default.domain.com backend. What I would like to do is have requests that come into test.domain.com/apps/app1 be redirected to the apps_cluster backend via apps.domain.com/app1. The idea behind this is that I will not need to create a new ACL for every app in our application cluster. Any requests for test.domain.com/apps/AnyAppHere will be proxied to apps.domain.com/AnyAppHere, basically stripping the /apps from the URL before passing it. What would be the best way to compose this ACL? Thanks for the help. --Sean
Re: Help with reqirep
Solved! Here it is: backend boappsrv mode http option forwardfor option httpclose reqirep ^([^\ ]*)\ /bologna/geamappa(.*) \1\ \2 server bo-appsrv bo-appsrv4-bo.arpa.emr.net:8080 maxconn 50 TY All, Rune
Re: Help with SSL
Hi Christophe, On 03.11.2011 22:00, Christophe Rahier wrote: Hello, My config of HAProxy is: -- CUT -- [snipp] -- CUT -- The problem with SSL is that the IP address that I get to the web server is the IP address of the loadbalancer and not the original IP address. This is a big problem for me and it's essential that I can have the right IP address. How can I do this, is it possible? I've heard of stunnel but I don't understand how to use it. Thank you in advance for your help. You must use http://www.stunnel.org/static/stunnel.html with protocol = proxy in stunnel and use 'accept-proxy' in haproxy http://haproxy.1wt.eu/git?p=haproxy.git;a=blob;f=doc/configuration.txt;h=8aeeb272d0aeca7477bbb634b52181121122b865;hb=HEAD#l1580 as a bind option http://haproxy.1wt.eu/git?p=haproxy.git;a=blob;f=doc/configuration.txt;h=8aeeb272d0aeca7477bbb634b52181121122b865;hb=HEAD#l1453 and the 'option forwardfor' http://haproxy.1wt.eu/git?p=haproxy.git;a=blob;f=doc/configuration.txt;h=8aeeb272d0aeca7477bbb634b52181121122b865;hb=HEAD#l3111 so haproxy automatically fills the client IP into the X-Forwarded-For header field. I assume this from the doc. Please can you tell us if this is right? Hth Aleks PS: did you receive my answer on the stunnel list?
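As a concrete sketch of Aleks' suggestion (the certificate path, addresses and section name are assumptions, and stunnel's PROXY protocol support requires a recent or patched build):

```
# stunnel.conf: terminate SSL, prepend the PROXY protocol header
[https]
accept   = 0.0.0.0:443
connect  = 127.0.0.1:8443
cert     = /etc/stunnel/stunnel.pem
protocol = proxy

# haproxy.cfg: read the PROXY header on the bind line, then pass the
# real client address on to the servers in X-Forwarded-For
frontend https-in
    bind 127.0.0.1:8443 accept-proxy
    mode http
    option forwardfor
    default_backend webfarm
```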
Re: Help with SSL
Hi Aleks, Thanks for your help. I received your answer yesterday but it was too late for answering, I was too tired :-) I'll check what you proposed. Thanks once again, Christophe Le 04/11/11 09:41, « Aleksandar Lazic » al-hapr...@none.at a écrit : Hi Christophe, On 03.11.2011 22:00, Christophe Rahier wrote: Hello, My config of HAProxy is: -- CUT -- [snipp] -- CUT -- The problem with SSL is that the IP address that I get to the web server is the IP address of the loadbalancer and not the original IP address. This is a big problem for me and it's essential that I can have the right IP address. How can I do this, is it possible? I've heard of stunnel but I don't understand how to use it. Thank you in advance for your help. You must use http://www.stunnel.org/static/stunnel.html with protocol = proxy in stunnel and use 'accept-proxy' in haproxy http://haproxy.1wt.eu/git?p=haproxy.git;a=blob;f=doc/configuration.txt;h=8aeeb272d0aeca7477bbb634b52181121122b865;hb=HEAD#l1580 as a bind option http://haproxy.1wt.eu/git?p=haproxy.git;a=blob;f=doc/configuration.txt;h=8aeeb272d0aeca7477bbb634b52181121122b865;hb=HEAD#l1453 and the 'option forwardfor' http://haproxy.1wt.eu/git?p=haproxy.git;a=blob;f=doc/configuration.txt;h=8aeeb272d0aeca7477bbb634b52181121122b865;hb=HEAD#l3111 so haproxy automatically fills the client IP into the X-Forwarded-For header field. I assume this from the doc. Please can you tell us if this is right? Hth Aleks PS: did you receive my answer on the stunnel list?
Re: Help with SSL
On Fri, 04 Nov 2011 09:41:00 +0100, Aleksandar Lazic wrote: you must use http://www.stunnel.org/static/stunnel.html protocol = proxy In this case, you need the latest stunnel (4.45).
Re: Help with SSL
Hi Christophe, Use the HAProxy box in transparent mode: HAProxy will get connected to your application server using the client IP. In your backend, just add the line: source 0.0.0.0 usesrc clientip Bear in mind that in such a configuration, the default gateway of your servers must be the HAProxy box, or you have to configure PBR (policy-based routing) on your network. Stunnel can be used in front of HAProxy to decrypt the traffic. But if your main issue is to get the client IP, then it won't help you unless you set up transparent mode as explained above. cheers On Thu, Nov 3, 2011 at 10:00 PM, Christophe Rahier christo...@qualifio.com wrote: Hello, My config of HAProxy is: -- CUT -- global log 192.168.0.2 local0 log 127.0.0.1 local1 notice maxconn 10240 defaults log global option dontlognull retries 2 timeout client 35s timeout server 90s timeout connect 5s timeout http-keep-alive 10s listen WebPlayer-Farm 192.168.0.2:80 mode http option httplog balance source #balance leastconn option forwardfor stats enable option http-server-close server Player4 192.168.0.13:80 check server Player3 192.168.0.12:80 check server Player1 192.168.0.10:80 check server Player2 192.168.0.11:80 check server Player5 192.168.0.14:80 check option httpchk HEAD /checkCF.cfm HTTP/1.0 listen WebPlayer-Farm-SSL 192.168.0.2:443 mode tcp option ssl-hello-chk balance source server Player4 192.168.0.13:443 check server Player3 192.168.0.12:443 check server Player1 192.168.0.10:443 check server Player2 192.168.0.11:443 check server Player5 192.168.0.14:443 check listen Manager-Farm 192.168.0.2:81 mode http option httplog balance source option forwardfor stats enable option http-server-close server Manager1 192.168.0.60:80 check server Manager2 192.168.0.61:80 check server Manager3 192.168.0.62:80 check option httpchk HEAD /checkCF.cfm HTTP/1.0 listen Manager-Farm-SSL 192.168.0.2:444 mode tcp option ssl-hello-chk balance source server Manager1 192.168.0.60:443 check server Manager2 192.168.0.61:443 check server Manager3
192.168.0.62:443 check listen info 192.168.0.2:90 mode http balance source stats uri / -- CUT -- The problem with SSL is that the IP address that I get to the web server is the IP address of the loadbalancer and not the original IP address. This is a big problem for me and it's essential that I can have the right IP address. How can I do, is it possible? I've heard of stunnel but I don't understand how to use it. Thank you in advance for your help, Christophe
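Baptiste's transparent-mode suggestion, as a minimal sketch applied to the SSL farm above (requires a TPROXY-capable kernel and an haproxy built with transparent proxy support; the server addresses are reused from Christophe's config):

```
listen WebPlayer-Farm-SSL
    bind 192.168.0.2:443
    mode tcp
    balance source
    # connect to the servers using the client's own address as the
    # source; return traffic must route back through the HAProxy box
    # (default gateway or policy-based routing)
    source 0.0.0.0 usesrc clientip
    server Player1 192.168.0.10:443 check
    server Player2 192.168.0.11:443 check
```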
Re: help with tcp-request content track-sc1
On Sat, Aug 27, 2011 at 5:26 AM, Willy Tarreau w...@1wt.eu wrote: Hi David, On Thu, Aug 25, 2011 at 12:28:43PM -0700, David Birdsong wrote: I've poured over 1.5 docs, and I'm pretty sure this should be possible. Is there a way to extract a header string from an http header and track that in a stick-table of type 'string'? If so, what is the syntax, where does the extraction take place? Right now it's not implemented, as the track-sc1 statement is only available at the TCP stage. I'm clearly thinking about having it before 1.5 is released, because at many places it's much more important than the source IP itself. Ok, thanks for the clarification. Is there a way to cast a header as an ip and track-sc1? In our setup we're terminating SSL in front of haproxy and so only the XFF header has the client ip address. Also, is there any way to concatenate two headers into one string value to track and store? If not, I can concatenate them upstream (close to client), but it'd be nice to keep the logic local to haproxy's config. No this is not possible. We need the pattern extraction feature which has not even started yet for this :-( Regards, Willy
Re: help with tcp-request content track-sc1
On Mon, Aug 29, 2011 at 01:40:53PM -0700, David Birdsong wrote: On Mon, Aug 29, 2011 at 1:36 PM, Willy Tarreau w...@1wt.eu wrote: On Mon, Aug 29, 2011 at 12:22:18PM -0700, David Birdsong wrote: On Sat, Aug 27, 2011 at 5:26 AM, Willy Tarreau w...@1wt.eu wrote: Hi David, On Thu, Aug 25, 2011 at 12:28:43PM -0700, David Birdsong wrote: I've poured over 1.5 docs, and I'm pretty sure this should be possible. Is there a way to extract a header string from an http header and track that in a stick-table of type 'string'? If so, what is the syntax, where does the extraction take place? Right now it's not implemented, as the track-sc1 statement is only available at the TCP stage. I'm clearly thinking about having it before 1.5 is released, because at many places it's much more important than the source IP itself. Ok, thanks for the clarification. Is there a way to cast a header as an ip and track-sc1? In our setup we're terminating SSL in front of haproxy and so only the XFF header has the client ip address. I understand the issue, it's the same everyone is facing when trying to do the same thing unfortunately :-( If you use a patched stunnel version which supports the PROXY protocol, then you can have the client's IP available as soon as tcp-request content rules are processed. Those rules support track-sc1 so you can do what you want at this level. It requires a patch on stunnel however, but it should not be an issue since you appear to be using the XFF We're actually terminating half of our ssl traffic with nginx and the other half with Amazon's elb offering with plans of moving all ssl termination to Amazon in the next week or so. The PROXY protocol should be ported to Amazon's ELB then ;-) Cheers, Willy
Re: help with tcp-request content track-sc1
On Mon, Aug 29, 2011 at 1:46 PM, Willy Tarreau w...@1wt.eu wrote: On Mon, Aug 29, 2011 at 01:40:53PM -0700, David Birdsong wrote: On Mon, Aug 29, 2011 at 1:36 PM, Willy Tarreau w...@1wt.eu wrote: On Mon, Aug 29, 2011 at 12:22:18PM -0700, David Birdsong wrote: On Sat, Aug 27, 2011 at 5:26 AM, Willy Tarreau w...@1wt.eu wrote: Hi David, On Thu, Aug 25, 2011 at 12:28:43PM -0700, David Birdsong wrote: I've poured over 1.5 docs, and I'm pretty sure this should be possible. Is there a way to extract a header string from an http header and track that in a stick-table of type 'string'? If so, what is the syntax, where does the extraction take place? Right now it's not implemented, as the track-sc1 statement is only available at the TCP stage. I'm clearly thinking about having it before 1.5 is released, because at many places it's much more important than the source IP itself. Ok, thanks for the clarification. Is there a way to cast a header as an ip and track-sc1? In our setup we're terminating SSL in front of haproxy and so only the XFF header has the client ip address. I understand the issue, it's the same everyone is facing when trying to do the same thing unfortunately :-( If you use a patched stunnel version which supports the PROXY protocol, then you can have the client's IP available as soon as tcp-request content rules are processed. Those rules support track-sc1 so you can do what you want at this level. It requires a patch on stunnel however, but it should not be an issue since you appear to be using the XFF We're actually terminating half of our ssl traffic with nginx and the other half with Amazon's elb offering with plans of moving all ssl termination to Amazon in the next week or so. The PROXY protocol should be ported to Amazon's ELB then ;-) Agreed, that would be a big help. Anybody know what the ELB's are? 
Some have speculated they're just using Netscalers, but they toss around the word 'instance' in a way that makes me wonder if they're just using EC2 instances. Cheers, Willy
Re: help with tcp-request content track-sc1
Hi David, On Thu, Aug 25, 2011 at 12:28:43PM -0700, David Birdsong wrote: I've pored over the 1.5 docs, and I'm pretty sure this should be possible. Is there a way to extract a header string from an http header and track that in a stick-table of type 'string'? If so, what is the syntax, and where does the extraction take place? Right now it's not implemented, as the track-sc1 statement is only available at the TCP stage. I'm clearly thinking about having it before 1.5 is released, because in many places it's much more important than the source IP itself. Also, is there any way to concatenate two headers into one string value to track and store? If not, I can concatenate them upstream (close to the client), but it'd be nice to keep the logic local to haproxy's config. No, this is not possible. We need the pattern extraction feature, which has not even been started yet :-( Regards, Willy
Re: Help with sticky in cookies and occasional incorrect node
On Mon, Jun 20, 2011 at 04:48:15PM +1200, Todd Nine wrote: Hi guys, We're experiencing a strange issue I could use a hand with. We require sticky sessions in our app. I was using the following configuration in my haproxy conf https://gist.github.com/0e8dba64b2008473c408 Occasionally, I'm seeing this problem. 1. Client makes first request 2. Client is connected to a node, say test-app-west-1 3. Client response is received, and the client executes jQuery JSON calls on document load 4. The client receives the cookie value of test-app-west-3 in its JSON response, but test-app-west-1 was sent in the request. I'm on Ubuntu 10.10 server 64 bit with HA-Proxy version 1.4.8 2010/06/16. Any ideas what could be causing this issue? If you're seeing that a different cookie was sent in the response than the one in the request, it means the server went down and the request had to be redispatched. This should be clearly visible in the logs. You will see one request with flags NI-- going to test-app-west-1, then probably a few requests with flags VN-- going to it, then one request with either DI (server down) or VI (server was not yet known as down, but the connection failed and was redispatched). Regards, Willy
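For readers following along, a minimal sticky-cookie backend of the kind discussed might look like this (a sketch; server names are taken from the thread, addresses are made up). With such a setup, requests stuck to a dead server get redispatched to another node, producing exactly the NI/VN/DI/VI flag sequence Willy describes:

```
backend app
    balance roundrobin
    # insert a SERVERID cookie naming the chosen server
    cookie SERVERID insert indirect nocache
    # allow a request whose cookie points at a dead server to be
    # re-sent to another node
    option redispatch
    server test-app-west-1 10.0.0.1:8080 cookie test-app-west-1 check
    server test-app-west-3 10.0.0.3:8080 cookie test-app-west-3 check
```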
Re: Help on SSL termination and balance source
On Thu, Jun 9, 2011 at 7:33 AM, habeeb rahman pk.h...@gmail.com wrote: apache rewrite rule: RewriteRule ^/(.*)$ http://127.0.0.1:2443%{REQUEST_URI} [P,QSA,L] Why are you using a rewrite instead of mod_proxy? ProxyPass does some nice things by default, like adding the X-Forwarded-For header which will provide the address of the client. Otherwise, you will need to do this manually with rewrite rules. -jim
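The mod_proxy equivalent of that rewrite might be sketched like this (the target address is taken from the rewrite rule; the rest is an assumption about the setup):

```
# Hypothetical sketch: mod_proxy adds X-Forwarded-For (plus
# X-Forwarded-Host/-Server) automatically when forwarding, so no
# manual header rules are needed.
ProxyPass        / http://127.0.0.1:2443/
ProxyPassReverse / http://127.0.0.1:2443/
```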
Re: Help on SSL termination and balance source
James, Thanks for your points. The rewrite rule was set up by some other guys, has been in use for some time now, and works well with round robin. Anyhow, I will look at mod_proxy in detail. I'm not sure how SSL termination can be done with it, and moreover how haproxy is going to balance based on client IP. Any insight? Anyone else have any thoughts or insights to share? -Habeeb On Thu, Jun 9, 2011 at 7:11 PM, James Bardin jbar...@bu.edu wrote: On Thu, Jun 9, 2011 at 7:33 AM, habeeb rahman pk.h...@gmail.com wrote: apache rewrite rule: RewriteRule ^/(.*)$ http://127.0.0.1:2443%{REQUEST_URI} [P,QSA,L] Why are you using a rewrite instead of mod_proxy? ProxyPass does some nice things by default, like adding the X-Forwarded-For header which will provide the address of the client. Otherwise, you will need to do this manually with rewrite rules. -jim
Re: Help on SSL termination and balance source
Habeeb, given that your Apache does actually insert/append an X-Forwarded-For header, you can use this statement instead of balance source in HAProxy: balance hdr(X-Forwarded-For) This has a few caveats you should be aware of. Users can set the X-Forwarded-For header themselves (which is also done by some upstream proxies). Most forwarders (HAProxy included) just append their IP to the list by default. I don't know how Apache can be configured, but you should try to delete any upstream X-Forwarded-For headers and just include the IP of the last visible source, to avoid users messing with the balancing. Hope that helps, Holger On 09.06.2011 15:54, habeeb rahman wrote: James, Thanks for your points. The rewrite rule was set up by some other guys, has been in use for some time now, and works well with round robin. Anyhow, I will look at mod_proxy in detail. I'm not sure how SSL termination can be done with it, and moreover how haproxy is going to balance based on client IP. Any insight? Anyone else have any thoughts or insights to share? -Habeeb On Thu, Jun 9, 2011 at 7:11 PM, James Bardin jbar...@bu.edu mailto:jbar...@bu.edu wrote: On Thu, Jun 9, 2011 at 7:33 AM, habeeb rahman pk.h...@gmail.com mailto:pk.h...@gmail.com wrote: apache rewrite rule: RewriteRule ^/(.*)$ http://127.0.0.1:2443%{REQUEST_URI} [P,QSA,L] Why are you using a rewrite instead of mod_proxy? ProxyPass does some nice things by default, like adding the X-Forwarded-For header which will provide the address of the client. Otherwise, you will need to do this manually with rewrite rules. -jim
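Holger's suggestion as a small sketch (the topology and backend name are assumptions): have Apache drop any client-supplied header so the one it appends while proxying is the only value, then hash on it in haproxy:

```
# Apache (SSL terminator), using mod_headers: discard any inbound
# X-Forwarded-For early, so the value mod_proxy appends contains
# only the real client address
RequestHeader unset X-Forwarded-For early

# haproxy: hash on that header instead of the TCP source address
backend web
    balance hdr(X-Forwarded-For)
    server web1 192.168.0.10:80 check
    server web2 192.168.0.11:80 check
```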
Re: Help me please, with haproxy.cfg for FTP Server.
Le samedi 28 mai 2011 08:05:59, Jirapong Kijkiat a écrit : Dear w...@1wt.eu, haproxy@formilux.org, how can I configure haproxy to load balance my FTP servers? Here is my current haproxy.cnf. FTP is not easy to load balance. Here is the solution I use. 1. The HAProxy machine is the NAT gateway for the FTP servers. 2. HAProxy load balances only the control connection (port 21). The hard part is the data connection. The FTP protocol works by opening a control channel which exchanges commands and responses. Whenever data needs to be transferred, another connection (a data channel) is opened. Files, directory listings and similar bulk data are transferred over the data channel. In this way, FTP allows simultaneous transfer of multiple files. Rather than multiplex channels on a single connection, FTP uses a connection per channel. The data channel works in two modes. 1. Active mode (the default) means that the server will connect to the client. When a data channel is needed, the client and server negotiate a TCP address and port for the server to connect to the client on. The client opens this port and awaits the connection. Usually NAT routers and firewalls on the client end rely on packet inspection to observe this negotiation; they then allow this connection to take place. Often they will modify the negotiation to inject the public IP address in place of the private (RFC 1918) address of the client. The exception to this is when SSL is used: SSL prevents packet inspection and breaks active mode. 2. Passive mode means that the client will open an additional connection to the server. Generally this works better, as the FTP server admin can open the port range that will be used for passive connections. Most NAT routers and firewalls allow any outbound traffic, so they will not stand in the way of a passive connection. This allows connections to work without packet inspection, even with SSL.
So, once HAProxy is load balancing the control channel, you have to work out how to allow both active and passive connections to work. -- Allowing active mode to work -- 1. You must SNAT the FTP server's (private) address to the same address that accepted the control channel connection (the HAProxy bind address). Otherwise the client machine will sometimes balk at a connection from an address other than the server's (the one it opened the command channel to). Also, without this SNAT rule in place, any NAT router or firewall will expect the connection to come from the server, and will block it if it does not. -- Allowing passive mode to work -- 1. You must allocate a unique port range for each backend FTP server, and DNAT each range to the respective server. You must also configure each server to use its own unique port space for passive connections. Most FTP servers allow you to specify the passive port range. If you are using proftpd, here is how you configure the passive port range: http://www.proftpd.org/docs/directives/linked/config_ref_PassivePorts.html Example: DNAT rule/passive range -> backend server: 2048-4096 -> Server A, 4097-6145 -> Server B. This way, any client connected to server A will connect to its dedicated passive port range and be forwarded by NAT to the correct backend server (which is awaiting its connection). 2. You must also configure the FTP server to masquerade as the same address used for making the control connection (the IP address HAProxy is listening on port 21 on). This is so that passive connections hit the NAT server and are correctly forwarded. Bypassing NAT by directing the client to connect to the backend server directly does not work in all FTP clients, so it is best to simply masquerade as the main FTP service IP address. Most FTP servers allow you to configure a masquerade or public IP address to use in passive connection negotiations with clients.
If you are using proftpd, here is how you configure the masquerade address: http://www.proftpd.org/docs/directives/linked/config_ref_MasqueradeAddress.html -- Client IP address -- * At this point you have a working setup; the next section is about fine-tuning it. I would get to this point before tackling the next steps... The last issue is that FTP now works great, but the FTP server sees all connections coming from the proxy machine's IP address instead of the client's address. To solve this you have two options. 1. Use TPROXY kernel support to perform transparent proxying. 2. Use the PROXY protocol and write a plugin for your FTP server to accept the PROXY protocol. http://haproxy.1wt.eu/download/1.5/doc/proxy-protocol.txt I personally use option 2, as I prefer a user-space solution to a kernel solution. Also, it is much easier to set up my FTP servers without a custom kernel package that I have to maintain; with a simple yum update you let upstream do that for you.
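Under the assumptions above (the HAProxy box is the NAT gateway, servers A and B use passive ranges 2048-4096 and 4097-6145), the NAT side might be sketched with iptables like this. All addresses are illustrative: 203.0.113.10 stands for the public FTP service IP, 192.168.0.21/22 for the backend servers:

```
# DNAT each server's dedicated passive port range to that server
iptables -t nat -A PREROUTING -d 203.0.113.10 -p tcp \
    --dport 2048:4096 -j DNAT --to-destination 192.168.0.21
iptables -t nat -A PREROUTING -d 203.0.113.10 -p tcp \
    --dport 4097:6145 -j DNAT --to-destination 192.168.0.22

# SNAT outbound (active-mode) connections from the backend servers
# so they appear to come from the public FTP service address
iptables -t nat -A POSTROUTING -s 192.168.0.0/24 -p tcp \
    -j SNAT --to-source 203.0.113.10
```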
Re: Help with high concurrency and throughput configuration
On Mon, Dec 27, 2010 at 1:25 AM, Willy Tarreau w...@1wt.eu wrote: Hi Joubert, On Thu, Dec 23, 2010 at 03:29:34PM -0500, Joubert Berger wrote: Hi Cyril, On Wed, Dec 22, 2010 at 7:33 PM, Cyril Bonté cyril.bo...@free.fr wrote: Hi Joubert, Le mercredi 22 décembre 2010 22:11:27, Joubert Berger a écrit : (...) For the proxy I am comparing squid (as a reverse proxy) and haproxy. For squid, the only major thing I added was: max_filedesc 32768 cache_dir null /tmp http_port 80 accel defaultsite=server cache_peer 192.168.0.120 parent 80 0 no-query originserver (???) And now to the results: And a reminder, I am measuring high concurrency and large transfers. I am using Apache Benchmark tool (ab). So, I would do something like: ab -n 4000 -c 4000 http://192.168.0.115/2M.file Here are some results: At 800 concurrent connections: web server (no proxy): 55 req/sec and throughput of 890 Mbps squid: 53 req/sec and throughput of 875 Mbps haproxy: 49 req/sec and throughput of 782 Mbps At 4000 concurrent connections: web server (no proxy): 55 req/sec and throughput of 890 Mbps squid: 41 req/sec and throughput of 664 Mbps haproxy: 30 req/sec and throughput of 474 Mbps I have more data, but for now it shows haproxy not comparing well to squid. Anyone have any insight what is going on? Have you verified the number of requests received on your nginx backend for each tests ? I wonder if you don't still have cache enabled in squid. I verified that my configuration did have cache deny all. Do you also have this directive in your squid configuration ? cache deny all Or you could retry with -H Cache-Control: no-cache, must-revalidate in your ab command line. I will give that a try, but I don't think it is a caching issue. Given the numbers I don't think it's a caching issue either. However, could you check that your haproxy is correctly built to use epoll (haproxy -vv) ? Have you checked CPU usage in both configurations ? 
If haproxy is running with the legacy poll() or select(), it could very well be running close to 100% system with 4000 connections. 4000 concurrent connections is not much, some sites are regularly running above 3 with one reporting more than 15. And at least one of them recently reported 5 Gbps of production traffic, reason why I'm finding your results quite low. In fact, even in my 10gig tests, I achieve around 570 requests per second on 2M objects. Also, I don't know if this is meant as a pure benchmark or as a test for future production plans, but 1 Gbps to serve 4000 connections means 250 kbps on average per connection. While this is more than enough for normal sites (lots of small objects), you should keep in mind that if your average object size will be 2M, then your link will be the limiting factor, causing the average download time to reach one minute for such objects, even if the visitor have more than 250 kbps of connectivity. Depending on how you use your benchmarks, it may be something to keep in mind for the site's scalability. Cheers, Willy Sorry for the delayed reply. I did solve the error problem by removing the switch. I now have the three machines connected directly to themselves. Just to follow up on a question you asked: Here is my -vv. HA-Proxy version 1.4.10 2010/11/28 Copyright 2000-2010 Willy Tarreau w...@1wt.eu Build options : TARGET = linux26 CPU = generic CC = gcc CFLAGS = -O2 -g -fno-strict-aliasing OPTIONS = Default settings : maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200 Encrypted password support via crypt(3): yes Available polling systems : sepoll : pref=400, test result OK epoll : pref=300, test result OK poll : pref=200, test result OK select : pref=150, test result OK Total: 4 (4 usable), will use sepoll. Could it be that I don't have enough horsepower? 
cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 15 model : 6 model name : Intel(R) Pentium(R) 4 CPU 3.60GHz stepping: 2 cpu MHz : 3591.197 cache size : 2048 KB physical id : 0 siblings: 2 core id : 0 cpu cores : 1 apicid : 0 fdiv_bug: no hlt_bug : no f00f_bug: no coma_bug: no fpu : yes fpu_exception : yes cpuid level : 6 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl cid cx16 xtpr lahf_lm bogomips: 7182.39 processor : 1 vendor_id : GenuineIntel cpu family : 15 model : 6 model name : Intel(R) Pentium(R) 4 CPU 3.60GHz stepping: 2 cpu MHz : 3591.197 cache size : 2048 KB physical id : 0 siblings: 2 core id : 0 cpu cores : 1
Re: Help with high concurrency and throughput configuration
On Fri, Jan 07, 2011 at 10:37:56AM -0500, Joubert Berger wrote: I did solve the error problem by removing the switch. I now have the three machines connected directly to each other. OK fine. Just to follow up on a question you asked: Here is my -vv. HA-Proxy version 1.4.10 2010/11/28 Copyright 2000-2010 Willy Tarreau w...@1wt.eu Build options : TARGET = linux26 CPU = generic CC = gcc CFLAGS = -O2 -g -fno-strict-aliasing OPTIONS = Default settings : maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200 Encrypted password support via crypt(3): yes Available polling systems : sepoll : pref=400, test result OK epoll : pref=300, test result OK poll : pref=200, test result OK select : pref=150, test result OK Total: 4 (4 usable), will use sepoll. OK, this one is fine. Could it be that I don't have enough horsepower? I don't think so, because a 3.6 GHz P4 is already a correct beast. I remember I got 1.1 Gbps out of a 3.2 GHz one some time ago with version 1.2 (which did not even have splicing). However, hyperthreading was very, very bad for network performance when it was introduced on the P4, so you might want to retry without it. Also, could you recheck that the CPU is not hitting the roof on the machine running ab? I'm asking because here, ab refuses to run at more than 1020 concurrent connections (despite the ulimit -n), and it saturates the CPU doing small reads/writes. I'm noticing that its performance is simply halved between 100 and 1000 concurrent connections, and it does not even run smoothly. That does not explain why there would be a difference between the components you're testing, but the buffer sizes can have a significant impact on the way packets are grouped and delivered. Here, for instance, I'm running my injection tools on a Core i5 3.06 GHz and haproxy on a Core i5 at 3.6 GHz. The machines are linked by a dual 10gig myri10ge NIC. The test requests 2M objects (as in your case).
500 of them are fetched every second, resulting in a stable 8.7 Gbps bitrate. Both cores of the injector are 100% CPU and haproxy runs at 70% of one core on the other host. If I change a few settings on the haproxy machine, it immediately impacts the bitrate despite the fact that it's not using a full CPU and the other ones were already saturated (eg if I disable LRO, etc...). You may want to try to rebuild haproxy with support for TCP splicing (USE_LINUX_SPLICE=1) and also try to force it use larger buffers (eg: tune.bufsize=65536 in the global section). Warning, at 4k concurrent connections, this will eat 512 MB of memory. Maybe you'll spot important differences, I don't know. Interrupt moderation is important with some NICs too, so please check with vmstat 1 that you don't have too many interrupts per second (higher than 2 becomes high for your machine). Regards, Willy
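As a sketch, the rebuild and buffer tuning Willy mentions would look something like this (target and value taken from the thread; the build line otherwise assumes a stock 1.4 source tree):

```
# rebuild haproxy with kernel TCP splicing support
make TARGET=linux26 USE_LINUX_SPLICE=1

# haproxy.cfg, global section: larger per-connection buffers.
# At 4000 concurrent connections, two 64 kB buffers per connection
# is roughly 512 MB of memory, as noted above.
global
    tune.bufsize 65536
```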
Re: Help with high concurrency and throughput configuration
Hi Joubert, On Thu, Dec 23, 2010 at 03:29:34PM -0500, Joubert Berger wrote: Hi Cyril, On Wed, Dec 22, 2010 at 7:33 PM, Cyril Bonté cyril.bo...@free.fr wrote: Hi Joubert, Le mercredi 22 décembre 2010 22:11:27, Joubert Berger a écrit : (...) For the proxy I am comparing squid (as a reverse proxy) and haproxy. For squid, the only major thing I added was: max_filedesc 32768 cache_dir null /tmp http_port 80 accel defaultsite=server cache_peer 192.168.0.120 parent 80 0 no-query originserver (???) And now to the results: And a reminder, I am measuring high concurrency and large transfers. I am using Apache Benchmark tool (ab). So, I would do something like: ab -n 4000 -c 4000 http://192.168.0.115/2M.file Here are some results: At 800 concurrent connections: web server (no proxy): 55 req/sec and throughput of 890 Mbps squid: 53 req/sec and throughput of 875 Mbps haproxy: 49 req/sec and throughput of 782 Mbps At 4000 concurrent connections: web server (no proxy): 55 req/sec and throughput of 890 Mbps squid: 41 req/sec and throughput of 664 Mbps haproxy: 30 req/sec and throughput of 474 Mbps I have more data, but for now it shows haproxy not comparing well to squid. Anyone have any insight what is going on? Have you verified the number of requests received on your nginx backend for each tests ? I wonder if you don't still have cache enabled in squid. I verified that my configuration did have cache deny all. Do you also have this directive in your squid configuration ? cache deny all Or you could retry with -H Cache-Control: no-cache, must-revalidate in your ab command line. I will give that a try, but I don't think it is a caching issue. Given the numbers I don't think it's a caching issue either. However, could you check that your haproxy is correctly built to use epoll (haproxy -vv) ? Have you checked CPU usage in both configurations ? If haproxy is running with the legacy poll() or select(), it could very well be running close to 100% system with 4000 connections. 
4000 concurrent connections is not much, some sites are regularly running above 3 with one reporting more than 15. And at least one of them recently reported 5 Gbps of production traffic, which is why I'm finding your results quite low. In fact, even in my 10gig tests, I achieve around 570 requests per second on 2M objects. Also, I don't know if this is meant as a pure benchmark or as a test for future production plans, but 1 Gbps to serve 4000 connections means 250 kbps on average per connection. While this is more than enough for normal sites (lots of small objects), you should keep in mind that if your average object size will be 2M, then your link will be the limiting factor, causing the average download time to reach one minute for such objects, even if the visitor has more than 250 kbps of connectivity. Depending on how you use your benchmarks, it may be something to keep in mind for the site's scalability. Cheers, Willy
Re: Help with high concurrency and throughput configuration
Hi Cyril, On Wed, Dec 22, 2010 at 7:33 PM, Cyril Bonté cyril.bo...@free.fr wrote: Hi Joubert, Le mercredi 22 décembre 2010 22:11:27, Joubert Berger a écrit : (...) For the proxy I am comparing squid (as a reverse proxy) and haproxy. For squid, the only major thing I added was: max_filedesc 32768 cache_dir null /tmp http_port 80 accel defaultsite=server cache_peer 192.168.0.120 parent 80 0 no-query originserver (???) And now to the results: And a reminder, I am measuring high concurrency and large transfers. I am using Apache Benchmark tool (ab). So, I would do something like: ab -n 4000 -c 4000 http://192.168.0.115/2M.file Here are some results: At 800 concurrent connections: web server (no proxy): 55 req/sec and throughput of 890 Mbps squid: 53 req/sec and throughput of 875 Mbps haproxy: 49 req/sec and throughput of 782 Mbps At 4000 concurrent connections: web server (no proxy): 55 req/sec and throughput of 890 Mbps squid: 41 req/sec and throughput of 664 Mbps haproxy: 30 req/sec and throughput of 474 Mbps I have more data, but for now it shows haproxy not comparing well to squid. Anyone have any insight what is going on? Have you verified the number of requests received on your nginx backend for each tests ? I wonder if you don't still have cache enabled in squid. I verified that my configuration did have cache deny all. Do you also have this directive in your squid configuration ? cache deny all Or you could retry with -H Cache-Control: no-cache, must-revalidate in your ab command line. I will give that a try, but I don't think it is a caching issue. --joubert
Re: help with halog
On Wed, Jun 9, 2010 at 10:09 PM, Willy Tarreau w...@1wt.eu wrote: Hi David, On Wed, Jun 09, 2010 at 04:37:28PM -0700, David Birdsong wrote: I'm pretty excited to start using halog, but dumping out the usage is about the only documentation I can turn up, which is not explaining anything to me. Is there anything more substantial on how to use halog?

You're right. At the beginning it was just a tool to help me spot production issues, then I added features and explained to a few people how to use it. But obviously some doc is missing. I'll be quick here, but I hope it will help you to start. First, you should see it as a haproxy-specific grep with a few enhanced filters and outputs. It can only do one output format at a time, but you can combine several input filters.

Input filters:
    -e      : only consider lines which don't report an error (timeout, connect, 5xx, ...)
    -E      : only consider lines which do report an error (timeout, connect, 5xx, ...)
    -rt XXX : only consider lines with server response times higher than XXX ms
    -RT XXX : only consider lines with server response times lower than XXX ms
    -ad XXX : only consider lines which indicate an accept time after a silence of XXX ms
    -ac XXX : to be used with -ad; only consider those lines if at least XXX lines are grouped after the silence
    -v      : invert the selection

Some filters are incompatible: you can have only one of -e and -E, and only one of -rt and -RT. Since some syslogs add a field for the sender's host and others don't, you can adjust the field offsets with -s. By default -s 1 is assumed, to skip one field for the origin host. You can use -s 0 if your syslog does not add it (or if you use netcat to log), or -s 2 if your syslog adds other fields. Negative values are also permitted if that helps.
The output format can be selected with the following flags:
    -q   : don't show a warning for unparsable lines (e.g. "server XXX is UP")
    -c   : only report the number of lines which match
    -gt  : output a list of x,y values to be used with gnuplot to visually check that everything's OK. That was its first use, but it's not used anymore, as it was not very convenient for exporting values.
    -pct : report a percentile table of request time, connect time, response time, and data time. The output contains the percentage and absolute number of requests served in less than XXX ms for each field. It's very helpful to quickly spot TCP retransmits, because you can see whether you have large 3-second steps. It is also convenient to use in production when you suspect a site is slow: a quick check tells you whether your timers are slower than on other days.
    -st  : report the distribution of the status codes (200, 302, ...). Again, this is meant as a quick help. You run it when you suspect an issue, and you immediately see whether some files are missing (404) or some errors are reported.
    -srv : enumerate all servers found in the logs with their respective status-code distribution (2xx, 3xx, 4xx, 5xx), the number of errors (-1 anywhere in a timer), the error ratio, the average response time (without data), and the average connect time.

-ad and -ac provide a special output. I don't remember the format; they were developed to track an issue with huge packet losses. I seem to remember they only report the accept time of requests matching the criteria, the length of the silence, as well as the number of requests accepted at once. The goal was to find abnormally long silences. For instance, if you have a load between 500 and 2000 hits/s 24h a day, you're almost certain that a one-second silence indicates an issue. Being able to spot the end of silences and compare them on several machines helps find the origin of the trouble (switch, machine swapping, etc...)

Wow, thanks for the run-down.
There's a lot here; plenty to get me started.

In practice, you generally just want to run -st when you think you may be encountering trouble. If you see an abnormal error distribution, then you'll rerun with -srv to find which server is the culprit (if any). I know some people who run that continuously, coupled with a tail -5000; that way they get a realtime stats distribution for their servers.

Thanks, -srv I think is what I've been hoping for to track down bad backends in a backend section that has roughly 400 servers.

The percentile output is more to be used on full-day logs; it helps check how heavy days compare with calm ones in terms of response times. But it can be used by prod people to quickly check whether there are any errors. At least from what I have observed, sometimes people are not sure about the fields, but they're quite sure that two outputs don't look similar and that one of them indicates a problem.
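Putting the filters and output formats together, a few illustrative invocations (a sketch only: the log path is hypothetical, and halog is assumed to be built from the tools shipped in the haproxy source tree):

```shell
# Quick status-code distribution -- the usual first check.
halog -st -q < /var/log/haproxy.log

# Only error lines with server response times above 500 ms,
# with no syslog host field to skip (-s 0).
halog -E -rt 500 -s 0 < /var/log/haproxy.log

# Near-realtime per-server distribution, as described above.
tail -5000 /var/log/haproxy.log | halog -srv -q
```

As Willy notes, -st and -srv are the quick-look tools; the percentile table (-pct) is better suited to full-day logs.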
Re: help with halog
Hi David, On Wed, Jun 09, 2010 at 04:37:28PM -0700, David Birdsong wrote: I'm pretty excited to start using halog, but dumping out the usage is about the only documentation I can turn up, which is not explaining anything to me. Is there anything more substantial on how to use halog?

You're right. At the beginning it was just a tool to help me spot production issues, then I added features and explained to a few people how to use it. But obviously some doc is missing. I'll be quick here, but I hope it will help you to start. First, you should see it as a haproxy-specific grep with a few enhanced filters and outputs. It can only do one output format at a time, but you can combine several input filters.

Input filters:
    -e      : only consider lines which don't report an error (timeout, connect, 5xx, ...)
    -E      : only consider lines which do report an error (timeout, connect, 5xx, ...)
    -rt XXX : only consider lines with server response times higher than XXX ms
    -RT XXX : only consider lines with server response times lower than XXX ms
    -ad XXX : only consider lines which indicate an accept time after a silence of XXX ms
    -ac XXX : to be used with -ad; only consider those lines if at least XXX lines are grouped after the silence
    -v      : invert the selection

Some filters are incompatible: you can have only one of -e and -E, and only one of -rt and -RT. Since some syslogs add a field for the sender's host and others don't, you can adjust the field offsets with -s. By default -s 1 is assumed, to skip one field for the origin host. You can use -s 0 if your syslog does not add it (or if you use netcat to log), or -s 2 if your syslog adds other fields. Negative values are also permitted if that helps.
The output format can be selected with the following flags:
    -q   : don't show a warning for unparsable lines (e.g. "server XXX is UP")
    -c   : only report the number of lines which match
    -gt  : output a list of x,y values to be used with gnuplot to visually check that everything's OK. That was its first use, but it's not used anymore, as it was not very convenient for exporting values.
    -pct : report a percentile table of request time, connect time, response time, and data time. The output contains the percentage and absolute number of requests served in less than XXX ms for each field. It's very helpful to quickly spot TCP retransmits, because you can see whether you have large 3-second steps. It is also convenient to use in production when you suspect a site is slow: a quick check tells you whether your timers are slower than on other days.
    -st  : report the distribution of the status codes (200, 302, ...). Again, this is meant as a quick help. You run it when you suspect an issue, and you immediately see whether some files are missing (404) or some errors are reported.
    -srv : enumerate all servers found in the logs with their respective status-code distribution (2xx, 3xx, 4xx, 5xx), the number of errors (-1 anywhere in a timer), the error ratio, the average response time (without data), and the average connect time.

-ad and -ac provide a special output. I don't remember the format; they were developed to track an issue with huge packet losses. I seem to remember they only report the accept time of requests matching the criteria, the length of the silence, as well as the number of requests accepted at once. The goal was to find abnormally long silences. For instance, if you have a load between 500 and 2000 hits/s 24h a day, you're almost certain that a one-second silence indicates an issue. Being able to spot the end of silences and compare them on several machines helps find the origin of the trouble (switch, machine swapping, etc...)
In practice, you generally just want to run -st when you think you may be encountering trouble. If you see an abnormal error distribution, then you'll rerun with -srv to find which server is the culprit (if any). I know some people who run that continuously, coupled with a tail -5000; that way they get a realtime stats distribution for their servers. The percentile output is more to be used on full-day logs; it helps check how heavy days compare with calm ones in terms of response times. But it can be used by prod people to quickly check whether there are any errors. At least from what I have observed, sometimes people are not sure about the fields, but they're quite sure that two outputs don't look similar and that one of them indicates a problem. That's already a good thing, because they can say in one second "everything looks OK to me". Last point: I found that -rt/-RT can be used for debugging, as they help spot abnormally long requests. In this case, you'll end up running the tool several times in a row.
Re: Help! haproxy+nginx+tomcats cluster problems.
Hi, On Sat, Jan 23, 2010 at 12:05:16PM +0800, ËïéªËÉ wrote: Hello! I have questions, please help me, thank you very much! My cluster works, but not excellently. Please see the architecture below my questions first.

Q1: on the tomcats, there are always 500~800 TIME_WAIT connections from haproxy.

TIME_WAIT sockets are harmless and normal. 800 is very low and you don't have to worry about them; I have already reached 4 million :-)

Q2: when I use LoadRunner to test this cluster, the result disappointed me very much.

You have to describe a bit more what happened. (...)

    global
        log 127.0.0.1 local0 notice
        maxconn 40960
        chroot /usr/local/haproxy/chroot
        uid 99
        gid 99
        nbproc 8

Remove nbproc, it can only cause you trouble.

        pidfile /usr/local/haproxy/logs/haproxy.pid
        stats socket /var/run/haproxy.stat mode 600
        stats maxconn 65536

You don't need to support 64k connections to the stats socket, that's nonsense. Simply remove this statement.

        ulimit-n 65536

You can safely remove that one too, since it's automatically computed from maxconn.

        daemon
        #debug
        #quiet
        nosepoll

You can save some CPU cycles by commenting out nosepoll.

    defaults
        mode http
        option httplog

Please add "log global" here, otherwise you won't log.

        retries 3
        option redispatch
        option abortonclose
        maxconn 40960

Please always ensure that any defaults, listen, or frontend maxconn is lower than the global one.

        contimeout 5000
        clitimeout 3
        srvtimeout 3
        timeout check 2000
        option dontlognull

    frontend haproxy
        bind 0.0.0.0:80
        mode http
        option httplog
        option httpclose
        option forwardfor
        maxconn5

Same here about maxconn.

        clitimeout 3
        acl statcs url_reg ^[^?]*\.(jpg|html|htm|png|gif|css|shtml)([?]|$)

This one can be simplified a lot, and without an expensive regex:

    acl statcs path_end .jpg .html .htm .png .gif .css .shtml

OK, your config is otherwise correct. What load did you reach with LoadRunner, in terms of requests per second and concurrent connections? What response time did you observe?
What haproxy version are you running? You may have to start a new test with the parameters fixed as above and with logging enabled (please check that your syslog daemon logs correctly); otherwise you won't know where your issues happen. Regards, Willy
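Pulling Willy's corrections together, the cleaned-up configuration might look roughly like this. This is a sketch only: the timeout values marked below are illustrative, since the originals were garbled in the archive, and the backend section was never shown in the thread:

```
global
    log 127.0.0.1 local0 notice
    maxconn 40960
    chroot /usr/local/haproxy/chroot
    uid 99
    gid 99
    daemon
    stats socket /var/run/haproxy.stat mode 600
    # nbproc, ulimit-n, "stats maxconn 65536" and nosepoll dropped, as advised

defaults
    log global              # without this, the proxies do not log
    mode http
    option httplog
    option dontlognull
    option redispatch
    option abortonclose
    retries 3
    maxconn 40000           # kept below the global maxconn
    contimeout 5000
    clitimeout 30000        # illustrative; original values were garbled
    srvtimeout 30000        # illustrative; original values were garbled
    timeout check 2000

frontend haproxy
    bind 0.0.0.0:80
    option httpclose
    option forwardfor
    # path_end avoids the regex engine entirely
    acl statcs path_end .jpg .html .htm .png .gif .css .shtml
```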
Re: Help me please
2009/8/27 Vadim Bazilevich bvv2...@gmail.com: Hi friends! I use haproxy in my project, but I have one problem: how can I switch between two backend servers (I need to use a url_sub rule) when haproxy is used as the frontend?

Define two backends, backend1 and backend10001, and one frontend. In the frontend section do something like this:

    use_backend backend10001 if url_sub sms

And of course you have to define the two ACLs url_sub and sms, which do not appear in your config. -- Jean-Baptiste Quenot http://jbq.caraldi.com/
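For completeness, a minimal sketch of such a frontend with the ACL given an explicit name. The backend names and the "sms" substring come from the thread; the ACL name and bind line are hypothetical:

```
frontend www
    bind :80
    # match requests whose URL contains the substring "sms"
    acl is_sms url_sub sms
    use_backend backend10001 if is_sms
    default_backend backend1
```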
Re: Help needed
On 17.06.2009 19:59 Uhr, Karthik Pattabhiraman wrote: We use HAProxy 1.3.17 for our setup. We faced an issue where requests were redirected to the wrong cluster. We are still not able to figure out why this happened and would really appreciate any help. Please find attached a sample configuration file. In this case, when requests were coming to m.nbcsports.com they were redirected to ad_cluster, and we are at our wits' end trying to figure out what is wrong.

Karthik, I assume not every request is routed the wrong way, but seemingly arbitrary requests are. Further, I assume that requests to nbc_cluster and ad_cluster happen on the same page (i.e. a browser hits both clusters in a single page view). If this is the case, you might try to include the "option httpclose" directive in your frontend or your defaults section. As you only included it in the nbc_cluster backend and not the other ones, I think this will cause trouble, as it imposes a race condition on which requests will be served first on a connection. Please also read the documentation for option httpclose (and possibly option forceclose) again... --Holger
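Holger's suggestion amounts to moving option httpclose up so that it applies to every request rather than only those hitting nbc_cluster. A minimal sketch under that assumption (section names are hypothetical, not from the attached config):

```
defaults
    mode http
    # Close the connection after each response so that every request is
    # routed through the frontend rules again, instead of following the
    # backend chosen for the first request on a kept-alive connection.
    option httpclose
```

The stricter variant, option forceclose, is mentioned in the reply and documented alongside httpclose in the configuration manual.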