Re: option httpchk is reporting servers as down when they're not
Hi Thomas,

On Thu, Mar 05, 2009 at 08:45:20AM -0500, Allen, Thomas wrote:
> Hi Jeff,
>
> The thing is that if I don't include the health check, the load balancer
> works fine and each server receives equal distribution. I have no idea why
> the servers would be reported as down but still work when unchecked.

It is possible that your servers expect the Host: header to be set during
the checks. There's a trick to do it right now (don't forget to escape
spaces):

    option httpchk GET /index.php HTTP/1.0\r\nHost:\ www.mydomain.com

Also, you should check the server's logs to see why it is reporting the
service as down. And as a last resort, a tcpdump of the traffic between
haproxy and a failed server will show you both the request and the complete
error from the server.

Regards,
Willy
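For context, a minimal sketch of where that check line would sit; the
backend name, server name, and address here are hypothetical:

    backend www
        # send a full request line plus Host: header with every health check
        option httpchk GET /index.php HTTP/1.0\r\nHost:\ www.mydomain.com
        server web1 10.0.0.1:80 check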
RE: option httpchk is reporting servers as down when they're not
Thanks, once I figure out logging I'll let you guys know what I discover :^)

Thomas Allen
Web Developer, ASCE
703.295.6355

-----Original Message-----
From: Willy Tarreau [mailto:w...@1wt.eu]
Sent: Friday, March 06, 2009 1:39 PM
To: Allen, Thomas
Cc: Jeffrey 'jf' Lim; haproxy@formilux.org
Subject: Re: option httpchk is reporting servers as down when they're not

[...]
Re: load balancer and HA
On Wed, Mar 04, 2009 at 12:12:21AM +0100, Alexander Staubo wrote:
> On Tue, Mar 3, 2009 at 11:44 PM, Martin Karbon <martin.kar...@asbz.it> wrote:
> > just wanted to know if anyone knows an opensource solution for a so
> > called transparent failover: what I mean with that is, I installed two
> > machines with haproxy on it which communicate with each other via
> > heartbeat. If one fails the other one goes from passive to active but
> > all sessions are lost and users have to reconnect.
>
> We use Heartbeat (http://www.keepalived.org/) for this. Heartbeat lets us
> set up virtual service IPs which are reassigned to another box if the box
> goes down. Works like a charm. Current connections are lost, but new ones
> go to the new IP.
>
> Note that there are two current versions of Heartbeat. There's the old 1.x
> series, which is simple and stable, but which has certain limitations such
> as only supporting two nodes, if I remember correctly. Then there's 2.x,
> which is much more complex and less stable. We run 2.0.7 today, and we
> have had some situations where the Heartbeat processes have run wild. It's
> been running quietly for over a year now, so recent patches may have fixed
> the issues. I would still recommend sticking with 1.x if at all possible.

I still don't understand why people stick to heartbeat for things as simple
as moving an IP address. Heartbeat is more of a clustering solution, with
abilities to perform complex tasks. When it comes to just moving an IP
address between two machines and doing nothing else, the VRRP protocol is
really better. It's what is implemented in keepalived. Simple, efficient
and very reliable. I've been told that ucarp was good at that too, though
I've never tried it yet.

> While there are solutions out there that preserve connections on failover,
> my gut feeling is that they introduce a level of complexity and
> computational overhead that necessarily puts a restraint on performance.

In fact it's useless to synchronise TCP sessions between load-balancers for
fast-moving connections (eg: HTTP traffic). Some people require that for
long sessions (terminal server, ...) but this cannot be achieved in a
standard OS, you need to synchronise every minor progress of the TCP stack
with the peer. And that also prevents true randomness from being used at
TCP and IP levels. It also causes trouble when some packets are lost
between the peers, because they can quickly get out of sync. In practice,
in order to synchronise TCP between two hosts, you need more bandwidth than
that of the traffic you want to forward.

There are intermediate solutions which synchronise at layer 4 only, without
taking into account the data nor the sequence numbers. Those present the
advantage of being able to take over a connection without too much
overhead, but no layer 7 processing can be done there, and those cannot be
system sockets. That's typically what you find in some firewalls or layer 4
load balancers which just forward packets between two sides and maintain a
vague context.

Regards,
Willy
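For reference, a minimal keepalived VRRP sketch of the IP failover
discussed above; the interface name, router id, priorities, and the virtual
IP are all hypothetical:

    vrrp_instance VI_1 {
        state MASTER              # BACKUP on the standby machine
        interface eth0
        virtual_router_id 51
        priority 101              # give the standby a lower value, e.g. 100
        advert_int 1              # one VRRP advertisement per second
        virtual_ipaddress {
            192.168.0.100         # the service IP that moves on failover
        }
    }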
Re: measuring haproxy performance impact
On Fri, Mar 6, 2009 at 8:43 AM, Willy Tarreau <w...@1wt.eu> wrote:
> Hi Michael,
>
> On Thu, Mar 05, 2009 at 01:04:06PM -0800, Michael Fortson wrote:
> > I'm trying to understand why our proxied requests have a much greater
> > chance of significant delay than non-proxied requests. The server is an
> > 8-core (dual quad) Intel machine. Making requests directly to the nginx
> > backend is just far more reliable. Here's a shell script output that
> > just continuously requests a blank 0k image file from nginx directly on
> > its own port, and spits out a timestamp if the delay isn't 0 or 1
> > seconds:
> >
> > Thu Mar 5 12:36:17 PST 2009 beginning continuous test of nginx port 8080
> > Thu Mar 5 12:38:06 PST 2009 Nginx Time is 2 seconds
> >
> > Here's the same test running through haproxy, simultaneously:
> >
> > Thu Mar 5 12:36:27 PST 2009 beginning continuous test of haproxy port 80
> > Thu Mar 5 12:39:39 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:39:48 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:39:55 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:40:03 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:40:45 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:40:48 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:40:55 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:40:58 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:41:55 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:42:01 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:42:08 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:42:29 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:42:38 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:43:05 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:43:15 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:44:08 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:44:25 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:44:30 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:44:33 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:44:39 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:44:46 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:44:54 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:45:07 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:45:16 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:45:45 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:45:54 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:45:58 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:46:05 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:46:08 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:46:32 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:46:48 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:46:53 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:46:58 PST 2009 Nginx Time is 3 seconds
> > Thu Mar 5 12:47:40 PST 2009 Nginx Time is 3 seconds
>
> 3 seconds is typically a TCP retransmit. You have network losses somewhere
> from/to your haproxy. Would you happen to be running on a gigabit port
> connected to a 100 Mbps switch? What type of NIC is this? I've seen many
> problems with broadcom netxtreme 2 (bnx2) caused by buggy firmwares, but
> it seems to work fine for other people after a firmware upgrade.
>
> > My sanitized haproxy config is here (mongrel backend was omitted for
> > brevity): http://pastie.org/408729 Are the ACLs just too expensive?
>
> Not at all. Especially in your case. To reach 3 seconds of latency, you
> would need hundreds of thousands of ACLs, so this is clearly unrelated to
> your config.
>
> > Nginx is running with 4 processes, and the box shows mostly idle.
>
> ... which indicates that you aren't burning CPU cycles processing ACLs ;-)
>
> It is also possible that some TCP settings are too low for your load, but
> I don't know what your load is. Above a few hundreds-thousands of sessions
> per second, you will need to do some tuning, otherwise you can end up with
> similar situations.
>
> Regards,
> Willy

Hmm. I think it is gigabit connected to 100 Mb (all Dell rack-mount servers
and switches). The nginx backend runs on the same machine as haproxy and is
referenced via 127.0.0.1 -- does that still involve a real network port?
Should I try the test all on localhost to isolate it from any networking
retransmits?

Here's a peek at the stats page after about a day of running (this should
help demonstrate current loading): http://pastie.org/409632
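The script itself is not shown in the thread; a rough equivalent of the
loop described, assuming curl is available (the URL, port, and file name
are assumptions), might be:

    #!/bin/sh
    # continuously fetch a blank image and print a timestamp whenever
    # the fetch takes more than 1 second
    echo "$(date) beginning continuous test of nginx port 8080"
    while true; do
        start=$(date +%s)
        curl -s -o /dev/null http://localhost:8080/blank.gif
        elapsed=$(( $(date +%s) - start ))
        if [ "$elapsed" -gt 1 ]; then
            echo "$(date) Nginx Time is $elapsed seconds"
        fi
    done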
RE: load balancer and HA
> I still don't understand why people stick to heartbeat for things as
> simple as moving an IP address. Heartbeat is more of a clustering
> solution, with abilities to perform complex tasks. When it comes to just
> moving an IP address between two machines and doing nothing else, the VRRP
> protocol is really better. It's what is implemented in keepalived. Simple,
> efficient and very reliable.

One reason: heartbeat is standard in many distributions (e.g. RHEL, CentOS)
and vrrp/keepalived are not. It might be overkill for just IP addresses,
but being supported in the base OS is a plus that shouldn't be discounted.
If you have to support heartbeat on other servers, using heartbeat
everywhere you share resources is easier than using vrrp for some and
heartbeat for others.
Re: measuring haproxy performance impact
On Fri, Mar 06, 2009 at 11:23:02AM -0800, Michael Fortson wrote:
> On Fri, Mar 6, 2009 at 8:43 AM, Willy Tarreau <w...@1wt.eu> wrote:
> > [...]
>
> Hmm. I think it is gigabit connected to 100 Mb (all Dell rack-mount
> servers and switches).

OK, so then please check with ethtool if your port is running in half or
full duplex:

    # ethtool eth0

Most often, 100 Mbps switches are forced to 100-full without autoneg, and
gig ports in front of them see them as half, thinking they are hubs.

> The nginx backend runs on the same machine as haproxy and is referenced
> via 127.0.0.1 -- does that still involve a real network port? Should I try
> the test all on localhost to isolate it from any networking retransmits?

Yes, if you can do that, that would be nice. If the issue persists, we'll
have to check the network stack tuning, but that's getting harder as it
depends on the workload. Also, please provide the output of netstat -s.

> Here's a peek at the stats page after about a day of running (this should
> help demonstrate current loading): http://pastie.org/409632

I'm seeing something odd here. A lot of mongrel servers experience
connection retries. Are they located on
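A quick way to check the negotiated link settings mentioned above (the
interface name is an assumption; grep only narrows the output):

    # show negotiated speed and duplex; a gig port facing a peer forced to
    # 100-full without autoneg often falls back to half duplex
    ethtool eth0 | grep -E 'Speed|Duplex|Auto-negotiation'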
Re: measuring haproxy performance impact
On Fri, Mar 06, 2009 at 11:49:39AM -0800, Michael Fortson wrote:
> Oops, looks like it's actually Gb - Gb: http://pastie.org/409653

ah nice!

> Here's a netstat -s: http://pastie.org/409652

Oh there are interesting things there:

  - 513607 failed connection attempts
    => let's assume it was for dead servers

  - 34784881 segments retransmited
    => this is huge, maybe your outgoing bandwidth is limited by the
       provider, causing lots of drops?

  - 8325393 SYN cookies sent
    => either you've been experiencing a SYN flood attack, or one of your
       listening socket's backlogs is extremely small

  - 1235433 times the listen queue of a socket overflowed
    1235433 SYNs to LISTEN sockets ignored
    => up to 1.2 million times some client socket experienced a drop,
       causing at least a 3 seconds delay to establish. The errors your
       scripts detect certainly account for a small part of those.

  - 2962458 times recovered from packet loss due to SACK data
    => many losses, related to the second point above.

Could you post the output of sysctl -a | grep ^net? I think that your TCP
syn backlog is very low. Your stats page indicates an average of about 300
sessions/s over the last 24 hours. If your external bandwidth is capped and
causes drops, you can nearly saturate the default backlog of 1024 with 300
sessions/s each taking 3s to complete. If you're interested, the latest
snapshot will report the number of sess/s in the stats.

> Haproxy and nginx are currently on the same box. Mongrels are all on a
> private network accessed through eth1 (public access is via eth0).

OK.

> stats page attached (backend everything is not currently in use; it'll be
> a use-when-full option for fast_mongrels once we upgrade to the next
> haproxy).

According to the stats, your avg output bandwidth is around 10 Mbps. Would
this match your external link?

Regards,
Willy
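The counters above can be pulled with a couple of standard commands; a
sketch (the grep patterns are just one way to narrow the output):

    # drops, SYN cookies, listen-queue overflows and retransmits
    netstat -s | grep -iE 'retransmit|syn|overflow'
    # current SYN backlog limit (1024 is a common default)
    sysctl net.ipv4.tcp_max_syn_backlog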
Re: measuring haproxy performance impact
On Fri, Mar 6, 2009 at 12:53 PM, Willy Tarreau <w...@1wt.eu> wrote:
> [...]
>
> Could you post the output of sysctl -a | grep ^net? I think that your TCP
> syn backlog is very low. [...]

Thanks Willy -- here's the sysctl -a | grep ^net output:
http://pastie.org/409735

Our outbound cap is 400 Mb
Re: measuring haproxy performance impact
On Fri, Mar 06, 2009 at 01:00:38PM -0800, Michael Fortson wrote:
> Thanks Willy -- here's the sysctl -a | grep ^net output:
> http://pastie.org/409735

after a quick check, I see two major things:

  - net.ipv4.tcp_max_syn_backlog = 1024
    => far too low, increase it to 10240 and check if it helps

  - net.netfilter.nf_conntrack_max = 265535
    net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
    => this proves that netfilter is indeed running on this machine and
       might be responsible for session drops. 265k sessions is very low
       for the large time_wait. It limits to about 2k sessions/s, including
       local connections on loopback, etc... You should then increase
       nf_conntrack_max and nf_conntrack_buckets to about
       nf_conntrack_max/16, and reduce nf_conntrack_tcp_timeout_time_wait
       to about 30 seconds.

> Our outbound cap is 400 Mb

OK so I think you're still far away from that.

Regards,
Willy
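A sketch of the suggested tuning as live sysctl changes; the conntrack max
of 1048576 is an illustrative value, not one given in the thread, and on
many kernels the bucket count is a module parameter rather than a sysctl:

    # raise the SYN backlog as suggested
    sysctl -w net.ipv4.tcp_max_syn_backlog=10240
    # enlarge the conntrack table and shorten TIME_WAIT tracking
    sysctl -w net.netfilter.nf_conntrack_max=1048576   # example value
    sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=30
    # buckets ~ max/16, written via the module parameter (assumption)
    echo 65536 > /sys/module/nf_conntrack/parameters/hashsize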
Re: question about queue and max_conn = 1
Hi Greg,

On Fri, Mar 06, 2009 at 03:54:13PM -0500, Greg Gard wrote:
> hi willy and all, wondering if i can expect haproxy to queue requests when
> maxconn per backend is set to 1. running nginx haproxy
> mongrel/rails2.2.2.

yes, it works fine and is even the recommended way of setting it for use
with mongrel. There has been trouble in the past with some old versions,
where a request could starve in the queue for too long. What version are
you using?

> all seems ok, but i am getting a few users complaining of connection
> problems and never see anything other than zeros in the queue columns.

Have you correctly set the maxconn on the server lines? I suspect you have
changed it in the frontend instead, which would be a disaster. Could you
please post your configuration?

Regards,
Willy
Dropped HTTP Requests
I'm using HAProxy 1.3.15.7 to load-balance three Tomcat instances, and to
fork requests for static content to a single Apache instance. I've found
that after the initial HTML page is loaded from Tomcat, the browser's
subsequent first request for a static image from Apache gets dropped
(neither HAProxy nor Apache logs the request, but I can sniff it). The rest
of the images after the first load fine. If I create a small, static, test
HTML page on Tomcat (making the images come from a different backend), it
shows the first image on the page as broken. If I put the exact same HTML
page on Apache (no backend switch required), it works fine. I wonder if we
have a configuration problem, or perhaps this is a bug in the way HAProxy
deals with an HTTP keepalive request that spreads to a second backend?

Here is our simple test HTML page:

    <html><body><img src="/header/logo.jpg"><br><img src="/header/play_demo.jpg"></body></html>

Both images come from Apache. Again, if we request this HTML from the same
backend as the images, it works. If we request this HTML from the Tomcat
backend, the request to Apache for logo.jpg gets dropped. Our config
follows:

    global
        log 127.0.0.1 local0 notice
        maxconn 4096
        chroot /usr/share/haproxy-jail
        user apache
        group apache
        daemon
        spread-checks 5

    defaults
        log global
        mode http
        option httplog
        option dontlognull
        option forwardfor
        option redispatch
        retries 3
        timeout connect 4s
        timeout server 20s
        timeout client 10s
        timeout http-request 10s

    frontend all
        bind :80
        acl rp path_beg /rp
        acl rp_adserving path_beg /rp/javascript.js
        acl rp_adserving path_beg /rp/do/adserving
        use_backend apache if !rp
        use_backend tomcat_nosession if rp_adserving
        default_backend tomcat_session

    backend apache
        server LocalApache localhost:81
        balance roundrobin
        option httpchk /
        stats enable
        stats hide-version
        stats uri /hapstat
        stats realm Haproxy\ Statistics
        stats auth x:xxx
        stats refresh 5s

    backend tomcat_session
        server TomcatA out1:8080 cookie A check slowstart 10s
        server TomcatB out2:8080 cookie B check slowstart 10s
        server TomcatC out3:8080 cookie C check slowstart 10s
        balance roundrobin
        option httpchk /rp/index.html
        timeout server 2m
        cookie JSESSIONID prefix

    backend tomcat_nosession
        server TomcatA out1:8080 check slowstart 10s
        server TomcatB out2:8080 check slowstart 10s
        server TomcatC out3:8080 check slowstart 10s
        balance roundrobin
        option httpchk /rp/index.html
Re: Dropped HTTP Requests
On Fri, Mar 06, 2009 at 04:55:21PM -0500, Timothy Olson wrote:
> I'm using HAProxy 1.3.15.7 to load-balance three Tomcat instances, and to
> fork requests for static content to a single Apache instance. [...] I
> wonder if we have a configuration problem, or perhaps this is a bug in the
> way HAProxy deals with an HTTP keepalive request that spreads to a second
> backend?

Haproxy does not support HTTP keepalive yet. However it can work around it
using option httpclose, which you should set in your defaults section. What
you describe is typically what happens without the option.

Regards,
Willy
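Concretely, the fix lands in the defaults section of the config posted
above (only the added line shown):

    defaults
        mode http
        option httpclose   # force connection close after each request so
                           # the next request is routed (and logged) anew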
RE: measuring haproxy performance impact
> - net.netfilter.nf_conntrack_max = 265535
>   net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
>   => this proves that netfilter is indeed running on this machine and
>      might be responsible for session drops. [...]

Minor nit... He has:

    net.netfilter.nf_conntrack_count = 0

which, if I am not mistaken, indicates that connection tracking, although
present in the kernel, is not being used (no firewall rules triggering it).
Re: load balancer and HA
On Fri, Mar 6, 2009 at 7:48 PM, Willy Tarreau <w...@1wt.eu> wrote:
> When it comes to just moving an IP address between two machines and doing
> nothing else, the VRRP protocol is really better. It's what is implemented
> in keepalived. Simple, efficient and very reliable.

Actually, it seems that my information is out of date, and we (that is, our
IT management company that we outsource our system administration to) are
in fact using Keepalived these days. I was confused by the presence of
ha_logd on our boxes, which is part of the Heartbeat package; I don't know
what that one is doing there. So, yeah, you're right. Stick with
Keepalived. :-)

> In fact it's useless to synchronise TCP sessions between load-balancers
> for fast-moving connections (eg: HTTP traffic). Some people require that
> for long sessions (terminal server, ...) but this cannot be achieved in a
> standard OS, you need to synchronise every minor progress of the TCP stack
> with the peer.

A less ambitious scheme would have the new proxy take over the client
connection and retry the request with the next available backend. This
depends on a couple of factors: for one, it only works if nothing has yet
been sent back to the client. Secondly, it assumes the request itself is
repeatable without side effects. The latter, of course, is
application-dependent; but following the REST principle, in a well-designed
app GET requests are supposed to have no side effects, so they can be
retried, whereas POST, PUT etc. cannot.

Still expensive and error-prone, of course, but much more pragmatic and
limited in scope.

Alexander.
Re: measuring haproxy performance impact
On Fri, Mar 06, 2009 at 02:36:59PM -0800, Michael Fortson wrote:
> On Fri, Mar 6, 2009 at 1:46 PM, Willy Tarreau <w...@1wt.eu> wrote:
> > after a quick check, I see two major things :
> > - net.ipv4.tcp_max_syn_backlog = 1024
> >   => far too low, increase it to 10240 and check if it helps
> > [...]
>
> Hmm; I did these (John is right, netfilter is down at the moment because I
> dropped iptables to help troubleshoot this),

What did you unload precisely? You don't need any iptables rules for the
conntrack to take effect.

> so I guess the syn backlog is the only net change. No difference so far --
> still seeing regular 3s responses. It's weird, but I actually see better
> results testing mongrel than nginx; the haproxy -> mongrel heartbeat is
> more reliable than the haproxy -> nginx request.

mongrel is on another machine? You might be running out of some resource on
the local one making it difficult to reach accept(). Unfortunately I don't
see what :-(

Have you checked with dmesg that you don't have network stack errors or any
type of warning?

Willy
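To confirm whether conntrack is really out of the path (flushing iptables
rules alone does not disable it), something like this would help; a sketch:

    # conntrack is active whenever the module is loaded, rules or not
    lsmod | grep conntrack
    # look for NIC or network stack warnings, as suggested above
    dmesg | tail -50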
Re: load balancer and HA
On Fri, Mar 06, 2009 at 11:47:14PM +0100, Alexander Staubo wrote:
> Actually, it seems that my information is out of date, and we (that is,
> our IT management company that we outsource our system administration to)
> are in fact using Keepalived these days. [...] So, yeah, you're right.
> Stick with Keepalived. :-)

Ah nice! The author will be pleased to read this, he's subscribed to the
list :-)

> A less ambitious scheme would have the new proxy take over the client
> connection and retry the request with the next available backend.

Will not work because the connection from the client to the proxy will have
been broken during the take-over. The second proxy cannot inherit the
primary one's sockets.

> This depends on a couple of factors: for one, it only works if nothing has
> yet been sent back to the client. Secondly, it assumes the request itself
> is repeatable without side effects. The latter, of course, is
> application-dependent; but following the REST principle, in a
> well-designed app GET requests are supposed to have no side effects, so
> they can be retried, whereas POST, PUT etc. cannot. Still expensive and
> error-prone, of course, but much more pragmatic and limited in scope.

What you're talking about are idempotent HTTP requests, which are quite
well documented in RFC2616. Those are important to consider because
idempotent requests are the only ones a proxy may retry upon a connection
error when sending a request on a keep-alive session. IIRC, HEAD, PUT, GET
and DELETE were supposed to be idempotent methods. But we all know that GET
is not that much when used with CGIs.

Willy
Re: load balancer and HA
On Sat, Mar 7, 2009 at 12:07 AM, Willy Tarreau <w...@1wt.eu> wrote:
> > A less ambitious scheme would have the new proxy take over the client
> > connection and retry the request with the next available backend.
>
> Will not work because the connection from the client to the proxy will
> have been broken during the take-over. The second proxy cannot inherit the
> primary one's sockets.

Unless you have some kind of shared-memory L4 magic like the original
poster talked about, that allows taking over an existing TCP connection.

> What you're talking about are idempotent HTTP requests, which are quite
> well documented in RFC2616.

That was the exact word I was looking for. I didn't know that PUT was
idempotent, but the others make sense.

Alexander.
Re: load balancer and HA
On Sat, Mar 07, 2009 at 12:14:44AM +0100, Alexander Staubo wrote:
> Unless you have some kind of shared-memory L4 magic like the original
> poster talked about, that allows taking over an existing TCP connection.

in this case of course I agree. But that means kernel-level changes.

> That was the exact word I was looking for. I didn't know that PUT was
> idempotent, but the others make sense.

in fact it also makes sense for PUT because you're supposed to use this
method to send a file. Normally, you can send it as many times as you want,
the result will not change.

Willy
Re: question about queue and max_conn = 1
thanks for taking a look willy. let me know if there's anything else i
should change.

    global
        maxconn 4096
        user haproxy
        group haproxy
        daemon
        log 127.0.0.1 local0 notice

    # http
    defaults
        log global
        retries 3
        timeout connect 5000
        timeout client 60
        timeout server 60
        stats enable
        stats auth ggard:buddycat
        mode http
        option httplog
        balance roundrobin
        # option httpclose
        option httpchk HEAD /check.txt HTTP/1.0

    # stable sites run win2k/iis5/asp
    listen stable 192.168.1.5:10301
        option forwardfor
        server stable1 192.168.1.10:10300 weight 4 check
        server stable2 192.168.1.11:10300 weight 6 check

    # beta sites running mongrel/rails2.2.2
    listen beta 192.168.1.5:8089
        server 7-8091 192.168.1.22:8091 weight 4 maxconn 1 check
        server 7-8092 192.168.1.22:8092 weight 4 maxconn 1 check
        server 7-8093 192.168.1.22:8093 weight 4 maxconn 1 check
        server 7-8094 192.168.1.22:8094 weight 4 maxconn 1 check
        server 7-8095 192.168.1.22:8095 weight 4 maxconn 1 check
        server 3-8091 192.168.1.23:8091 weight 6 maxconn 1 check
        server 3-8092 192.168.1.23:8092 weight 6 maxconn 1 check
        server 3-8093 192.168.1.23:8093 weight 6 maxconn 1 check
        server 3-8094 192.168.1.23:8094 weight 6 maxconn 1 check
        server 3-8095 192.168.1.23:8095 weight 6 maxconn 1 check

On Fri, Mar 6, 2009 at 4:51 PM, Willy Tarreau <w...@1wt.eu> wrote:
> [...]
>
> Have you correctly set the maxconn on the server lines? I suspect you have
> changed it in the frontend instead, which would be a disaster. Could you
> please post your configuration?

--
greg gard, psyd
www.carepaths.com
Re: question about queue and max_conn = 1
On Fri, Mar 06, 2009 at 10:02:03PM -0500, Greg Gard wrote:
> thanks for taking a look willy. let me know if there's anything else i
> should change.
> (...)
>     defaults
>         (...)
>         # option httpclose

This one above should not be commented out. Otherwise, clients doing
keepalive will artificially maintain a connection to a mongrel when they
don't use it, thus preventing another client from using it.

>     # stable sites run win2k/iis5/asp
>     listen stable 192.168.1.5:10301
>         option forwardfor
>         server stable1 192.168.1.10:10300 weight 4 check
>         server stable2 192.168.1.11:10300 weight 6 check

You can also set a maxconn on your IIS servers if you think you sometimes
hit their connection limit. Maybe maxconn 200 or something like this. The
stats will tell you how high you go and if there are errors.

The rest looks fine.

Regards,
Willy
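Both suggestions applied to Greg's config would look roughly like this
(maxconn 200 is the example figure from the reply, not a measured limit):

    defaults
        # ...
        option httpclose    # uncommented: close after each request so idle
                            # keepalive clients don't pin a mongrel

    listen stable 192.168.1.5:10301
        option forwardfor
        server stable1 192.168.1.10:10300 weight 4 maxconn 200 check
        server stable2 192.168.1.11:10300 weight 6 maxconn 200 check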