Hi Steve,

On Tue, Jan 24, 2012 at 08:55:15AM -0800, Steve V wrote:
> Good morning,
>
> Much love for haproxy and many thanks to all who have worked on and
> contributed to it. We have been using it for several years without issue.
> However, we have been doing load testing lately and there appears to be a
> bottleneck. It may not even have to do with haproxy (i dont think it does)
> but i need to double check anyways just to be thorough and cover all our
> bases.
>
> Hardware: VM running on ESXi, it has 2gigs RAM allocated to it, and 2 CPU's
> GuestOS: CentOS 5
> Haproxy version: 1.4.8 (however, we just upgraded to 1.4.19 last night)
>
> Problem: "second_proxy" is getting hammered by a load test, site
> performance decreases to the point where the site is barely usable and the
> majority of pages time out. however, go to a different site that is in the
> same haproxy config listening on "http_proxy" going to the same backend
> server, and the site comes up fine and fast. it seems like something is
> being throttled or queued somewhere. its possible that it could be an
> issue behind haproxy on the app servers, but i just want to make sure there
> is nothing i need to tweak in my config.
>
> Here is a snapshot of the haproxy stats page for the slow pool
> "second_proxy" http://tinypic.com/r/15887qf/5

Did you tune any sysctl on your system? Your snapshot reports a peak of
1600 conns/second, but the default kernel settings (somaxconn 128 and
tcp_max_syn_backlog 1024) make this hard to reach, so it's very possible
that the socket queue is simply full. I usually set both between 10000 and
20000 with good success.

There is something you can try to detect whether haproxy still accepts
connections fine: simply try to connect to the stats URL on the
unresponsive port. If the stats display properly, then you're stuck on the
servers. If the stats do not respond either, then the connection is not
being accepted.

Be careful, you have no "maxconn" setting in the "defaults" section, and by
default a listen uses 2000. Your snapshot indicates that this limit was not
reached, but I wanted to let you know it's going to be the next issue once
this one is resolved.

> here is my haproxy.cfg
>
> global
>     maxconn 8096
>     daemon
>     nbproc 1
>     stats socket /var/run/haproxy.stat
>
> defaults
>     clitimeout 600000
>     srvtimeout 600000

Do you realize that these timeouts are in milliseconds, so this is
10 minutes (we're speaking HTTP here)?

Regards,
Willy
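
A quick way to check whether the two kernel settings mentioned above are
still at their defaults is sketched below. It assumes a Linux system with
Python 3 available (a stock CentOS 5 install will not have it). TARGET is
simply the lower end of the 10000-20000 range, and the helper names
too_low and read_current are made up for illustration.

```python
from pathlib import Path

# Assumption: lower end of the 10000-20000 range suggested above.
TARGET = 10000

# The two kernel settings discussed above, mapped to their /proc/sys paths.
SETTINGS = {
    "net.core.somaxconn": "/proc/sys/net/core/somaxconn",
    "net.ipv4.tcp_max_syn_backlog": "/proc/sys/net/ipv4/tcp_max_syn_backlog",
}


def too_low(values, target=TARGET):
    """Return the names whose current value is below target, sorted."""
    return sorted(name for name, value in values.items() if value < target)


def read_current(settings=SETTINGS):
    """Read the current values from /proc/sys, skipping unreadable entries."""
    values = {}
    for name, path in settings.items():
        try:
            values[name] = int(Path(path).read_text().strip())
        except (OSError, ValueError):
            pass  # not Linux, or the entry is missing in this environment
    return values


if __name__ == "__main__":
    current = read_current()
    for name in too_low(current):
        # Only prints a suggestion; nothing is changed on the system.
        print(f"{name} = {current[name]} is below {TARGET}; "
              f"consider: sysctl -w {name}={TARGET}")
```

Values changed with "sysctl -w" are lost on reboot unless they are also
added to /etc/sysctl.conf.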

