I added tune.bufsize 65536 and right away things got better; I doubled that to 131072 and all of the outliers went away. Set at that, in my tests haproxy is faster than nginx on 95% of responses and on par with nginx for the last 5%, which is fine with me =).
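For reference, the main cost of a larger tune.bufsize is per-connection memory: haproxy 1.4 allocates roughly two buffers (request + response) per connection, so worst-case usage scales with maxconn. A back-of-envelope sketch (the maxconn value of 20000 below is just an assumed example, not from this thread):

```python
# Rough worst-case buffer memory for a given tune.bufsize, assuming
# haproxy 1.4's two buffers per connection (request + response).

def bufsize_ram_bytes(bufsize: int, maxconn: int, buffers_per_conn: int = 2) -> int:
    """Bytes of buffer memory if every allowed connection is active at once."""
    return bufsize * buffers_per_conn * maxconn

# e.g. 128 kB buffers with an assumed maxconn of 20000:
worst_case = bufsize_ram_bytes(131072, 20000)
print(f"{worst_case / 2**30:.2f} GiB")  # 4.88 GiB

# versus the 16 kB default:
print(f"{bufsize_ram_bytes(16384, 20000) / 2**30:.2f} GiB")  # 0.61 GiB
```

So on a 16GB box a 128 kB bufsize only bites if maxconn is set very high and actually reached.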
What is the negative to setting it that high? If it's just RAM usage, all of our LBs have 16GB of RAM (don't ask why), so if that's all, I don't think it will be an issue.

Matt C.

On Thu, Jun 9, 2011 at 2:11 PM, Willy Tarreau <[email protected]> wrote:
> Hi Matt,
>
> On Thu, Jun 09, 2011 at 01:50:11PM -0700, Matt Christiansen wrote:
>> Hi Willy,
>>
>> I agree the haproxy logs show that, but we also monitor the time spent
>> processing the request, which takes into account GC, reading data off
>> the FS and a number of other things inside the app, and I see no 3sec
>> times in there or anything near it. Also I have no 3sec outliers in the
>> output from my test, so it seems a little weird that it says 3secs.
>
> What I really hate about 3sec is that it's the common TCP retransmit time,
> and normally it indicates packet losses. I had implicitly excluded that
> possibility since it runs well with nginx on the same machine, but still
> it must not be omitted.
>
> Still, the time measured by application servers generally does not include
> the time spent in queues, so you should be very careful with this. For all
> components there will always be an unmonitored area. For instance, haproxy
> cannot know the time spent by the request in the system's backlog, which
> can be huge under a SYN flood attack or when maxconn is too low.
>
>> Also I have the connections set really high to prevent queueing for now;
>> we usually only have around 1000-2000 connections open.
>>
>> uname -a
>>
>> Linux 2.6.18-194.17.1.el5 #1 SMP Wed Sep 29 12:50:31 EDT 2010 x86_64
>> x86_64 x86_64 GNU/Linux
>
> OK, RHEL 5, so I agree you won't do TCP splicing on this one.
>
> Could you check whether the number of TCP retransmits increases between
> two runs (with netstat -s)? It's worth archiving a full copy before and
> after the run in order to focus on things we could discover there.
>
> Also, would you happen to have nf_conntrack running (check with lsmod)?
> When this is the case, we always get very ugly results, but it mainly
> affects connect times, and in your case I saw large response times too.
>
>> haproxy -vv
>>
>> HA-Proxy version 1.4.15 2011/04/08
>> Copyright 2000-2010 Willy Tarreau <[email protected]>
>>
>> Build options :
>>   TARGET  = linux26
>>   CPU     = generic
>>   CC      = gcc
>>   CFLAGS  = -m64 -march=x86-64 -O2 -g -fno-strict-aliasing
>>   OPTIONS = USE_PCRE=1
>
> Everything's fine here.
> (...)
>
>> My config
>
> Everything OK here too. You said that numbers slightly improved
> with tcp-smart-accept and tcp-smart-connect. Normally that can be
> caused by a congested network or by losses. What really puzzles me
> is that while those issues are very common, I don't see why they
> wouldn't show up with nginx too.
>
> Oh, one thing I forgot which can make a difference: buffer sizes.
> The larger the buffer, the more smoothly losses will be absorbed,
> because they'll induce fewer timeouts/RTTs. I don't know what size
> nginx uses, but I remember it has dynamic buffer sizes. Haproxy
> defaults to 16 kB. You can try increasing it to 64 kB and see if it
> changes anything:
>
>     global
>         tune.bufsize 65536
>
> Maybe you should run a tcpdump between haproxy and the server, or
> even better, on the haproxy machine AND on one of the servers (you
> can disable a number of servers if it's a test config). That way
> we'll know how the response time spreads around.
>
> Regards,
> Willy
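The "compare netstat -s between two runs" check above is easy to script: capture the output before and after a load test and diff the counters whose description mentions retransmits. A hedged sketch (the exact counter wording varies between kernels and net-tools versions; the sample strings below are illustrative, not real captures):

```python
import re

def counters(netstat_s_output: str) -> dict:
    """Map counter description -> value for lines like '  1000 segments retransmitted'."""
    out = {}
    for line in netstat_s_output.splitlines():
        m = re.match(r"\s*(\d+)\s+(.+)", line)
        if m:
            out[m.group(2).strip()] = int(m.group(1))
    return out

def retransmit_delta(before: str, after: str) -> dict:
    """Retransmit-related counters that changed between two netstat -s captures."""
    b, a = counters(before), counters(after)
    return {k: a[k] - b.get(k, 0)
            for k in a if "retransmit" in k.lower() and a[k] != b.get(k, 0)}

# Illustrative captures; in practice: netstat -s > before.txt / after.txt
before = "Tcp:\n    1000 segments retransmitted\n"
after  = "Tcp:\n    1042 segments retransmitted\n"
print(retransmit_delta(before, after))  # {'segments retransmitted': 42}
```

A non-zero delta that grows with each run would point at the packet-loss theory; a flat delta pushes the suspicion back toward queueing or conntrack.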

