We have exactly the same problem, except that dontlognull is not working for us either for some reason (we see 1-byte requests containing a single NULL byte, possibly caused by a firewall or other device tunneling the traffic to our LB).
We also concluded this is something the browsers are doing automatically (especially when "option http-server-close" is in use). If you find any other solution not including the dontlognull option we would love to hear about it.

On Wed, Dec 26, 2012 at 11:38 PM, Willy Tarreau <[email protected]> wrote:

> Hi Brendon,
>
> On Wed, Dec 26, 2012 at 03:09:37PM -0500, Brendon Colby wrote:
> > Greetings!
> >
> > (Apparently GMail IPs are now listed in SORBs, so when I first sent this
> > through the Gmail web interface I got a bounce; I had to use Thunderbird
> > and IMAP.)
> >
> > We just replaced our old commercial load balancer (end of life) with
> > haproxy running on an HP DL360p G8 server with a quad-core Xeon E5-2609
> > 2.4GHz processor (soon to be two for redundancy). Our site is mainly
> > media - SWF files of movies and games, mp4 movies, audio, art, etc. We
> > are serving approximately 2,330 req/s right now, for example, at over
> > 1Gbps of outgoing traffic (we average 600Mbps to 1Gbps+ daily).
> > Everything has been running great for a couple of weeks, and now we're
> > just tidying up a few things like logging and these request error counts
> > we've been seeing. We have multiple sub-domains: one hosting css, one
> > hosting images, etc.
> >
> > My goals have been to:
> >
> > 1. Figure out what's causing the request errors and see if we can tweak
> >    something on the server side to stop them
> > 2. Reduce the error log output to a manageable level, so we can keep it
> >    enabled all the time and monitor it
> >
> > We haven't been able to keep full error logging enabled due to the
> > volume of errors being logged (200-300 per second).
> >
> > When we first enabled error logging it was flooded with entries like these:
> >
> > Dec 21 21:00:01 localhost haproxy[16034]: x.x.x.x:50714
> > [21/Dec/2012:20:59:56.140] http-in http-in/<NOSRV> -1/-1/-1/-1/12343 400
> > 212 - - CR-- 1913/1903/0/0/0 0/0 "<BADREQ>"
> >
> > This isn't the exact error but close - a CR error with a seemingly
> > random timeout. "show errors" reported nothing of value.
> >
> > Here's what our defaults looked like:
> >
> > defaults
> >     backlog 10000
> >     mode http
> >     # This option helps provide better statistics / graphs.
> >     option contstats
> >     option splice-auto
> >     timeout client 30s
> >     timeout connect 10s
> >     timeout server 30s
> >
> > I did some research and it seemed to me that these errors were simply
> > the browser closing connections. For example, when I opened our site
> > with Chrome, after about 12 seconds I would see several CR errors in the
> > logs from our office IP. I figured these must be keep-alive connections
> > (or something like that) that were just timing out on the client side
> > (hence the CR or client side error).
> >
> > I put in a "timeout http-request" of five seconds and haproxy then
> > started logging cR errors with a 5000ms timeout value (the same log
> > values above essentially, with cR instead of CR). What this told me was
> > that now instead of the client disconnecting (CR) haproxy was
> > proactively disconnecting (cR) and throwing a 408 error, which made
> > sense. Right now http-request is set to 15 seconds and we're still
> > seeing 100s of errors per second.
> >
> > Next, I read through almost the entire haproxy manual (very good docs!)
> > and found this section under "timeout http-keep-alive": "There is also
> > another difference between the two timeouts : when a connection expires
> > during timeout http-keep-alive, no error is returned, the
> > connection just closes. If the connection expires in "http-request"
> > while waiting for a connection to complete, a HTTP 408 error is returned."
> >
> > I thought for sure using "timeout http-keep-alive 1" would stop the
> > cR/408 errors, but it didn't. "option dontlognull" stops them from being
> > logged, but I see we're still getting 200-300 request errors per second
> > on the frontend.
> >
> > The docs say not to use "option dontlognull" because it could mask
> > attacks and such. I agree with this and don't want to leave this in.
> > What's puzzling is that these cR/408 errors seem to be coming from
> > regular site traffic and browser behavior, not an attack.
> >
> > Unless I'm mistaken, the way we have haproxy configured right now, we
> > shouldn't be seeing these cR/408 errors. I will post the relevant pieces
> > of our configuration below. Please let me know if I'm missing something
> > here, because at this point I'm stuck!
>
> Thank you for the very well detailed analysis. I believe that some browsers
> nowadays tend to proactively establish connections to visited websites, just
> in case they will need them later. Since Chrome does everything it can to
> reduce page load time, it very likely is something it's doing. This could
> explain exactly what you're observing : a new connection over which nothing
> is transferred that it closed when the user leaves the site (hence the
> random delay).
>
> If you can reproduce the behaviour with your browser, I think that dontlognull
> will be your only solution and that we'll have to update the doc to indicate
> that browsers have adopted such an internet-unfriendly behaviour that it's
> better to leave the option on. What I don't like with proactively opened
> connections is that they're killing servers with 10-100 times the load they
> would have to sustain and that even small sites might experience issues with
> this. If you see 200 of them per second and they last 5s on average, it means
> you're constantly having 5000 idle connections just because of this. Many
> web servers can't handle this :-/
>
> BTW, your config is really clean, I have nothing to suggest. I wouldn't
> be surprized if some people reuse it to build their own configs :-)
>
> Best regards,
> Willy
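
As a rough sanity check on the idle-connection estimate quoted above, the steady-state count can be approximated with Little's law: average connections held open = arrival rate x average lifetime. The sketch below is only an illustration; the function name is hypothetical, and the 15 s case assumes every idle connection lives until the "timeout http-request" value mentioned in the thread. With 200 connections/s lasting 5 s this gives about 1000, so the 5000 figure reads as an order-of-magnitude estimate (it would correspond to roughly a 25 s average lifetime at 200/s).

```python
def idle_connections(arrivals_per_second: float, avg_lifetime_seconds: float) -> float:
    """Little's law: average number in the system = arrival rate * average time in system.

    Hypothetical helper for illustration; not part of haproxy or any library.
    """
    return arrivals_per_second * avg_lifetime_seconds


if __name__ == "__main__":
    # Rates taken from the thread (200-300 empty connections per second);
    # lifetimes of 5 s and 15 s match the two "timeout http-request" values tried.
    for rate in (200, 300):
        for lifetime in (5, 15):
            count = idle_connections(rate, lifetime)
            print(f"{rate}/s for {lifetime}s each -> about {count:.0f} idle connections")
```

Either way the point stands: the longer these empty connections are allowed to sit, the more slots they occupy, so a shorter "timeout http-request" directly lowers the number held open at any moment.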

