We have the same exact problem, only that dontlognull is not working for us
neither for some reason (we have 1 byte requests containing a single byte -
NULL - maybe something affected from a firewall or other device tunneling
the traffic to our LB).

We also concluded this is something the browsers are doing automatically
(especially when "option http-server-close" is in use).

If you find any other solution not including the dontlognull option we
would love to hear about it.


On Wed, Dec 26, 2012 at 11:38 PM, Willy Tarreau <[email protected]> wrote:

> Hi Brendon,
>
> On Wed, Dec 26, 2012 at 03:09:37PM -0500, Brendon Colby wrote:
> > Greetings!
> >
> > (Apparently GMail IPs are now listed in SORBs, so when I first sent this
> > through the Gmail web interface I got a bounce; I had to use Thunderbird
> > and IMAP.)
> >
> > We just replaced our old commercial load balancer (end of life) with
> > haproxy running on an HP DL360p G8 server with a quad-core Xeon E5-2609
> > 2.4GHz processor (soon to be two for redundancy). Our site is mainly
> > media - SWF files of movies and games, mp4 movies, audio, art, etc. We
> > are serving approximately 2,330 req/s right now, for example, at over
> > 1Gbps of outgoing traffic (we average 600Mbps to 1Gbps+ daily).
> > Everything has been running great for a couple of weeks, and now we're
> > just tidying up a few things like logging and these request error counts
> > we've been seeing. We have multiple sub-domains: one hosting css, one
> > hosting images, etc.
> >
> > My goals have been to:
> >
> > 1. Figure out what's causing the request errors and see if we can tweak
> > something on the server side to stop them
> > 2. Reduce the error log output to a manageable level, so we can keep it
> > enabled all the time and monitor it
> >
> > We haven't been able to keep full error logging enabled due to the
> > volume of errors being logged (200-300 per second).
> >
> > When we first enabled error logging it was flooded with entries like
> these:
> >
> > Dec 21 21:00:01 localhost haproxy[16034]: x.x.x.x:50714
> > [21/Dec/2012:20:59:56.140] http-in http-in/<NOSRV> -1/-1/-1/-1/12343 400
> > 212 - - CR-- 1913/1903/0/0/0 0/0 "<BADREQ>"
> >
> > This isn't the exact error but close - a CR error with a seemingly
> > random timeout. "show errors" reported nothing of value.
> >
> > Here's what our defaults looked like:
> >
> > defaults
> >     backlog 10000
> >     mode http
> >     # This option helps provide better statistics / graphs.
> >     option contstats
> >     option splice-auto
> >     timeout client 30s
> >     timeout connect 10s
> >     timeout server 30s
> >
> > I did some research and it seemed to me that these errors were simply
> > the browser closing connections. For example, when I opened our site
> > with Chrome, after about 12 seconds I would see several CR errors in the
> > logs from our office IP. I figured these must be keep-alive connections
> > (or something like that) that were just timing out on the client side
> > (hence the CR or client side error).
> >
> > I put in a "timeout http-request" of five seconds and haproxy then
> > started logging cR errors with a 5000ms timeout value (the same log
> > values above essentially, with cR instead of CR). What this told me was
> > that now instead of the client disconnecting (CR) haproxy was
> > proactively disconnecting (cR) and throwing a 408 error, which made
> > sense. Right now http-request is set to 15 seconds and we're still
> > seeing 100s of errors per second.
> >
> > Next, I read through almost the entire haproxy manual (very good docs!)
> > and found this section under "timeout http-keep-alive": "There is also
> > another difference between the two timeouts : when a connection expires
> > during timeout http-keep-alive, no error is returned, the
> > connection just closes. If the connection expires in "http-request"
> > while waiting for a connection to complete, a HTTP 408 error is
> returned."
> >
> > I thought for sure using "timeout http-keep-alive 1" would stop the
> > cR/408 errors, but it didn't. "option dontlognull" stops them from being
> > logged, but I see we're still getting 200-300 request errors per second
> > on the frontend.
> >
> > The docs say not to use "option dontlognull" because it could mask
> > attacks and such. I agree with this and don't want to leave this in.
> > What's puzzling is that these cR/408 errors seem to be coming from
> > regular site traffic and browser behavior, not an attack.
> >
> > Unless I'm mistaken, the way we have haproxy configured right now, we
> > shouldn't be seeing these cR/408 errors. I will post the relevant pieces
> > of our configuration below. Please let me know if I'm missing something
> > here, because at this point I'm stuck!
>
> Thank you for the very well detailed analysis. I believe that some browsers
> nowadays tend to proactively establish connections to visited websites,
> just
> in case they will need them later. Since Chrome does everything it can to
> reduce page load time, it very likely is something it's doing. This could
> explain exactly what you're observing : a new connection over which nothing
> is transferred that it closed when the user leaves the site (hence the
> random delay).
>
> If you can reproduce the behaviour with your browser, I think that
> dontlognull
> will be your only solution and that we'll have to update the doc to
> indicate
> that browsers have adopted such an internet-unfriendly behaviour that it's
> better to leave the option on. What I don't like with proactively opened
> connections is that they're killing servers with 10-100 times the load they
> would have to sustain and that even small sites might experience issues
> with
> this. If you see 200 of them per second and they last 5s on average, it
> means
> you're constantly having 5000 idle connections just because of this. Many
> web servers can't handle this :-/
>
> BTW, your config is really clean, I have nothing to suggest. I wouldn't
> be surprized if some people reuse it to build their own configs :-)
>
> Best regards,
> Willy
>
>
>

Reply via email to