On Tue, May 27, 2014 at 08:27:52PM +0200, Jakov Sosic wrote:
> On 05/27/2014 08:21 PM, Willy Tarreau wrote:
>
> >What is happening here is simple : the client disconnected before the
> >connection to the server managed to complete ("CC" flags), and you're
> >running with "option abortonclose" which allows haproxy to kill a pending
> >connection to the server.
> >
> >Given how short the request lasted, I guess that it's a script that sent
> >this connection. It's basically sending the request and closing the output
> >channel immediately, waiting for the response. You can get this behaviour
> >using printf "$request" | nc $site 80. It's very likely a bot sucking your
> >site, as browsers never ever do that.
> >
> >Using halog to sort them by IP will probably reveal that most of them
> >come from a few IP addresses. For this you can run "halog -hs 503 -ic <
> >log".
>
> Yeah I was suspecting it's the client closing connection and was even
> planning on commenting out abortonclose later tonight in off hours (I'm
> running European +1 CEST based web site) :) So a great catch Willy!
>
> What started all this is that I have around 3-4% error rate from
> GoogleBot (Googlebot can't access your site), and bosses/devs want to
> lower/eliminate that and found the culprit in HAProxy's 503 errors.

I don't see why GoogleBot would see them, since they should only affect
the offending clients.

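One way to check that from the logs is to tally the aborted 503s per client
address and see who is behind them. The following is only a rough Python
sketch, not a replacement for halog: the function names are made up for
illustration, and the regex assumes the default HTTP log format (client
ip:port, accept date, frontend, backend/server, timers, status, bytes, two
cookie fields, termination flags), so it will need adjusting for a custom
log-format.

```python
import re
from collections import Counter

# Assumption: default HAProxy HTTP log format. Fields after "haproxy[pid]:" are
# client_ip:port [accept_date] frontend backend/server timers status bytes
# req_cookie resp_cookie termination_state ...
LINE_RE = re.compile(
    r"haproxy\[\d+\]: (?P<ip>[0-9a-fA-F.:]+):\d+ \[[^\]]+\] \S+ \S+ "
    r"\S+ (?P<status>\d{3}) \S+ \S+ \S+ (?P<flags>\S{4}) "
)


def count_aborted_503(lines):
    """Count 503 responses whose termination flags start with "CC", per client IP."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m and m.group("status") == "503" and m.group("flags").startswith("CC"):
            counts[m.group("ip")] += 1
    return counts


def top_offenders(lines, n=10):
    """Return the n client addresses with the most aborted 503s."""
    return count_aborted_503(lines).most_common(n)
```

Passing it an open log file (for instance top_offenders(open("haproxy.log")),
where the file name is only a placeholder) yields (address, count) pairs that
can be compared against the addresses the crawler actually comes from.
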
> Is it by any chance possible that my ISP is somehow screwing up
> connections? Because I see this kind of aborts/503s even from regular
> clients fetching regular stuff?

It's possible, but it sounds really strange. You could easily check,
though, if you own a machine somewhere outside your ISP's network. Simply
send a request from there to your site and sniff the traffic at both ends.
You'll see whether the two traces match. It could be that the ISP is running
a misconfigured transparent proxy which systematically closes the request
path after sending the request (as haproxy used to do with option forceclose
in early version 1.1, 12 years ago). Or maybe it's part of an IDS or anti-DDoS
mechanism that's automatically enabled when they run into trouble.

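To make "closes the request path after sending the request" concrete, here is
a minimal Python sketch of a client that behaves that way. It is the same
pattern as the printf | nc example quoted above; the function name and the
timeout default are assumptions for illustration only. Running it from the
outside machine while capturing at both ends should show the client's FIN
arriving right after the request.

```python
import socket


def half_close_request(host, port, request, timeout=5.0):
    """Send request bytes, half-close the write side, then read until EOF."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        # Half-close: our side sends a FIN but the socket can still receive.
        # With "option abortonclose", haproxy may treat this as a client abort
        # if the connection to the server has not completed yet.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # peer closed its side
                break
            chunks.append(data)
    return b"".join(chunks)
```

A call would look like half_close_request(site, 80, b"GET / HTTP/1.0\r\n\r\n"),
with site standing in for the host name under test.
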
Willy