Hi,

On Thu, Jul 14, 2011 at 10:21:56AM +0000, gidot wrote:
> Thanks Willy,
> 
> I managed to fix the problem with the log. It's due to my entry in /etc/
> syslogd.conf. This thread http://www.serverphorums.com/read.php?
> 10,127228,127867 helped me out :).
> 
> Here is the excerpt of my haproxy.log. Hope someone can enlighten me if 
> there's 
> anything obvious from this log that can help me to troubleshoot my problem. 
> Here's from grep'ing 504:
> 
> Jul 13 20:37:57 localhost haproxy[98507]: 213.47.109.71:51261 [13/
> Jul/2011:20:37:07.967] webjailfarm webjailfarm/wj01 7/0/0/-1/+50009 504 +194 
> - 
> - sHVN 62/62/52/12/0 0/0 "GET /main.php?location=war HTTP/1.1" 
(...)

All of these logs indicate that the server is simply not responding within
50 seconds. As frustrating as this can be, this is something quite common
when servers get overloaded or when they try to access a locked resource.
The two following ones however are more concerning :

> Jul 13 20:41:25 localhost haproxy[98507]: 188.123.218.31:4180 [13/
> Jul/2011:20:40:35.132] webjailfarm webjailfarm/wj08 2/0/0/-1/+50005 504 +194 
> - 
> - sHVN 78/78/61/4/0 0/0 "GET /images/gamefavicons.png HTTP/1.1" 
> Jul 13 20:43:36 localhost haproxy[98507]: 217.246.8.81:2284 [13/
> Jul/2011:20:42:46.796] webjailfarm webjailfarm/wj05 1/0/0/-1/+50003 504 +194 
> - 
> - sHVN 56/56/46/0/0 0/0 "GET /emptyicon.gif HTTP/1.1" 

I think that such resources are purely static and have no reason not to be
quickly delivered. Is there any possibility that the same servers are
accessed via other backend sections, or even directly without passing via
haproxy ? I'm asking because what I suspect is that the server's connection
limit is reached due to other activity, but the listening socket is not yet
saturated, so our request lies in the server's backlog until a connection
is released so that a process (or thread) can process the pending request
(which did not happen in time here).

> And some others:
> 
> Jul 13 20:38:08 localhost haproxy[98507]: 89.228.101.118:51199 [13/
> Jul/2011:20:38:08.893] webjailfarm webjailfarm/wj05 0/0/-1/-1/+1 503 +212 - - 
> SCDN 44/44/37/7/+3 0/0 "POST /login.php HTTP/1.1" 

The connection was referencing a server which was already detected as DOWN
(hence the "D" flag), so the health checks have noticed the event. The
connection was redispatched onto another server (wj05) but the connection
failed there. It could be the same thing as above, but with the backlog full,
so the system is rejecting extra connections instead of queuing them. It
could also be that you restarted the server and the connections were attempted
while the port was not bound.

> Jul 13 20:38:14 localhost haproxy[98507]: 188.101.27.150:61567 [13/
> Jul/2011:20:38:14.883] webjailfarm webjailfarm/wj06 13/0/1/-1/+14 502 +204 - 
> - 
> SHVN 53/53/47/7/0 0/0 "GET /js/scriptaculous.js?load=effects,slider HTTP/1.1" 

The 502s normally indicate that the server broke the connection without
responding. This can be the consequence of a server restart as well as it
can indicate dying processes.

> At the moment we're still having problems with clients receiving 502 and 504 
> errors. It was quiet for the first few days after we have tuned the box, but 
> since 2 days ago, they're back.

If you check your stats page, you should see that your servers state are
changing a lot. A server must not flap, it must have a steady state. In
my opinion, the fact that they're seen down is not the cause of the problem
but one of the consequences : something is blocking your servers or making
them process requests slowly and at one point they can't even process health
checks anymore. Requests are aborted on timeouts and checks fail, causing
the server to be marked "down".

This is normally what happens when servers connection limit gets overflown.
You may want to try to increase your MaxClients or equivalent. Be careful
though, as this can imply a higher memory usage.

Another solution people generally like is to split dynamic/static contents,
which is called content switching. You build a farm out of a very fast and
scalable server such as nginx and send the static requests there. You keep
the rest on current servers, the load should drop quite a bit.

> Btw, I tried to run the command echo "show sess" | socat stdio 
> /var/run/haproxy/
> haproxy.sock, and some entries show that it's not forwarding to any server 
> (none). Is this normal?
> 
> [/root] # echo "show sess" | socat stdio /var/run/haproxy/haproxy.sock
> 0x800fbfc00: proto=tcpv4 src=178.190.178.184:52094 fe=webjailfarm 
> be=webjailfarm srv=<none> ts=02 age=46s calls=1 rq
> [f=501000h,l=0,an=0eh,rx=3s,wx=,ax=] rp[f=001000h,l=0,an=00h,rx=,wx=,ax=] s0=
> [7,18h,fd=55,ex=] s1=[0,0h,fd=-1,ex=] exp=3s
(...)

It is normal for connections which have not yet sent a full request. In your
case, the request buffer is empty so nothing was received from the client.
Until you don't see too many of them, you don't have to worry, those can be
port probes or similar. If you have enabled keep-alive on the client side,
it is also possible that those are simply second requests of a single
connection, and that the client maintains it just in case it needs it.

Regards,
Willy


Reply via email to