Cyril, thanks for your response.

On Jul 26, 2012, at 4:46 PM, Cyril Bonté wrote:

> 
> Please add "log global" in your backend sections (or in your defaults),
> this explains why your log files didn't give you any indication.
> After adding this line everywhere (not only in the frontend), you'll see when 
> servers go UP and DOWN, and why.
> This will also probably help us to know what happens.

Done.  We tend to keep logging fairly minimal for compliance reasons, but 
this is certainly going to help.
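
For anyone following along, a minimal sketch of what that change looks like. 
The syslog address and the local0 facility are assumptions for illustration, 
not our actual settings:

```
global
    # assumed syslog target and facility; adjust to your environment
    log 127.0.0.1 local0

defaults
    # inherit the logger declared in the global section, so every
    # frontend and backend logs, including server UP/DOWN transitions
    log global
```

Putting "log global" in defaults avoids having to repeat it in each backend 
section.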

> From this previous log line, it looks like something becomes slow (your 
> haproxy server or your backend servers).

That's the odd piece, because the server logs don't seem to indicate any issues 
(and this was last seen shortly after a restart).  Hopefully the logging will 
shed more light on the subject.

> Wow, are you sure you really want to use such a big buffer size ? Also, 
> ensure you're running the last stable version of HAProxy (currently 1.4.21), 
> which fixes a major bug when using a larger buffer size (it doesn't explain 
> what you observe but it's advice for more stability).

Unfortunately yes; we are supporting some rare but critical very large HTTP GET 
requests over a REST API.  I'll look into upgrading shortly.

> For more details :
> http://haproxy.1wt.eu/git?p=haproxy-1.4.git;a=commit;h=30297cb17147a8d339eb160226bcc08c91d9530b

Good to know!

> As said at the beginning, please add :
>       log global

Added.  Is there a log level setting that keeps the default output but leaves 
out the stats connection lines?

> If using cookies is not an issue for your clients, I'd recommend you not to 
> use "appsession" but "cookie insert" or "cookie prefix" instead.
> 
> http://cbonte.github.com/haproxy-dconv/configuration-1.4.html#4-cookie

We'll probably leave this one just because I don't believe it's giving us any 
issues at the moment; the client problem definitely coincided with the backend 
being marked as DOWN rather than a cookie issue (the cookie issue was my first 
guess to be honest).

>>     server gui1 172.25.200.53:8080 check maxconn 2000
>>     server gui2 172.25.200.54:8080 check maxconn 2000
> 
> You didn't provide any "timeout check" nor "inter" value.
> The default will be 2 seconds, which is maybe too low for your case.

It shouldn't be - our healthcheck page is fairly simple and does little more 
than make sure that our webapp is responding to requests (barely more than a 
static file). I've upped "timeout check" to 10000 (milliseconds, so 10 
seconds) though, so we'll see if that makes a difference.
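
Roughly, the health check settings can be sketched as below. The backend name 
"gui" is hypothetical, and "inter 2000" only spells out the 2-second default 
mentioned above rather than changing it; the server lines are the ones quoted 
earlier:

```
backend gui
    # hypothetical backend name
    # allow up to 10000 ms for the check response once connected
    timeout check 10000
    # "inter 2000" makes the default 2-second check interval explicit
    server gui1 172.25.200.53:8080 check inter 2000 maxconn 2000
    server gui2 172.25.200.54:8080 check inter 2000 maxconn 2000
```

Both values are in milliseconds when no unit suffix is given.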

> Hope this helps.

It did, and thank you for looking at this.  I've learned an awful lot about 
haproxy configuration setups (good and bad) from this list!

-Richard
