Re: HAProxy false negatives with tomcat httpchk Layer7 timeout

2011-02-24 Thread Neil Prockter
Hello

I've looked into this further and the trouble is not with tomcat looking
up the client but with tomcat looking up itself.

Changing the haproxy config from
  option httpchk GET /
to
  option httpchk GET / HTTP/1.1\r\nHost:\ sys-haproxytomcat-test:8080

makes the server return quickly.
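
To make the difference between the two check lines concrete, here is a minimal Python sketch of the bytes each one would roughly put on the wire. It assumes the check request is simply "<method> <uri> <version>" followed by an empty line, and that the version defaults to HTTP/1.0 when omitted; `build_check_request` is a hypothetical helper, not an haproxy function. In the haproxy config the backslash in `Host:\ ` only escapes the space, so the header on the wire is `Host: sys-haproxytomcat-test:8080`.

```python
def build_check_request(method: str, uri: str, version: str = "HTTP/1.0") -> bytes:
    """Return the raw bytes of a minimal health-check request.

    `version` may carry extra header lines after the HTTP version, the
    way the second httpchk line appends a Host header after a CRLF.
    The HTTP/1.0 default is an assumption made for this sketch.
    """
    return f"{method} {uri} {version}\r\n\r\n".encode("ascii")


# First form: no Host header at all.
plain = build_check_request("GET", "/")

# Second form: HTTP/1.1 plus an explicit Host header.
with_host = build_check_request(
    "GET", "/", "HTTP/1.1\r\nHost: sys-haproxytomcat-test:8080"
)
```

Only the second request names a host, which fits the observation that the server answers quickly once it is told which name to use.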

This needs some more thought, but I think I'd like an option to add a
fixed Host header to the request, or maybe to take the address:port from
the server line of each backend server, since a per-server value can't
be specified in a single check line.

I got the clue from comparing wireshark captures of ab and haproxy, but I
should have guessed this was the case, as I was adding reverse entries
for the backend server, not the haproxy server.

Correcting the enableLookups typo is unrelated to the issue, but thanks
for spotting it. (It might be nice if tomcat warned about config that
was wrong.)

Thanks

Neil
On 20/02/11 00:09, Cyril Bonté wrote:
 Hi Neil and Willy,
 
 On Sunday 20 February 2011 01:01:26, Willy Tarreau wrote:
 But I still don't understand how you would like it to behave differently.
 Haproxy does no DNS lookup during health checks. From what I understand
 of your identification of the issue, it's caused by the tomcat server
 trying to resolve haproxy's address to a name (it's dangerous to have a
 webserver configured like that BTW).
 
 Neil sent me his tomcat configuration yesterday and I didn't see an issue, but
 with your last comment, I've reread the file to be sure.
 Well, Neil, there's an error in your Tomcat HTTP connector: please replace
 enableLookup with enableLookups, that should do the trick ;-)
 


Please access the attached hyperlink for an important electronic communications 
disclaimer: http://lse.ac.uk/emailDisclaimer



Re: HAProxy false negatives with tomcat httpchk Layer7 timeout

2011-02-19 Thread Neil Prockter
Hello

I think I've got to the bottom of this, thanks to the straces and some
lucky guesswork.

The trouble occurs when the backend server does not have an entry in the
DNS reverse lookup zone.

I've now got a cluster of identical backends: the ones that have reverse
lookup entries come back straight away, the ones without take over 5
seconds.

The main thing that tipped me off was running haproxy and tomcat on the
same server (which did not have a reverse lookup entry): the problem
could be solved by adding an /etc/hosts entry for the form of the name
used in the server config line.
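
One way to see which hosts are affected is to time a reverse lookup for each backend address. The following is a minimal Python sketch, meant to be run on the machine that does the slow lookup; `timed_reverse_lookup` and its `resolver` parameter are hypothetical names introduced for illustration.

```python
import socket
import time
from typing import Callable, Optional, Tuple


def timed_reverse_lookup(
    ip: str,
    resolver: Callable[[str], tuple] = socket.gethostbyaddr,
) -> Tuple[Optional[str], float]:
    """Resolve `ip` to a name and report how long the lookup took.

    Returns (hostname, seconds). hostname is None when the resolver
    reports that no reverse entry exists.
    """
    start = time.monotonic()
    try:
        # gethostbyaddr returns (hostname, aliases, addresses).
        hostname = resolver(ip)[0]
    except (socket.herror, socket.gaierror):
        hostname = None
    return hostname, time.monotonic() - start
```

Calling it with the address of each backend and printing the elapsed time should separate the hosts that answer straight away from those that stall for several seconds.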

I guess I'll add entries for all my servers but perhaps a fix in haproxy
is also warranted?

Many thanks,

Neil
On 19/02/11 06:42, Willy Tarreau wrote:
 Hello Neil,
 
 On Wed, Feb 16, 2011 at 05:45:00PM +0000, Neil Prockter wrote:
 Hello

 I'm using tomcat as a backend server and I'd like to use a httpchk.
 Because tomcat splits the response to the keepalive over a few packets,
 haproxy is marking it as down. tshark shows the response is a 200, it's
 just not in the first packet.
 
 That's not expected because haproxy 1.4 supports multiple-packet responses.
 
 (...)
 works, uncommenting the httpchk leads to

 [WARNING] 046/172220 (28956) : Server epd-rewrite/backend-0 is DOWN,
 reason: Layer7 timeout, check duration: 2003ms.
 [WARNING] 046/172221 (28956) : Server epd-rewrite/backend-1 is DOWN,
 reason: Layer7 timeout, check duration: 2004ms.
 
 Those really indicate that the read timeout has fired. We cannot exclude
 that you'd have spotted a bug though.
 
 I've looked at the mailing list archives and
 http://marc.info/?l=haproxy&m=126399109503224&w=2 seems relevant but I'm
 still having the same issue.
 
 This one was different: if you look closely, it failed immediately, precisely
 because haproxy only considered the first packet.
 
 Could you take a tcpdump capture of the check request/response so that
 we could try to reproduce exactly the same sequence ? Alternatively,
 the output of strace on the running process could also give a lot of
 indications.
 
 Regards,
 Willy
 





Re: HAProxy false negatives with tomcat httpchk Layer7 timeout

2011-02-19 Thread Neil Prockter
On 19/02/11 23:46, Willy Tarreau wrote:
...
 What fix do you mean ? Haproxy can't force your servers to disable DNS
 resolving.
 
 Regards,
 Willy
Fix is probably the wrong word to have used.  I don't think anything in
haproxy is broken.

To me haproxy's health checks are like an HTTP client.  Other HTTP
clients like ab, links and other web browsers do not seem to act in the
same manner as haproxy does, in that they don't do the same DNS lookups.
During the testing I did, ab got quick fetches from servers regardless
of whether they had reverse lookup entries.

Maybe, just maybe, haproxy's health checks could act in a similar way.

Regards,

Neil




Re: HAProxy false negatives with tomcat httpchk Layer7 timeout

2011-02-17 Thread Neil Prockter
Hello
On 16/02/11 20:29, Cyril Bonté wrote:
 Hi Neil,
 
 On Wednesday 16 February 2011 18:45:00, Neil Prockter wrote:
 Hello

 I'm using tomcat as a backend server and I'd like to use a httpchk.
 Because tomcat splits the response to the keepalive over a few packets,
 haproxy is marking it as down. tshark shows the response is a 200, it's
 just not in the first packet.
 
 I don't understand what you mean by keepalive for these checks.
Sorry, I meant the httpchk request.

 (...)
 #  option httpchk HEAD /
   server  backend-0 backend-0:8080 cookie epd0 check inter 2000 rise 2 fall 5
   server  backend-1 backend-1:8080 cookie epd1 check inter 2000 rise 2 fall 5
 (...) 
 uncommenting the httpchk leads to

 [WARNING] 046/172220 (28956) : Server epd-rewrite/backend-0 is DOWN,
 reason: Layer7 timeout, check duration: 2003ms.
 [WARNING] 046/172221 (28956) : Server epd-rewrite/backend-1 is DOWN,
 reason: Layer7 timeout, check duration: 2004ms.
 [ALERT] 046/172221 (28956) : proxy 'epd-rewrite' has no server available!

 Please could someone enlighten me as to which options/changes they've
 found effective.
 
 Are you sure that http://backend-0:8080/ and http://backend-1:8080/ answer
 in less than 2000 ms? Your logs say they take longer than that.
apachebench doing 1000 requests says the longest is 8ms (average 1.4ms). I'm
pretty sure haproxy thinks the httpchk fails because the reply is split
over 2 packets.
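
A check-like request can also be timed by hand, which makes it easier to compare against what ab measures. The sketch below is a minimal Python example under stated assumptions: `time_check` is a hypothetical helper, the 2-second default mirrors the `inter 2000` setting, and the timeout applies to each socket operation rather than to the whole exchange.

```python
import socket
import time


def time_check(host: str, port: int, request: bytes, timeout: float = 2.0):
    """Send `request` and return (status_line, seconds_elapsed).

    Raises a timeout error if connecting or any single read takes longer
    than `timeout` seconds, which roughly corresponds to the situation
    logged as "Layer7 timeout".
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        data = b""
        # Keep reading until the status line is complete, even if the
        # server spreads the response over several packets.
        while b"\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    status_line = data.split(b"\r\n", 1)[0].decode("latin-1")
    return status_line, time.monotonic() - start
```

Sending `b"GET / HTTP/1.0\r\n\r\n"` to backend-0 on port 8080 and comparing the elapsed time with a request that carries a Host header would show whether the delay depends on the request itself rather than on packet splitting.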
 
 Just to check here, which version of tomcat are you using ?
6.0.32 (I tried 6.0.24 first)

 Is your application behind those URLs, or the ROOT application provided by
 tomcat?
That URL is backed by just a ROOT with an index.html in it for now, so
it returns status 200.  I'll give it a proper test later.
 
 Also, unrelated to the issue, but:
 - I don't think you want to use cookie SERVERID rewrite
 - you should use option httpclose or option http-server-close to avoid
 issues with your cookie stickiness.
Thanks, I'll look at those. They are only there because I started
afresh with the example that comes with the ubuntu package.




HAProxy false negatives with tomcat httpchk Layer7 timeout

2011-02-16 Thread Neil Prockter
Hello

I'm using tomcat as a backend server and I'd like to use a httpchk.
Because tomcat splits the response to the keepalive over a few packets,
haproxy is marking it as down. tshark shows the response is a 200, it's
just not in the first packet.

However I'm confused.

I've used 1.4.8/9 with different options, in a kind of random
throw-it-at-them style, to get the http check to work in the past, but
I'm not clear what I'm meant to do. I've now got a new one to set up and
I'm not able to get it working.

=
global
  log 127.0.0.1  local0
  log 127.0.0.1  local1 notice
  maxconn 4096
  user haproxy
  group haproxy
  daemon

defaults
  log  global
  mode  http
  option  httplog
  option  dontlognull
  retries  3
  option redispatch
  maxconn  2000
  contimeout  5000
  clitimeout  50000
  srvtimeout  50000

listen  epd-rewrite 0.0.0.0:10001
  cookie  SERVERID rewrite
  balance  roundrobin
#  option httpchk HEAD /
  server  backend-0 backend-0:8080 cookie epd0 check inter 2000 rise 2 fall 5
  server  backend-1 backend-1:8080 cookie epd1 check inter 2000 rise 2 fall 5
=
This works; uncommenting the httpchk leads to

[WARNING] 046/172220 (28956) : Server epd-rewrite/backend-0 is DOWN,
reason: Layer7 timeout, check duration: 2003ms.
[WARNING] 046/172221 (28956) : Server epd-rewrite/backend-1 is DOWN,
reason: Layer7 timeout, check duration: 2004ms.
[ALERT] 046/172221 (28956) : proxy 'epd-rewrite' has no server available!

Please could someone enlighten me as to which options/changes they've
found effective.

I've looked at the mailing list archives and
http://marc.info/?l=haproxy&m=126399109503224&w=2 seems relevant but I'm
still having the same issue.

Thanks,

Neil

Neil Prockter
Systems Specialist IT Services
London School of Economics and Political Science
n.prock...@lse.ac.uk
+44 (0) 20 7849 4904
