Re: HAProxy false negatives with tomcat httpchk Layer7 timeout
Hello,

I've looked into this further, and the trouble is not with Tomcat looking up the client but with Tomcat looking up itself.

Changing the haproxy config from

    option httpchk GET /

to

    option httpchk GET / HTTP/1.1\r\nHost:\ sys-haproxytomcat-test:8080

makes the server return quickly.

This needs some more thought, but I think I'd like an option to add a fixed Host header to the request, or maybe to take the address:port from the server line of each backend server, as that can't be specified in a single check line.

I got the clue from comparing Wireshark captures of ab and haproxy, but I should have guessed this was the case, as I was adding reverse entries for the backend server, not the haproxy server.

Correcting the enableLookups typo is unrelated to the issue, but thanks for spotting it. (It might be nice if Tomcat warned about config that was wrong.)

Thanks,
Neil

On 20/02/11 00:09, Cyril Bonté wrote:
> Hi Neil and Willy,
>
> On Sunday 20 February 2011 01:01:26, Willy Tarreau wrote:
>> But I still don't understand how you would like it to behave differently.
>> Haproxy does no DNS lookup during health checks. From what I understand
>> of your identification of the issue, it's caused by the tomcat server
>> trying to resolve haproxy's address to a name (it's dangerous to have a
>> webserver configured like that BTW).
>
> Neil sent me his tomcat configuration yesterday and I didn't see an issue,
> but with your last comment, I've reread the file to be sure.
> Well, Neil, there's an error in your Tomcat HTTP connector: please replace
> enableLookup by enableLookups, that should do the trick ;-)

Please access the attached hyperlink for an important electronic communications disclaimer: http://lse.ac.uk/emailDisclaimer
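To make the httpchk change in this message concrete, here is a minimal Python sketch of how the two check lines differ on the wire. It assumes the documented haproxy 1.4 forms of the directive (`option httpchk <uri>`, `<method> <uri>`, `<method> <uri> <version>`), that the default version is HTTP/1.0, and that the config parser turns `\r`, `\n` and `\ ` into CR, LF and a literal space. The function names `split_config_args` and `build_check_request` are hypothetical helpers for illustration, not part of haproxy.

```python
from typing import List


def split_config_args(line: str) -> List[str]:
    """Split a config line on unescaped whitespace, resolving backslash escapes.

    Assumption: escapes behave as in haproxy's config syntax, where "\\ " is a
    literal space and "\\r" / "\\n" are CR / LF.
    """
    escapes = {"r": "\r", "n": "\n", "t": "\t", " ": " ", "\\": "\\", "#": "#"}
    args: List[str] = []
    current: List[str] = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            nxt = line[i + 1]
            current.append(escapes.get(nxt, nxt))
            i += 2
        elif ch.isspace():
            if current:
                args.append("".join(current))
                current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    if current:
        args.append("".join(current))
    return args


def build_check_request(httpchk_args: str) -> bytes:
    """Approximate the request sent for `option httpchk <httpchk_args>`."""
    args = split_config_args(httpchk_args)
    if len(args) == 1:
        method, uri, version = "OPTIONS", args[0], "HTTP/1.0"
    elif len(args) == 2:
        method, uri, version = args[0], args[1], "HTTP/1.0"
    elif len(args) == 3:
        method, uri, version = args
    else:
        raise ValueError("expected 1 to 3 arguments")
    # The version argument is copied as-is, so a Host header embedded in it
    # ends up as an ordinary header line in the request.
    return f"{method} {uri} {version}\r\n\r\n".encode("ascii")
```

With `GET /` the sketch produces a bare HTTP/1.0 request with no Host header; with the longer form from this message it produces an HTTP/1.1 request carrying `Host: sys-haproxytomcat-test:8080`, which is the only difference the server sees.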
Re: HAProxy false negatives with tomcat httpchk Layer7 timeout
Hello,

I think I've got to the bottom of this, thanks to the straces and some lucky guesswork.

The trouble occurs when the backend server does not have an entry in the DNS reverse lookup zone. I've now got a cluster of identical backends: the ones that have reverse lookup entries come back straight away, and the ones without take over 5 seconds.

The main thing that tipped me off was running haproxy and tomcat on the same server (which did not have a reverse lookup entry): the problem could be solved by adding an /etc/hosts entry for the form of the name used in the server config line.

I guess I'll add entries for all my servers, but perhaps a fix in haproxy is also warranted?

Many thanks,
Neil

On 19/02/11 06:42, Willy Tarreau wrote:
> Hello Neil,
>
> On Wed, Feb 16, 2011 at 05:45:00PM +, Neil Prockter wrote:
>> Hello
>> I'm using tomcat as a backend server and I'd like to use a httpchk.
>> Because tomcat splits the response to the keepalive over a few packets,
>> haproxy is marking it as down. tshark shows the response is a 200, just
>> it's not in the first packet.
>
> That's not expected because haproxy 1.4 supports multiple-packet responses.
>
>> (...)
>> works, uncommenting the httpchk leads to
>> [WARNING] 046/172220 (28956) : Server epd-rewrite/backend-0 is DOWN, reason: Layer7 timeout, check duration: 2003ms.
>> [WARNING] 046/172221 (28956) : Server epd-rewrite/backend-1 is DOWN, reason: Layer7 timeout, check duration: 2004ms.
>
> Those really indicate that the read timeout has fired. We cannot exclude
> that you'd have spotted a bug though.
>
>> I've looked at the mailing list archives and
>> http://marc.info/?l=haproxy&m=126399109503224&w=2 seems relevant but I'm
>> still having the same issue.
>
> This one was different: if you notice, it immediately failed precisely
> because haproxy did only consider the first packet.
>
> Could you take a tcpdump capture of the check request/response so that we
> could try to reproduce exactly the same sequence? Alternatively, the output
> of strace on the running process could also give a lot of indications.
>
> Regards,
> Willy
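The symptom described in this message (instant answers where a reverse entry exists, roughly 5 seconds where it does not) can be checked from the machine running Tomcat by timing a reverse lookup directly. The following is a minimal Python sketch using only the standard library; `time_reverse_lookup` is a hypothetical helper name, and the resolver is injectable purely so that the function can be exercised without touching real DNS.

```python
import socket
import time
from typing import Callable, Optional, Tuple


def time_reverse_lookup(
    ip: str,
    resolver: Callable[[str], tuple] = socket.gethostbyaddr,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[Optional[str], float]:
    """Return (hostname or None, elapsed seconds) for a reverse lookup of ip.

    socket.gethostbyaddr consults the system resolver, so an /etc/hosts entry
    is normally picked up as well as a DNS PTR record (subject to the host's
    name service configuration).
    """
    start = clock()
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        name = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError
        name = None
    return name, clock() - start
```

Calling `time_reverse_lookup("192.0.2.10")` with the address of a real backend in place of that documentation address should return quickly where a reverse entry or hosts entry exists; a delay of several seconds with `None` as the name would be consistent with the behaviour reported here.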
Re: HAProxy false negatives with tomcat httpchk Layer7 timeout
On 19/02/11 23:46, Willy Tarreau wrote:
> ...
> What fix do you mean? Haproxy can't force your servers to disable DNS
> resolving.
>
> Regards,
> Willy

"Fix" is probably the wrong word to have used. I don't think anything in haproxy is broken.

To me, haproxy's health checks are like an http client. Other http clients such as ab, links and other web browsers do not seem to act in the same manner as haproxy does, in that they don't do the same DNS lookups. During the testing I did, ab got quick fetches from servers regardless of whether they had reverse lookup entries.

Maybe, just maybe, haproxy's health checks could act in a similar way.

Regards,
Neil
Re: HAProxy false negatives with tomcat httpchk Layer7 timeout
Hello

On 16/02/11 20:29, Cyril Bonté wrote:
> Hi Neil,
>
> On Wednesday 16 February 2011 18:45:00, Neil Prockter wrote:
>> Hello
>> I'm using tomcat as a backend server and I'd like to use a httpchk.
>> Because tomcat splits the response to the keepalive over a few packets,
>> haproxy is marking it as down. tshark shows the response is a 200, just
>> it's not in the first packet.
>
> I don't understand what you mean with keepalive for this checks.

Sorry, I meant the httpchk request.

>> (...)
>>     # option httpchk HEAD /
>>     server backend-0 backend-0:8080 cookie epd0 check inter 2000 rise 2 fall 5
>>     server backend-1 backend-1:8080 cookie epd1 check inter 2000 rise 2 fall 5
>> (...)
>> uncommenting the httpchk leads to
>> [WARNING] 046/172220 (28956) : Server epd-rewrite/backend-0 is DOWN, reason: Layer7 timeout, check duration: 2003ms.
>> [WARNING] 046/172221 (28956) : Server epd-rewrite/backend-1 is DOWN, reason: Layer7 timeout, check duration: 2004ms.
>> [ALERT] 046/172221 (28956) : proxy 'epd-rewrite' has no server available!
>>
>> Please could someone enlighten me as to which options/changes they've
>> found effective.
>
> Are you sure that http://backend-0:8080/ and http://backend-1:8080/ answer
> in less than 2000 ms? Your logs say they're longer than that.

apachebench doing 1000 requests says the longest is 8ms (average 1.4ms). I'm pretty sure haproxy thinks the httpchk fails because the reply is split over 2 packets.

> Just to check here, which version of tomcat are you using?

6.0.32 (I tried 6.0.24 first)

> Is your application behind those urls or the ROOT application provided by
> tomcat?

That url is backed by just a ROOT with an index.html in it for now, so it returns status 200. I'll give it a proper test later.

> Also, nothing to do with the issue, but:
> - I don't think you want to use cookie SERVERID rewrite
> - you should use option httpclose or option http-server-close to not have
>   issues with your cookie stickiness.

Thanks, I'll look at those. They came to be there just because I started afresh with the example that comes with the ubuntu package.
HAProxy false negatives with tomcat httpchk Layer7 timeout
Hello

I'm using tomcat as a backend server and I'd like to use a httpchk. Because tomcat splits the response to the keepalive over a few packets, haproxy is marking it as down. tshark shows the response is a 200, just it's not in the first packet.

However, I'm confused. I've used 1.4.8/9 with different options, in a kind of random throw-it-at-them style, http to get it to work in the past, but I'm not clear what I'm meant to do, and I've got a new one to set up that I'm not able to get working.

=
global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    maxconn 4096
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 5
    srvtimeout 5

listen epd-rewrite 0.0.0.0:10001
    cookie SERVERID rewrite
    balance roundrobin
    # option httpchk HEAD /
    server backend-0 backend-0:8080 cookie epd0 check inter 2000 rise 2 fall 5
    server backend-1 backend-1:8080 cookie epd1 check inter 2000 rise 2 fall 5
=

works, uncommenting the httpchk leads to

[WARNING] 046/172220 (28956) : Server epd-rewrite/backend-0 is DOWN, reason: Layer7 timeout, check duration: 2003ms.
[WARNING] 046/172221 (28956) : Server epd-rewrite/backend-1 is DOWN, reason: Layer7 timeout, check duration: 2004ms.
[ALERT] 046/172221 (28956) : proxy 'epd-rewrite' has no server available!

Please could someone enlighten me as to which options/changes they've found effective.

I've looked at the mailing list archives and http://marc.info/?l=haproxy&m=126399109503224&w=2 seems relevant, but I'm still having the same issue.

Thanks,
Neil

Neil Prockter
Systems Specialist
IT Services
London School of Economics and Political Science
n.prock...@lse.ac.uk
+44 (0) 20 7849 4904