Re: Backend redirection issue - log interpretation help please
Cyril, thanks for your response. On Jul 26, 2012, at 4:46 PM, Cyril Bonté wrote: Please add log global in your backend sections (or in your defaults), this explains why your log files didn't give you any indication. After adding this line everywhere (not only in the frontend), you'll see when servers go UP and DOWN, and why. This will also probably help us to know what happens. Done. That helps - we tend to be a bit minimal on most logging for compliance reasons but that's certainly going to help. From this previous log line, it looks like something becomes slow (your haproxy server or your backend servers). That's the odd piece, because the server logs don't seem to indicate any issues (and this was last seen shortly after a restart). Hopefully the logging will shed more light on the subject. Wow, are you sure you really want to use such a big buffer size ? Also, ensure you're running the last stable version of HAProxy (currently 1.4.21), which fixes a major bug when using a larger buffer size (it doesn't explain what you observe but it's an advise for more stability). Unfortunately yes; we are supporting some rare but critical very large HTTP GET requests over a REST API. I'll look into upgrading shortly. For more details : http://haproxy.1wt.eu/git?p=haproxy-1.4.git;a=commit;h=30297cb17147a8d339eb160226bcc08c91d9530b Good to know! As said at the beginning, please add : log global Added. Is there a useful log level configuration that outputs exactly what the defaults are without the stats connection lines? If using cookies is not an issue for your clients, I'd recommend you not to use appsession but cookie insert or cookie prefix instead. http://cbonte.github.com/haproxy-dconv/configuration-1.4.html#4-cookie We'll probably leave this one just because I don't believe its giving us any issues at the moment; the client problem definitely coincided with the backend being marked as DOWN rather than a cookie issue (the cookie issue was my first guess to be honest). server gui1 172.25.200.53:8080 check maxconn 2000 server gui2 172.25.200.54:8080 check maxconn 2000 You didn't provide any timeout check nor inter value. The default will be 2 seconds, which is maybe too low for your case. It shouldn't be - our healthcheck page is fairly simple and just basically makes sure that our webapp is responding to requests (barely more than a static file) - I've upped timeout check to 1 though, so we'll see if that makes a difference. Hope this helps. It did, and thank you for looking at this. I've learned an awful lot about haproxy configuration setups (good and bad) from this list! -Richard
Re: Backend redirection issue - log interpretation help please
Hi Richard, Le 26/07/2012 13:46, Richard Stanford a écrit : This morning I was able to reproduce it even after restarting our appserver proceses and our haproxy process (which had been up for an extended period of time - thanks!). When it reproduced, I noticed that the backend appserver was marked as DOWN in the statistics report even though the logfiles gave no indication that there were any problems. Please add log global in your backend sections (or in your defaults), this explains why your log files didn't give you any indication. After adding this line everywhere (not only in the frontend), you'll see when servers go UP and DOWN, and why. This will also probably help us to know what happens. From the user's experience, the system hung waiting for a response for some 40 seconds and then showed a login screen, which was served when the incorrect backend received a page request. Simply re-requesting the previous page was successful, as the previously authenticated backend was shown as UP by that point. In the logfile we see the following entry at the appropriate time: Jul 26 06:21:09 localhost haproxy[3853]: 172.25.200.177:3124 [26/Jul/2012:06:21:01.826] secure gui/gui2 4993/0/0/-1/7992 -1 0 - - CH-- 40/40/1/1/0 0/0 HEAD /panel/healthcheck.html HTTP/1.1 From this previous log line, it looks like something becomes slow (your haproxy server or your backend servers). Configuration file below in full in case there's something odd: global maxconn 2 tune.bufsize 128000 Wow, are you sure you really want to use such a big buffer size ? Also, ensure you're running the last stable version of HAProxy (currently 1.4.21), which fixes a major bug when using a larger buffer size (it doesn't explain what you observe but it's an advise for more stability). For more details : http://haproxy.1wt.eu/git?p=haproxy-1.4.git;a=commit;h=30297cb17147a8d339eb160226bcc08c91d9530b defaults mode http timeout connect 5s timeout client 1m timeout server 10m option srvtcpka option http-server-close option http-pretend-keepalive option redispatch option abortonclose option httpchk HEAD /panel/healthcheck.txt HTTP/1.0 As said at the beginning, please add : log global backend gui balance source appsession JSESSIONID len 64 timeout 3h If using cookies is not an issue for your clients, I'd recommend you not to use appsession but cookie insert or cookie prefix instead. http://cbonte.github.com/haproxy-dconv/configuration-1.4.html#4-cookie server gui1 172.25.200.53:8080 check maxconn 2000 server gui2 172.25.200.54:8080 check maxconn 2000 You didn't provide any timeout check nor inter value. The default will be 2 seconds, which is maybe too low for your case. http://cbonte.github.com/haproxy-dconv/configuration-1.4.html#inter http://cbonte.github.com/haproxy-dconv/configuration-1.4.html#timeout%20check Hope this helps. -- Cyril Bonté