[ Mail to the list appeared blocked last time I tried due to use of SORBS as a definite indicator of spam. Admin - this is not a good idea, especially when large mail provider addresses get blacklisted! ]
On 11 February 2013 16:25, Amol <[email protected]> wrote:
> Hi,
> so i have about 6 app servers running apache and 1 load balancer running
> haproxy 1.4.10

[ Think about upgrading. You're running a 2.5 year-old version of the stable branch. ]

> the issue i see lately is that even if one of the app server is having
> issues such as running out of memory or disk space etc..the load balancer
> keeps send the traffic to that app server
> should it not detect if the app server is unreachable or not responding then
> do not send traffic to the server?

If your server is responding with an HTTP 200 for the check.txt file, then the server will be considered "up" as far as HAProxy is concerned.

I suggest you should make the backend requests for your check URI run code and check states that your application server will rely on for normal requests. In your situation, unless you're running a static-file-from-disk download service, this doesn't look like it's the case.

I can easily imagine your backends serving check.txt from page cache whilst *actual* requests, which make DB requests / run code / call out to other services, are timing out. Whether that's from resource contention, IO starvation, network constraints or whatever it might be - it doesn't matter.

If you make your check page rely on those same resources, then HAProxy will take a backend server out of rotation at the same time that normal requests start failing.

Of course, your check page can skip the "boring" parts of normal page rendering, like wrapping it up into HTML and putting correct image links in. Or can it? What happens if you, say, upgrade an HTML-generation library on one machine as a test? Personally, I'd *want* the check page to fail at that point.

The art of making a really good service check page is an interesting one. Too much detail or overhead, and you lose the granularity of making HAProxy confirm a backend is up every, say, couple of seconds (that's how long I like to configure).
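
To make the idea of a check page that depends on real resources a bit more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not anything from the setup being discussed: the names `run_checks`, `disk_has_space`, `CHECKS`, `MIN_FREE_BYTES` and the port 8080 are all made up for the example, and the only check shown is free disk space. A real page would add checks for whatever the application's normal requests rely on (database, caches, downstream services). The handler returns 200 only when every check passes and 503 otherwise, which an HTTP health check in HAProxy would treat as a failure.

```python
import shutil
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical threshold; tune to what the application actually needs.
MIN_FREE_BYTES = 100 * 1024 * 1024


def disk_has_space(path="/", min_free=MIN_FREE_BYTES):
    """Return True if the filesystem holding `path` has enough free space."""
    return shutil.disk_usage(path).free >= min_free


def run_checks(checks):
    """Run each named check and return (http_status, body).

    A check passes only if it returns a truthy value without raising.
    """
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    if failed:
        return 503, "FAIL: " + ", ".join(failed) + "\n"
    return 200, "OK\n"


# Placeholder set of checks: a real page would also touch the database,
# caches and any other services that normal requests depend on.
CHECKS = {"disk": disk_has_space}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = run_checks(CHECKS)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Serve the check page; the port is an arbitrary choice for this sketch."""
    HTTPServer(("", port), HealthHandler).serve_forever()
```

Calling `serve()` starts the listener. Each extra entry in `CHECKS` adds latency to every probe, which is exactly the detail-versus-frequency trade-off described above, so keep each check cheap and give it its own short timeout.
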
Too /little/ detail, and you may carry on serving an HTTP 200 for the check page when the rest of the server is broken. It totally comes down to your judgement, and the operational trade-offs you and the business around you need to make.

Cheers,
Jonathan
-- 
Jonathan Matthews // Oxford, London, UK
http://www.jpluscplusm.com/contact.html

