FW: haproxy conditional healthchecks/failover

2012-05-29 Thread Zulu Chas

am I wildly off course or is this config salvageable?

  Hi!
 
  I'm trying to use HAproxy to support the concepts of offline, in
  maintenance mode, and not working servers.
 
 Any good reason to do that???
 (I'm a bit curious)

Sure.  I want to be able to mark a machine offline by creating a file (as
opposed to marking it online by creating a file), which is why I can't use
disable-on-404 below.  This covers situations where I need to take a machine
out of public-facing operation for some reason, but perhaps I still want it to
be able to render pages etc. -- maybe I'm testing a code deployment once it's
already deployed in order to verify the system is ready to be marked online.

I also want to be able to mark a machine down for maintenance by creating a
file, maintenance.html, which Apache will nicely rewrite URLs to during
critical deployment phases or when performing other maintenance.  In this case,
I don't want it to render pages (usually to replace otherwise nasty-looking 500
error pages with a nice HTML facade).

For normal operations, I want the machine to be up.  But if it's not
intentionally placed offline or in maintenance and the machine fails
heartbeat checks, then the machine is not working and should not be served
requests.

Does this make sense?
 
   I have separate health checks
  for each condition and I have been trying to use ACLs to be able to switch
  between backends.  In addition to the fact that this doesn't seem to work,
  I'm also not loving having to repeat the server lists (which are the same)
  for each backend.
 
 Nothing weird here, this is how HAProxy configuration works.

Cool, but variables would be nice to save time and avoid potential
inconsistencies between sections.

  -- I think it's more like if any of
  these succeed, mark this server online -- and that's what's making this
  scenario complex.
 
 Uh, I might be misunderstanding something.
 There is nothing simpler than this: if the health check is successful,
 then the server is considered healthy...

Since it's not strictly binary, as described above, it's a bit more complex.

  frontend staging 0.0.0.0:8080
# if the number of servers *not marked offline* is *less than the total
# number of app servers* (in this case, 2), then it is considered degraded
acl degraded nbsrv(only_online) lt 2
 
 
 This will match 0 and 1
 
# if the number of servers *not marked offline* is *less than one*, the
# site is considered down
acl down nbsrv(only_online) lt 1
 
 
 This will match 0, so both your down and degraded ACLs cover the
 same value (0), which may lead to an issue later.
 
# if the number of servers without the maintenance page is *less than the
# total number of app servers* (in this case, 2), then it is
# considered maintenance mode
acl mx_mode nbsrv(maintenance) lt 2
 
# if the number of servers without the maintenance page is less than 1,
# we're down because everything is in maintenance mode
acl down_mx nbsrv(maintenance) lt 1
 
 
 Same remark as above.
 
 
# if not running at full potential, use the backend that identified the
# degraded state
use_backend only_online if degraded
use_backend maintenance if mx_mode
 
# if we are down for any reason, use the backend that identified that fact
use_backend backup_only if down
use_backend backup_only if down_mx
 
 
 Here is the problem (see above).
 The 2 use_backend rules above will NEVER match, because the degraded and
 mx_mode ACLs overlap their values!

Why would they never match?  Aren't you saying they *both* should match and 
wouldn't it then take action on the final match and switch the backend to 
maintenance mode?  That's what I want.  Maintenance mode overrides offline mode 
as a failsafe (since it's more restrictive) to prevent page rendering.

 Do you know the disable-on-404 option?
 It may help you set up your configuration the right way (not
 considering a 404 as a healthy response).
 

Yes, but what I actually would need is enable-on-404 :)

Thanks for your feedback!  I'm definitely open to other options, but I'm hoping
not to have to lose the flexibility described above!

-chaz

  

Re: FW: haproxy conditional healthchecks/failover

2012-05-29 Thread Willy Tarreau
On Tue, May 29, 2012 at 08:32:29PM +, Zulu Chas wrote:
 
 am I wildly off course or is this config salvageable?
 

To be honest, your mail with overly long lines (half a kilobyte) is painful
to read, and once I made the effort of reading it, I didn't understand why
you're trying to cross-dress something which already exists and works.
 
The disable-on-404 option is made to permit enabling/disabling a server by a
simple touch or rm. It appears that you want to exactly swap these two
commands; it really makes no sense to me to modify haproxy to support such a
swap in a script.

Another reason for disabling on 404 is that it will not accidentally enable a
server which was started from an unmounted docroot file system. With your
method, it would still start it.

Also, the suggested way of dealing with very specific health checks is to
write a CGI or servlet to handle the various situations. Most people are
already doing this, and if you absolutely want to use rm to start the
server and touch to stop it, then 5 lines of shell in a CGI will do it.

Regards,
Willy
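
For readers of the archive, the "5 lines of shell in a CGI" idea above might
look roughly like the sketch below. This is only an illustration under stated
assumptions, not a script from the thread: the DOCROOT variable, the marker
file name "offline", and the use of index.html as a sentinel that proves the
docroot is really mounted are all hypothetical. Only maintenance.html comes
from the original mail. The script reports 503 when either marker file exists
or the sentinel is missing, and 200 otherwise, so an HTTP health check such as
"option httpchk" pointed at it would see the server as up only when no marker
file is present.

```shell
#!/bin/sh
# Hypothetical health-check CGI; every path and file name here is an assumption.
DOCROOT="${DOCROOT:-/var/www/html}"

health_status() {
    # $1 = docroot directory to inspect
    if [ ! -r "$1/index.html" ]; then
        # Sentinel missing: docroot may be unmounted, so do not report healthy.
        echo "503 Service Unavailable"
    elif [ -e "$1/maintenance.html" ]; then
        # Maintenance marker present (takes precedence over offline).
        echo "503 Service Unavailable"
    elif [ -e "$1/offline" ]; then
        # Offline marker present: "touch" stops the server, "rm" starts it.
        echo "503 Service Unavailable"
    else
        echo "200 OK"
    fi
}

# Emit a CGI response; the Status header sets the HTTP status code.
status=$(health_status "$DOCROOT")
printf 'Status: %s\r\n' "$status"
printf 'Content-Type: text/plain\r\n\r\n'
printf '%s\n' "$status"
```

With this layout, "touch $DOCROOT/offline" takes the server out of rotation
and "rm $DOCROOT/offline" puts it back, while the sentinel check keeps an
empty mount point from being reported as healthy.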