Hi!
I'm trying to use HAProxy to support the concepts of "offline", "in maintenance
mode", and "not working" servers. I have separate health checks for each
condition and I have been trying to use ACLs to be able to switch between
backends. In addition to the fact that this doesn't seem to work, I'm also not
loving having to repeat the server lists (which are the same) for each backend.
But perhaps I'm misunderstanding something fundamental here about how I should
be tackling this. As far as I can tell, multiple httpchk checks in one backend
do not combine as "if any of these fail, mark this server offline"; I think
it's more like "if any of these succeed, mark this server online", and that's
what makes this scenario complex. That is, /check can pass while I have
marked the server offline manually, or while I'm in the process of deploying
and /maintenance.html exists, so it's not a strictly boolean (online/offline)
issue.
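To make that concrete, here is a rough sketch of what I would ideally like to express: one backend where a server only counts as up when all three conditions hold. The backend name "combined" is made up for illustration, and as far as I can tell HAProxy does not treat multiple httpchk lines as "all must pass", so I don't expect this to behave the way the comments describe.

```
# HYPOTHETICAL: illustrates the intent only, not a working configuration
backend combined
    # wanted: server is up only if /offline is absent (404) ...
    option httpchk HEAD /offline HTTP/1.0
    # ... and /maintenance.html is absent (404) ...
    option httpchk HEAD /maintenance.html HTTP/1.0
    # ... and /check succeeds
    option httpchk HEAD /check HTTP/1.0
    server App1 app1:8080 check inter 5000 rise 2 fall 2
    server App2 app2:8080 check inter 5000 rise 2 fall 2
```

If something like this were possible, I would not need to repeat the server list once per condition.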
Here's the setup:

global
    maxconn 1024
    log 127.0.0.1 local0 notice
    spread-checks 5
    daemon
    user haproxy

defaults
    log global
    mode http
    balance leastconn
    maxconn 500
    option httplog
    option abortonclose
    option httpclose
    option forwardfor
    retries 3
    option redispatch
    timeout client 1m
    timeout connect 30s
    timeout server 1m
    stats enable
    stats uri /haproxy?stats
    stats auth hauser:hapasswd
    monitor-uri /haproxy?monitor
    timeout check 10000

frontend staging 0.0.0.0:8080
    # if the number of servers *not marked offline* is *less than the total
    # number of app servers* (in this case, 2), then it is considered degraded
    acl degraded nbsrv(only_online) lt 2

    # if the number of servers *not marked offline* is *less than one*, the
    # site is considered down
    acl down nbsrv(only_online) lt 1

    # if the number of servers without the maintenance page is *less than the
    # total number of app servers* (in this case, 2), then it is considered
    # maintenance mode
    acl mx_mode nbsrv(maintenance) lt 2

    # if the number of servers without the maintenance page is less than 1,
    # we're down because everything is in maintenance mode
    acl down_mx nbsrv(maintenance) lt 1

    # if not running at full potential, use the backend that identified the
    # degraded state
    use_backend only_online if degraded
    use_backend maintenance if mx_mode

    # if we are down for any reason, use the backend that identified that fact
    use_backend backup_only if down
    use_backend backup_only if down_mx

    # by default, use 'normal ops'
    default_backend normal

backend only_online
    # if /offline exists, the server has been intentionally marked as offline
    option httpchk HEAD /offline HTTP/1.0
    http-check expect status 404
    http-check send-state
    server App1 app1:8080 check inter 5000 rise 2 fall 2
    server App2 app2:8080 check inter 5000 rise 2 fall 2

backend maintenance
    # if /maintenance.html exists, the server is in maintenance mode
    option httpchk HEAD /maintenance.html HTTP/1.0
    http-check expect status 404
    http-check send-state
    server App1 app1:8080 check inter 2000 rise 2 fall 2
    server App2 app2:8080 check inter 2000 rise 2 fall 2

backend normal
    cookie SESSIONID insert indirect
    option httpchk HEAD /check HTTP/1.0
    http-check send-state
    server App1 app1:8080 cookie A check inter 10000 rise 2 fall 2
    server App2 app2:8080 cookie B check inter 10000 rise 2 fall 2
    server Backup1 app3:8080 cookie C check inter 10000 rise 2 fall 2 backup

backend backup_only
    option httpchk HEAD /check HTTP/1.0
    http-check send-state
    server Backup1 app3:8080 check inter 2000 rise 2 fall 2