Allow an agent check to be run in conjunction with one other server
health check.

If the check type of a server's backend is not lb-agent-chk then an
agent check may also be run by passing the agent-port parameter to the
server, which sets the TCP port to be used for the agent check.

e.g.
server  web1_1 127.0.0.1:80 check agent-port 10000

The agent-inter parameter may also be used to specify the interval
and timeout for agent checks.

If either the health or agent check determines that a server is down
then it is marked as being down, otherwise it is marked as being up.
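
The rule above can be sketched in isolation as follows. This is a minimal
illustration and not the code in src/checks.c: struct ex_check and
ex_server_is_up() are hypothetical names, and the sketch assumes that each
check keeps a health counter which counts as passing once it reaches the
server's rise value, mirroring the condition this patch adds to
set_server_up().

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for a per-check state. */
struct ex_check {
	int health;	/* 0 = failing, >= rise = passing */
};

/* A server is up only if both the health check and the agent check
 * are passing; either one failing is enough to mark it down. */
static bool ex_server_is_up(const struct ex_check *health,
			    const struct ex_check *agent, int rise)
{
	return health->health >= rise && agent->health >= rise;
}
```

Because the patch initialises agent.health to the rise value, a server
without an agent check is not held down by this condition.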

Signed-off-by: Simon Horman <ho...@verge.net.au>

---

v4
*  Increment global.maxsock for agent-port.

   If agent-port is configured then an extra socket is required.

* Do not send requests to secondary agent checks

  The request configuration of a proxy relates to the primary health
  check and should not be sent to the secondary health check if it
  is in operation.

* Correct usage of PR_O2_LB_AGENT_CHK

  The correct way to check for PR_O2_LB_AGENT_CHK is
  not to use x & PR_O2_LB_AGENT_CHK, but rather to use
  (x & PR_O2_CHK_ANY ) == PR_O2_LB_AGENT_CHK.
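
  A minimal sketch of why the masked comparison matters. The EX_* values
  below are hypothetical stand-ins for the PR_O2_* constants, chosen so
  that two check types share a bit; the real values are defined in
  HAProxy's headers and may differ.

```c
#include <stdbool.h>

/* Hypothetical values, for illustration only. */
#define EX_OTHER_CHK     0x10000000u	/* some other check type */
#define EX_LB_AGENT_CHK  0x30000000u	/* shares a bit with EX_OTHER_CHK */
#define EX_CHK_ANY       0xF0000000u	/* mask over the check-type field */

/* Wrong: true for any check type that shares a bit with the agent type. */
static bool ex_is_agent_chk_wrong(unsigned int options2)
{
	return (options2 & EX_LB_AGENT_CHK) != 0;
}

/* Right: isolate the whole check-type field, then compare for equality. */
static bool ex_is_agent_chk(unsigned int options2)
{
	return (options2 & EX_CHK_ANY) == EX_LB_AGENT_CHK;
}
```

  With these values the bitwise test reports EX_OTHER_CHK as an agent
  check, while the masked comparison does not.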

v2 - v3
* No change
---
 doc/configuration.txt  |   61 +++++++++++++++++++++++++++++++-------
 include/types/server.h |    1 +
 src/cfgparse.c         |   76 ++++++++++++++++++++++++++++++++++++++++++++++--
 src/checks.c           |   20 ++++++++++---
 src/haproxy.c          |    6 ++++
 5 files changed, 147 insertions(+), 17 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 5297cfd..acb356d 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -777,11 +777,12 @@ nosplice
   "option splice-response".
 
 spread-checks <0..50, in percent>
-  Sometimes it is desirable to avoid sending health checks to servers at exact
-  intervals, for instance when many logical servers are located on the same
-  physical server. With the help of this parameter, it becomes possible to add
-  some randomness in the check interval between 0 and +/- 50%. A value between
-  2 and 5 seems to show good results. The default value remains at 0.
+  Sometimes it is desirable to avoid sending agent and health checks to
+  servers at exact intervals, for instance when many logical servers are
+  located on the same physical server. With the help of this parameter, it
+  becomes possible to add some randomness in the check interval between 0
+  and +/- 50%. A value between 2 and 5 seems to show good results. The
+  default value remains at 0.
 
 tune.bufsize <number>
   Sets the buffer size to this size (in bytes). Lower values allow more
@@ -3949,6 +3950,9 @@ option lb-agent-chk
   information described above. Otherwise the port of the serivce will be
   used.
 
+  This option conflicts with the agent-port parameter to a server.
+
+
 option ldap-check
   Use LDAPv3 health checks for server testing
   May be used in sections :   defaults | frontend | listen | backend
@@ -7648,6 +7652,43 @@ addr <ipv4|ipv6>
 
   Supported in default-server: No
 
+agent-inter <delay>
+  The "agent-inter" parameter sets the interval between two agent checks
+  to <delay> milliseconds. If left unspecified, the delay defaults to 2000 ms.
+
+  Just as with every other time-based parameter, it may be entered in any
+  other explicit unit among { us, ms, s, m, h, d }. The "agent-inter"
+  parameter also serves as a timeout for agent checks if "timeout check" is
+  not set. In order to reduce "resonance" effects when multiple servers are
+  hosted on the same hardware, the agent and health checks of all servers
+  are started with a small time offset between them. It is also possible to
+  add some random noise in the agent and health checks interval using the
+  global "spread-checks" keyword. This makes sense for instance when a lot
+  of backends use the same servers.
+
+  Requires the "agent-port" parameter to be set.
+
+  Supported in default-server: Yes
+
+agent-port <port>
+  Using the "agent-port" parameter, it becomes possible to run an agent
+  check in conjunction with a regular health check. In this scenario the
+  "agent-port" parameter specifies the TCP port that an agent check should
+  connect to. Typically this is different to the port of the service and
+  health-check.
+
+  For a description of agent checks and the protocol used by them see the
+  description of "option lb-agent-chk".
+
+  In this scenario the agent check is run in conjunction with another check
+  and as such the check backend should be set to some value other than
+  "lb-agent-chk". An alternative scenario is to run only an agent check
+  in which case the check backend should be set to "lb-agent-chk" and
+  "agent-port" should not be set; in that scenario the port may be set
+  using the "port" parameter.
+
+  See also the "agent-inter" parameter.
+
 backup
   When "backup" is present on a server line, the server is only used in load
   balancing when all other non-backup servers are unavailable. Requests coming
@@ -7824,11 +7865,11 @@ downinter <delay>
   other explicit unit among { us, ms, s, m, h, d }. The "inter" parameter also
   serves as a timeout for health checks sent to servers if "timeout check" is
   not set. In order to reduce "resonance" effects when multiple servers are
-  hosted on the same hardware, the health-checks of all servers are started
-  with a small time offset between them. It is also possible to add some random
-  noise in the health checks interval using the global "spread-checks"
-  keyword. This makes sense for instance when a lot of backends use the same
-  servers.
+  hosted on the same hardware, the agent and health checks of all servers
+  are started with a small time offset between them. It is also possible to
+  add some random noise in the agent and health checks interval using the
+  global "spread-checks" keyword. This makes sense for instance when a lot
+  of backends use the same servers.
 
   Supported in default-server: Yes
 
diff --git a/include/types/server.h b/include/types/server.h
index 3a0281b..1eec648 100644
--- a/include/types/server.h
+++ b/include/types/server.h
@@ -191,6 +191,7 @@ struct server {
        } check_common;
 
        struct check check;                     /* health-check specific configuration */
+       struct check agent;                     /* agent specific configuration */
 
 #ifdef USE_OPENSSL
        int use_ssl;                            /* ssl enabled */
diff --git a/src/cfgparse.c b/src/cfgparse.c
index 2a70a17..ab309cf 100644
--- a/src/cfgparse.c
+++ b/src/cfgparse.c
@@ -1325,9 +1325,13 @@ void init_default_instance()
        defproxy.defsrv.check.inter = DEF_CHKINTR;
        defproxy.defsrv.check.fastinter = 0;
        defproxy.defsrv.check.downinter = 0;
+       defproxy.defsrv.agent.inter = DEF_CHKINTR;
+       defproxy.defsrv.agent.fastinter = 0;
+       defproxy.defsrv.agent.downinter = 0;
        defproxy.defsrv.rise = DEF_RISETIME;
        defproxy.defsrv.fall = DEF_FALLTIME;
        defproxy.defsrv.check.port = 0;
+       defproxy.defsrv.agent.port = 0;
        defproxy.defsrv.maxqueue = 0;
        defproxy.defsrv.minconn = 0;
        defproxy.defsrv.maxconn = 0;
@@ -4135,7 +4139,7 @@ stats_error_parsing:
        else if (!strcmp(args[0], "server") || !strcmp(args[0], "default-server")) {  /* server address */
                int cur_arg;
                short realport = 0;
-               int do_check = 0, defsrv = (*args[0] == 'd');
+               int do_agent = 0, do_check = 0, defsrv = (*args[0] == 'd');
 
                if (!defsrv && curproxy == &defproxy) {
                        Alert("parsing [%s:%d] : '%s' not allowed in 'defaults' section.\n", file, linenum, args[0]);
@@ -4182,6 +4186,7 @@ stats_error_parsing:
                        LIST_INIT(&newsrv->actconns);
                        LIST_INIT(&newsrv->pendconns);
                        do_check = 0;
+                       do_agent = 0;
                        newsrv->state = SRV_RUNNING; /* early server setup */
                        newsrv->last_change = now.tv_sec;
                        newsrv->id = strdup(args[1]);
@@ -4235,11 +4240,16 @@ stats_error_parsing:
                                goto out;
                        }
 
-                       newsrv->check.use_ssl = curproxy->defsrv.check.use_ssl;
+                       newsrv->check.use_ssl   = curproxy->defsrv.check.use_ssl;
                        newsrv->check.port      = curproxy->defsrv.check.port;
                        newsrv->check.inter     = curproxy->defsrv.check.inter;
                        newsrv->check.fastinter = curproxy->defsrv.check.fastinter;
                        newsrv->check.downinter = curproxy->defsrv.check.downinter;
+                       newsrv->agent.use_ssl   = curproxy->defsrv.agent.use_ssl;
+                       newsrv->agent.port      = curproxy->defsrv.agent.port;
+                       newsrv->agent.inter     = curproxy->defsrv.agent.inter;
+                       newsrv->agent.fastinter = curproxy->defsrv.agent.fastinter;
+                       newsrv->agent.downinter = curproxy->defsrv.agent.downinter;
                        newsrv->rise            = curproxy->defsrv.rise;
                        newsrv->fall            = curproxy->defsrv.fall;
                        newsrv->maxqueue        = curproxy->defsrv.maxqueue;
@@ -4261,6 +4271,12 @@ stats_error_parsing:
                        newsrv->check.name      = "Health";
                        newsrv->check.server    = newsrv;
 
+                       newsrv->agent.status    = HCHK_STATUS_INI;
+                       newsrv->agent.health    = newsrv->rise; /* up, but will fall down at first failure */
+                       newsrv->agent.type      = PR_O2_LB_AGENT_CHK;
+                       newsrv->agent.name      = "Agent";
+                       newsrv->agent.server    = newsrv;
+
                        cur_arg = 3;
                } else {
                        newsrv = &curproxy->defsrv;
@@ -4268,7 +4284,30 @@ stats_error_parsing:
                }
 
                while (*args[cur_arg]) {
-                       if (!defsrv && !strcmp(args[cur_arg], "cookie")) {
+                       if (!strcmp(args[cur_arg], "agent-inter")) {
+                               const char *err = parse_time_err(args[cur_arg + 1], &val, TIME_UNIT_MS);
+                               if (err) {
+                                       Alert("parsing [%s:%d] : unexpected character '%c' in 'agent-inter' argument of server %s.\n",
+                                             file, linenum, *err, newsrv->id);
+                                       err_code |= ERR_ALERT | ERR_FATAL;
+                                       goto out;
+                               }
+                               if (val <= 0) {
+                                       Alert("parsing [%s:%d]: invalid value %d for argument '%s' of server %s.\n",
+                                             file, linenum, val, args[cur_arg], newsrv->id);
+                                       err_code |= ERR_ALERT | ERR_FATAL;
+                                       goto out;
+                               }
+                               newsrv->agent.inter = val;
+                               cur_arg += 2;
+                       }
+                       else if (!strcmp(args[cur_arg], "agent-port")) {
+                               global.maxsock++;
+                               newsrv->agent.port = atol(args[cur_arg + 1]);
+                               do_agent = 1;
+                               cur_arg += 2;
+                       }
+                       else if (!defsrv && !strcmp(args[cur_arg], "cookie")) {
                                newsrv->cookie = strdup(args[cur_arg + 1]);
                                newsrv->cklen = strlen(args[cur_arg + 1]);
                                cur_arg += 2;
@@ -4296,6 +4335,8 @@ stats_error_parsing:
 
                                if (newsrv->check.health)
                                        newsrv->check.health = newsrv->rise;
+                               if (newsrv->agent.health)
+                                       newsrv->agent.health = newsrv->rise;
                                cur_arg += 2;
                        }
                        else if (!strcmp(args[cur_arg], "fall")) {
@@ -4477,6 +4518,7 @@ stats_error_parsing:
                                newsrv->state |= SRV_MAINTAIN;
                                newsrv->state &= ~SRV_RUNNING;
                                newsrv->check.health = 0;
+                               newsrv->agent.health = 0;
                                cur_arg += 1;
                        }
                        else if (!defsrv && !strcmp(args[cur_arg], "observe")) {
@@ -4876,6 +4918,33 @@ stats_error_parsing:
                        newsrv->state |= SRV_CHECKED;
                }
 
+               if (do_agent && do_check) {
+                       int ret;
+
+                       if (!newsrv->agent.port) {
+                               Alert("parsing [%s:%d] : server %s has agent-inter without agent-port.\n",
+                                     file, linenum, newsrv->id);
+                               err_code |= ERR_ALERT | ERR_FATAL;
+                               goto out;
+                       }
+
+                       if ((newsrv->proxy->options2 & PR_O2_CHK_ANY) == PR_O2_LB_AGENT_CHK) {
+                               Alert("parsing [%s:%d] : server %s has agent-inter or agent-port but check type is lb-agent-chk.\n",
+                                     file, linenum, newsrv->id);
+                               err_code |= ERR_ALERT | ERR_FATAL;
+                               goto out;
+                       }
+
+                       if (!newsrv->agent.inter)
+                               newsrv->agent.inter = newsrv->check.inter;
+
+                       ret = init_check(newsrv, &newsrv->agent, file, linenum);
+                       if (ret) {
+                               err_code |= ret;
+                               goto out;
+                       }
+               }
+
                if (!defsrv) {
                        if (newsrv->state & SRV_BACKUP)
                                curproxy->srv_bck++;
@@ -6765,6 +6834,7 @@ out_uri_auth_compat:
                                        newsrv->state |= SRV_MAINTAIN;
                                        newsrv->state &= ~SRV_RUNNING;
                                        newsrv->check.health = 0;
+                                       newsrv->agent.health = 0;
                                }
 
                                newsrv->track = srv;
diff --git a/src/checks.c b/src/checks.c
index 9642666..f2b11c5 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -398,7 +398,7 @@ void set_server_down(struct check *check)
                check->health = s->rise;
        }
 
-       if (check->health == s->rise || s->track) {
+       if ((s->state & SRV_RUNNING && check->health == s->rise) || s->track) {
                int srv_was_paused = s->state & SRV_GOINGDOWN;
                int prev_srv_count = s->proxy->srv_bck + s->proxy->srv_act;
 
@@ -465,7 +465,8 @@ void set_server_up(struct check *check) {
                check->health = s->rise;
        }
 
-       if (check->health == s->rise || s->track) {
+       if ((s->check.health >= s->rise && s->agent.health >= s->rise &&
+            check->health == s->rise) || s->track) {
                if (s->proxy->srv_bck == 0 && s->proxy->srv_act == 0) {
                        if (s->proxy->last_change < now.tv_sec)         // ignore negative times
                                s->proxy->down_time += now.tv_sec - s->proxy->last_change;
@@ -1331,8 +1332,12 @@ static struct task *process_chk(struct task *t)
                check->bo->p = check->bo->data;
                check->bo->o = 0;
 
-               /* prepare the check buffer */
-               if (s->proxy->options2 & PR_O2_CHK_ANY) {
+
+              /* prepare the check buffer
+               * This should not be used if check is the secondary agent check
+               * of a server as s->proxy->check_req will relate to the
+               * configuration of the primary check */
+              if (check->type && check != &s->agent) {
                        bo_putblk(check->bo, s->proxy->check_req, s->proxy->check_len);
 
                        /* we want to check if this host replies to HTTP or SSLv3 requests
@@ -1615,8 +1620,15 @@ int start_checks() {
                        if (!(s->state & SRV_CHECKED))
                                continue;
 
+                       /* A task for the primary check */
                        if (start_check_task(&s->check, mininter, nbcheck, srvpos++))
                                return -1;
+
+                       /* A task for a secondary agent check */
+                       if (s->agent.type) {
+                               if (start_check_task(&s->agent, mininter, nbcheck, srvpos++))
+                                       return -1;
+                       }
                }
        }
        return 0;
diff --git a/src/haproxy.c b/src/haproxy.c
index ec9f513..76341a3 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -1119,6 +1119,10 @@ void deinit(void)
                                task_delete(s->check.task);
                                task_free(s->check.task);
                        }
+                       if (s->agent.task) {
+                               task_delete(s->agent.task);
+                               task_free(s->agent.task);
+                       }
 
                        if (s->warmup) {
                                task_delete(s->warmup);
@@ -1129,6 +1133,8 @@ void deinit(void)
                        free(s->cookie);
                        free(s->check.bi);
                        free(s->check.bo);
+                       free(s->agent.bi);
+                       free(s->agent.bo);
                        free(s);
                        s = s_next;
                }/* end while(s) */
-- 
1.7.10.4

