Re: [PATCH] [RFC] Decrease server health based on http responses / events

Krzysztof Olędzki Wed, 09 Dec 2009 04:25:01 -0800

On 2009-12-08 22:37, Willy Tarreau wrote:

Hi Krzysztof,

Hi Willy,

it was fortunate that you reminded me about this mail because
I had lost it in the middle of a few others.

No problem. Like I told you in the private mail - it is doubtful I wouldhave been able to get to it earlier, anyway.

On Sun, Oct 25, 2009 at 01:35:35AM +0200, Krzysztof Piotr Oledzki wrote:

Subject: [RFC] Decrease server health based on http responses / events

This RFC quality patch implements decreasing server health based on
observing communication between HAProxy and servers.

I have had a working patch for this for a long time, however I needed to
rewrite nearly everything to remove hardcoded values, add more modes and
to port it into 1.4. So after the rework there is nearly nothing left from
the old code. :| In the current status the code is expected to work but it
definitely needs more testing.

OK.

BTW: I'm not very happy with names of both functions and parameters,
If you have a better idea please don't hesitate to propose it. ;)


well, first, s/halth/health/g :-)


Eh. ;) I feel ashamed.

TODO: documentation, comments, pure tcp support.


Indeed, comments at least on the enums and struct members would make
the review *much* easier. Even small abbreviated ones :-/

I assumed that everithing should be clear but it seems it is true onlyfor the author of this code. :| Sorry for that.

diff --git a/include/common/defaults.h b/include/common/defaults.h
index b0aee86..ae2f65c 100644
--- a/include/common/defaults.h
+++ b/include/common/defaults.h
@@ -120,6 +120,9 @@
 #define DEF_CHECK_REQ   "OPTIONS / HTTP/1.0\r\n\r\n"
 #define DEF_SMTP_CHECK_REQ   "HELO localhost\r\n"

+#define DEF_HANA_ONERR HANA_ONERR_FAILCHK

+#define DEF_CELIMIT    10
+
 // X-Forwarded-For header default
 #define DEF_XFORWARDFOR_HDR    "X-Forwarded-For"

diff --git a/include/proto/checks.h b/include/proto/checks.h

index bd70164..2d16976 100644
--- a/include/proto/checks.h
+++ b/include/proto/checks.h
@@ -29,6 +29,7 @@ const char *get_check_status_description(short check_status);
 const char *get_check_status_info(short check_status);
 struct task *process_chk(struct task *t);
 int start_checks();
+void halth_analyze(struct server *s, short status);

#endif /* _PROTO_CHECKS_H */diff --git a/include/types/checks.h b/include/types/checks.h

index 1b04608..3690aa5 100644
--- a/include/types/checks.h
+++ b/include/types/checks.h
@@ -18,6 +18,9 @@ enum {

/* Below we have finished checks */

        HCHK_STATUS_CHECKED,            /* DUMMY STATUS */
+
+       HCHK_STATUS_HANA,               /* Detected enough consecutive errors */
+
        HCHK_STATUS_SOCKERR,            /* Socket error */

HCHK_STATUS_L4OK, /* L4 check passed, for example tcp connect */

@@ -41,6 +44,39 @@ enum {
        HCHK_STATUS_SIZE
 };

+enum {

+       HANA_UNKNOWN    = 0,
+
+       HANA_TCP_OK,
+
+       HANA_HTTP_OK,
+       HANA_HTTP_STS,
+       HANA_HTTP_HDRRSP,
+       HANA_HTTP_RSP,
+
+       HANA_READ_ERROR,
+       HANA_READ_TIMEOUT,
+       HANA_BROKEN_PIPE,
+
+       HANA_SIZE
+};
+
+enum {
+       HANA_ONERR_UNKNOWN      = 0,
+
+       HANA_ONERR_FASTINTER,
+       HANA_ONERR_FAILCHK,
+       HANA_ONERR_SUDDTH,
+       HANA_ONERR_MARKDWN,


After reading the code, I'm not sure I got the difference between
the last two modes (sudden death and mark down).


There are four modes:

 - fastinter: force fastinter

 - failchk: simlate a failed check -> force fastinter

 - suddth (sudden death): simulate a pre-fatal failed health check, one
more failed check will marke a server down

 - markdwn: mark a server down, immediately

+};
+
+enum {
+       HANA_OBS_NONE           = 0,
+
+       HANA_OBS_EVENTS,
+       HANA_OBS_HTTP_RSPS,
+};
+
 struct check_status {
        short result;                   /* one of SRV_CHK_* */
        char *info;                     /* human readable short info */
diff --git a/include/types/server.h b/include/types/server.h
index b3fe83d..b163190 100644
--- a/include/types/server.h
+++ b/include/types/server.h
@@ -115,7 +115,10 @@ struct server {
        struct sockaddr_in check_addr;          /* the address to check, if different 
from <addr> */
        short check_port;                       /* the port to use for the 
health checks */
        int health;                             /* 0->rise-1 = bad; 
rise->rise+fall-1 = good */
+       int consecutive_errors;                 /* */
        int rise, fall;                         /* time in iterations */
+       int consecutive_errors_limit;           /* */
+       short observe, onerror;                 /* */
        int inter, fastinter, downinter;        /* checks: time in milliseconds 
*/
        int slowstart;                          /* slowstart time in seconds 
(ms in the conf) */
        int result;                             /* health-check result : 
SRV_CHK_* */
@@ -137,7 +140,7 @@ struct server {
        unsigned down_time;                     /* total time the server was 
down */
        time_t last_change;                     /* last time, when the state 
was changed */
        struct timeval check_start;             /* last health check start time 
*/
-       unsigned long check_duration;           /* time in ms took to finish 
last health check */
+       long check_duration;                    /* time in ms took to finish 
last health check */
        short check_status, check_code;         /* check result, check code */
        char check_desc[HCHK_DESC_LEN];         /* healt check descritpion */

diff --git a/src/cfgparse.c b/src/cfgparse.c

index 428d7b9..889fc74 100644
--- a/src/cfgparse.c
+++ b/src/cfgparse.c
@@ -2541,6 +2541,8 @@ int cfg_parse_listen(const char *file, int linenum, char 
**args, int kwm)
                newsrv->uweight = newsrv->iweight = 1;
                newsrv->maxqueue = 0;
                newsrv->slowstart = 0;
+               newsrv->onerror = DEF_HANA_ONERR;
+               newsrv->consecutive_errors_limit = DEF_CELIMIT;

cur_arg = 3;

                while (*args[cur_arg]) {
@@ -2746,6 +2748,66 @@ int cfg_parse_listen(const char *file, int linenum, char 
**args, int kwm)
                                do_check = 1;
                                cur_arg += 1;
                        }
+                       else if (!strcmp(args[cur_arg], "observe")) {
+                               if (!strcmp(args[cur_arg + 1], "none"))
+                                       newsrv->observe = HANA_OBS_NONE;
+                               else if (!strcmp(args[cur_arg + 1], "events"))
+                                       newsrv->observe = HANA_OBS_EVENTS;


What are those "events" supposed to check for ? I've not found them anywhere 
else.

Ineed. It is for pure TCP, where we can only track timeouts, resets,etc. However, I'm looking for a good place where to attach those checks,and for a better name - maybe "l3events"?

+                               else if (!strcmp(args[cur_arg + 1], 
"httprsps")) {


Please avoid compacted names like this. I'm trying to get rid of all of them
in the config because it's impossible for people to remember what letters
were omitted. Among the suggestions which come to mind : "http", "response",
"headers", "http-response", ... I know it sometimes makes longer lines but
not everyone is used to run "strings" on a binary to find acceptable keywords,
nor to wait for parsing errors in order to fix the names :-)


Sure. I'm going to choose "http-response".

If you need composed words, please use the minus sign as a separator.

OK.

+                                       if (curproxy->mode != PR_MODE_HTTP) {
+                                               Alert("parsing [%s:%d]: '%s' expects 
one of 'none', "
+                                                       "'events', 'httprsps' but 
get '%s'\n",
+                                                       file, linenum, 
args[cur_arg], args[cur_arg + 1]);
+                                               err_code |= ERR_ALERT;
+                                       }
+                                       newsrv->observe = HANA_OBS_HTTP_RSPS;
+                               }
+                               else {
+                                       Alert("parsing [%s:%d]: '%s' expects one of 
'none', "
+                                               "'events', 'httprsps' but get 
'%s'\n",
+                                               file, linenum, args[cur_arg], 
args[cur_arg + 1]);
+                                       err_code |= ERR_ALERT | ERR_FATAL;
+                                       goto out;
+                               }
+
+                               cur_arg += 2;
+                       }
+                       else if (!strcmp(args[cur_arg], "onerror")) {


is "on-error" OK for you ?


Ack.

+                               if (!strcmp(args[cur_arg + 1], "fastinter"))
+                                       newsrv->onerror = HANA_ONERR_FASTINTER;
+                               else if (!strcmp(args[cur_arg + 1], 
"failcheck"))
+                                       newsrv->onerror = HANA_ONERR_FAILCHK;
+                               else if (!strcmp(args[cur_arg + 1], 
"suddendeath"))
+                                       newsrv->onerror = HANA_ONERR_SUDDTH;
+                               else if (!strcmp(args[cur_arg + 1], "markdown"))
+                                       newsrv->onerror = HANA_ONERR_MARKDWN;
+                               else {
+                                       Alert("parsing [%s:%d]: '%s' expects one of 
'fastinter', "
+                                               "'failcheck', 'suddendeath' or 
'godown' but get '%s'\n",


Warning, you wrote "godown" here and "markdown" above. Maybe simply "down" 
would make it.


Ouch.

+                                               file, linenum, args[cur_arg], 
args[cur_arg + 1]);
+                                       err_code |= ERR_ALERT | ERR_FATAL;
+                                       goto out;
+                               }
+
+                               cur_arg += 2;
+                       }
+                       else if (!strcmp(args[cur_arg], "allowed_error")) {


maybe "error-limit" ?


Indeed, much better.

+                               if (!*args[cur_arg + 1]) {
+                                       Alert("parsing [%s:%d]: '%s' expects an 
integer argument.\n",
+                                               file, linenum, args[cur_arg]);
+                                       err_code |= ERR_ALERT | ERR_FATAL;
+                                       goto out;
+                               }
+
+                               newsrv->consecutive_errors_limit = 
atoi(args[cur_arg + 1]);
+
+                               if (newsrv->consecutive_errors_limit <= 0) {
+                                       Alert("parsing [%s:%d]: %s has to be > 
0.\n",
+                                               file, linenum, args[cur_arg]);
+                                       err_code |= ERR_ALERT | ERR_FATAL;
+                                       goto out;
+                               }
+                       }


I'm just wondering about those options. Are we supposed to use only some of them
without the other ones ?


Yes, because each option has its default value, so both:
 observe http-response onerror failcheck errors-limit 10
and
 observe http-response
are identical - after 10 consecutive errors haproxy simulates a filed check.

I mean, maybe we could have sort of an error-react prefix
with its few parameters afterwards. Maybe something in that spirit :

   error-react to <event> by <action> after <limit>

It's just an idea, not necessarily something to follow.


Something like "error-react to http-response by failcheck after 10"?

                        else if (!strcmp(args[cur_arg], "source")) {  /* 
address to which we bind when connecting */
                                int port_low, port_high;
                                if (!*args[cur_arg + 1]) {
diff --git a/src/checks.c b/src/checks.c
index 0e9aca5..61bac4c 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -52,6 +52,8 @@ const struct check_status check_statuses[HCHK_STATUS_SIZE] = {
        [HCHK_STATUS_INI]       = { SRV_CHK_UNKNOWN,                   "INI",     
"Initializing" },
        [HCHK_STATUS_START]     = { /* SPECIAL STATUS*/ },

+ [HCHK_STATUS_HANA] = { SRV_CHK_ERROR, "HANA", "Health analyze" },

+
        [HCHK_STATUS_SOCKERR]   = { SRV_CHK_ERROR,                     "SOCKERR", 
"Socket error" },

[HCHK_STATUS_L4OK] = { SRV_CHK_RUNNING, "L4OK", "Layer4 check passed" },

@@ -72,6 +74,21 @@ const struct check_status check_statuses[HCHK_STATUS_SIZE] = 
{
        [HCHK_STATUS_L7STS]     = { SRV_CHK_ERROR,                     "L7STS",   
"Layer7 wrong status" },
 };

+const char *analyze_statuses[HANA_SIZE] = {

+       [HANA_UNKNOWN]          = "Unknown",
+
+       [HANA_TCP_OK]           = "Correct tcp response",
+
+       [HANA_HTTP_OK]          = "Correct http response",
+       [HANA_HTTP_STS]         = "Wrong http response",
+       [HANA_HTTP_HDRRSP]      = "Invalid http response (headers)",
+       [HANA_HTTP_RSP]         = "Invalid http response",
+
+       [HANA_READ_ERROR]       = "Read error",
+       [HANA_READ_TIMEOUT]     = "Read timeout",
+       [HANA_BROKEN_PIPE]      = "Close from server",
+};
+
 /*
  * Convert check_status code to description
  */
@@ -108,6 +125,21 @@ const char *get_check_status_info(short check_status) {
                return check_statuses[HCHK_STATUS_UNKNOWN].info;
 }

+const char *get_analyze_status(short analyze_status) {

+
+       const char *desc;
+
+       if (analyze_status < HANA_SIZE)
+               desc = analyze_statuses[analyze_status];
+       else
+               desc = NULL;
+
+       if (desc && *desc)
+               return desc;
+       else
+               return analyze_statuses[HANA_UNKNOWN];
+}
+
 #define SSP_O_VIA      0x0001
 #define SSP_O_HCHK     0x0002
 #define SSP_O_STATUS   0x0004
@@ -136,7 +168,8 @@ static void server_status_printf(struct chunk *msg, struct 
server *s, unsigned o
                        chunk_printf(msg, "\"");
                }

- chunk_printf(msg, ", check duration: %lums", s->check_duration);

+               if (s->check_duration >= 0)
+                       chunk_printf(msg, ", check duration: %ldms", 
s->check_duration);
        }

if (options & SSP_O_STATUS) {

@@ -184,13 +217,14 @@ static void set_server_check_status(struct server *s, 
short status, char *desc)

s->check_status = status;

        if (check_statuses[status].result)
-               s->result |= check_statuses[status].result;
+               s->result = check_statuses[status].result;

        if (!tv_iszero(&s->check_start)) {
                /* set_server_check_status() may be called more than once */
                s->check_duration = tv_ms_elapsed(&s->check_start, &now);
                tv_zero(&s->check_start);
-       }
+       } else
+               s->check_duration = -1;

Are these three isolated changes on purpose, are they a mistake, or are they a

fix for something we want to backport or at least merge separately ? I'm asking
because at first glance it seems imbalanced with other changes. There is a
fourth one further about check_duration.

These three isolated changes are on purpose, bacuase now we can callserver_status_printf() when we simulate a failed halth check. Beforethat we used to first start a check and set s->check_start and s->resultand then we called server_status_printf(), but it is no longer true.

        if (s->proxy->options2 & PR_O2_LOGHCHKS &&
        (((s->health != 0) && (s->result & SRV_CHK_ERROR)) ||
@@ -229,6 +263,10 @@ static void set_server_check_status(struct server *s, 
short status, char *desc)
                                if (health >= rise)
                                        health = rise + fall - 1; /* OK now */
                        }
+
+                       /* clear consecutive_errors if observing is enabled */
+                       if (s->onerror)
+                               s->consecutive_errors = 0;
                }
                /* FIXME end: calculate local version of the 
health/rise/fall/state */

@@ -505,6 +543,102 @@ static void set_server_enabled(struct server *s) {

                        set_server_enabled(srv);
 }

+void halth_analyze(struct server *s, short status) {


Maybe we should call it "health_adjust" or "health_recalc" or "health_compute" ?


Sure, "health_adjust" look fine to me.

+
+       int failed = -1;
+       int expire;
+
+       /* return if observing or healt check is not enabled */
+       if (!s->observe || !s->check)
+               return;
+
+       switch (status) {
+               case HANA_TCP_OK:
+               case HANA_HTTP_OK:
+                       failed = 0;
+                       break;
+
+               case HANA_HTTP_STS:
+                       if (s->observe == HANA_OBS_HTTP_RSPS)
+                               failed = 1;
+                       else
+                               failed = 0;
+                       break;
+
+               case HANA_HTTP_HDRRSP:
+               case HANA_HTTP_RSP:
+               case HANA_READ_ERROR:
+               case HANA_READ_TIMEOUT:
+               case HANA_BROKEN_PIPE:
+                       failed = 1;
+
+               default:
+                       break;
+       }
+
+       /* unknown status, ignore it */
+       if (failed < 0)
+               return;
+
+       if (!failed) {
+               /* good, clear consecutive_errors */
+               s->consecutive_errors = 0;
+               return;
+       }
+
+       s->consecutive_errors++;
+
+       if (s->consecutive_errors < s->consecutive_errors_limit)
+               return;
+
+       sprintf(trash, "Detected %d consecutive errors, last one was: %s",
+               s->consecutive_errors, get_analyze_status(status));
+
+       switch (s->onerror) {
+               case HANA_ONERR_FASTINTER:
+               /* force fastinter - nothing to do here as all modes force it */


Does this mean that we're now forced to at least switch to fast
inter in case of error or can we still use the current behaviour ?


Yes, by simply not enabling the functionality. By default it is not enabled.

+                       break;
+
+               case HANA_ONERR_SUDDTH:
+               /* simulate a pre-fatal failed health check */
+                       if (s->health > s->rise)
+                               s->health = s->rise + 1;
+
+                       /* no break - fall through */
+
+               case HANA_ONERR_FAILCHK:
+               /* simulate a failed health check */
+                       set_server_check_status(s, HCHK_STATUS_HANA, trash);
+
+                       if (s->health > s->rise) {
+                               s->health--; /* still good */
+                               s->counters.failed_checks++;
+                       }
+                       else
+                               set_server_down(s);
+
+                       break;
+
+               case HANA_ONERR_MARKDWN:
+               /* mark server down */
+                       s->health = s->rise;
+                       set_server_check_status(s, HCHK_STATUS_HANA, trash);
+                       set_server_down(s);
+
+                       break;
+
+               default:
+                       /* write warning? */
+                       break;
+       }
+
+       s->consecutive_errors = 0;
+
+       expire = tick_add(now_ms, MS_TO_TICKS(s->fastinter));
+       if (s->check->expire > expire)
+               s->check->expire = expire;
+}
+
 /*
  * This function is used only for server health-checks. It handles
  * the connection acknowledgement. If the proxy requires L7 health-checks,
diff --git a/src/dumpstats.c b/src/dumpstats.c
index 522c3f8..d23602a 100644
--- a/src/dumpstats.c
+++ b/src/dumpstats.c
@@ -1545,7 +1545,7 @@ int stats_dump_proxy(struct session *s, struct proxy *px, 
struct uri_auth *uri)
                                        if (sv->check_status >= 
HCHK_STATUS_L57DATA)
                                                chunk_printf(&msg, "/%d", 
sv->check_code);

- if (sv->check_status >= HCHK_STATUS_CHECKED)

+                                       if (sv->check_status >= HCHK_STATUS_CHECKED 
&& sv->check_duration >= 0)


it's the last one I was talking about.


It is required only now.

                                                chunk_printf(&msg, " in %lums", 
sv->check_duration);
                                } else {
                                        chunk_printf(&msg, "</td><td>");
diff --git a/src/proto_http.c b/src/proto_http.c
index 2300738..4d8f09d 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -41,6 +41,7 @@
 #include <proto/acl.h>
 #include <proto/backend.h>
 #include <proto/buffers.h>
+#include <proto/checks.h>
 #include <proto/client.h>
 #include <proto/dumpstats.h>
 #include <proto/fd.h>
@@ -2857,8 +2858,10 @@ int process_response(struct session *t)

buffer_shutr_now(rep);

                                buffer_shutw_now(req);
-                               if (t->srv)
+                               if (t->srv) {
                                        t->srv->counters.failed_resp++;
+                                       halth_analyze(t->srv, HANA_HTTP_HDRRSP);
+                               }
                                t->be->counters.failed_resp++;
                                rep->analysers = 0;
                                txn->status = 502;
@@ -2880,8 +2883,10 @@ int process_response(struct session *t)
                                        
http_capture_bad_message(&t->be->invalid_rep, t, rep, msg, t->fe);
                                buffer_shutr_now(rep);
                                buffer_shutw_now(req);
-                               if (t->srv)
+                               if (t->srv) {
                                        t->srv->counters.failed_resp++;
+                                       halth_analyze(t->srv, HANA_READ_ERROR);
+                               }
                                t->be->counters.failed_resp++;
                                rep->analysers = 0;
                                txn->status = 502;
@@ -2898,8 +2903,10 @@ int process_response(struct session *t)
                                        
http_capture_bad_message(&t->be->invalid_rep, t, rep, msg, t->fe);
                                buffer_shutr_now(rep);
                                buffer_shutw_now(req);
-                               if (t->srv)
+                               if (t->srv) {
                                        t->srv->counters.failed_resp++;
+                                       halth_analyze(t->srv, 
HANA_READ_TIMEOUT);
+                               }
                                t->be->counters.failed_resp++;
                                rep->analysers = 0;
                                txn->status = 504;
@@ -2915,8 +2922,10 @@ int process_response(struct session *t)
                                if (msg->err_pos >= 0)
                                        
http_capture_bad_message(&t->be->invalid_rep, t, rep, msg, t->fe);
                                buffer_shutw_now(req);
-                               if (t->srv)
+                               if (t->srv) {
                                        t->srv->counters.failed_resp++;
+                                       halth_analyze(t->srv, HANA_BROKEN_PIPE);
+                               }
                                t->be->counters.failed_resp++;
                                rep->analysers = 0;
                                txn->status = 502;
@@ -2971,6 +2980,20 @@ int process_response(struct session *t)
                if (n < 1 || n > 5)
                        n = 0;

+ switch(n) {

+                       case 1:
+                       case 2:
+                       case 3:
+                       case 4:
+                               halth_analyze(t->srv, HANA_HTTP_OK);
+                               break;
+
+                       case 5:
+                       default:
+                               halth_analyze(t->srv, HANA_HTTP_STS);
+                               break;
+               }
+
                t->srv->counters.p.http.rsp[n]++;
                t->be->counters.p.http.rsp[n]++;

@@ -3027,8 +3050,10 @@ int process_response(struct session *t)

                        if (rule_set->rsp_exp != NULL) {
                                if (apply_filters_to_response(t, rep, 
rule_set->rsp_exp) < 0) {
                                return_bad_resp:
-                                       if (t->srv)
+                                       if (t->srv) {
                                                t->srv->counters.failed_resp++;
+                                               halth_analyze(t->srv, 
HANA_HTTP_RSP);
+                                       }
                                        cur_proxy->counters.failed_resp++;
                                return_srv_prx_502:
                                        buffer_shutr_now(rep);


OK overall it's pretty good in my opinion. It looks like you're
trying to handle a large number of possibilities, which is fine

because everyone has different

I'm afraid that after so long without a review, it might take you
some time to get back into the code. I'm interesting in merging it
early so that users can challenge it with their unpredictable
imagination. I'm confident it should satisfy most usages. At worst

we might discover corner cases, but that's not dramatic, we'restill in development.

Thanks, especially for your careful review. I'm going to address yourremarks and release a new patch. I hope I'll be able to do it today. :)


Best regards,

                        Krzysztof Olędzki

Re: [PATCH] [RFC] Decrease server health based on http responses / events

Reply via email to