Re: [PATCH] [RFC] Decrease server health based on http responses / events

Willy Tarreau Wed, 09 Dec 2009 23:40:11 -0800

Hi Krzysztof,

On Wed, Dec 09, 2009 at 01:23:57PM +0100, Krzysztof Olędzki wrote:
> There are four modes:
> 
>  - fastinter: force fastinter
> 
>  - failchk: simulate a failed check -> force fastinter

OK so that one should decrease the health each time we get a
series of errors, that's it ?

>  - suddth (sudden death): simulate a pre-fatal failed health check, one
> more failed check will mark a server down
> 
>  - markdwn: mark a server down, immediately

OK, I think that covers a wide range of usage patterns.
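
To make the four modes concrete, here is a minimal sketch of how each one could act on a server's state. It is not taken from the patch: the `Server` class, the integer `health` counter (0 means down) and the `fast_inter` flag are hypothetical simplifications, and only the mode names come from the list above. Whether "suddth" also forces fastinter is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Server:
    # Hypothetical model: number of failed checks left before the server is down.
    health: int = 3
    # True once health checks run at the faster interval.
    fast_inter: bool = False

    @property
    def is_up(self) -> bool:
        return self.health > 0


def fail_check(srv: Server) -> None:
    """Account for one failed health check (real or simulated)."""
    if srv.health > 0:
        srv.health -= 1
    srv.fast_inter = True


def apply_mode(srv: Server, mode: str) -> None:
    """Apply one of the four reactions discussed in the thread."""
    if mode == "fastinter":
        # Only speed up the checks, health is untouched.
        srv.fast_inter = True
    elif mode == "failchk":
        # Simulate a failed check, which also forces fastinter.
        fail_check(srv)
    elif mode == "suddth":
        # Pre-fatal state: one more failed check marks the server down.
        srv.health = min(srv.health, 1)
        srv.fast_inter = True  # assumption, not stated in the thread
    elif mode == "markdwn":
        # Mark the server down immediately.
        srv.health = 0
    else:
        raise ValueError(f"unknown mode: {mode}")
```

With this model, a server in "suddth" is still up, and a single call to `fail_check()` afterwards takes it down.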

(...)
> >What are those "events" supposed to check for ? I've not found them 
> >anywhere else.
> 
> Indeed. It is for pure TCP, where we can only track timeouts, resets, 
> etc. However, I'm looking for a good place where to attach those checks, 
> and for a better name - maybe "l3events"?

"events" by itself might not be the proper name then. For instance, a
timeout is precisely a lack of event. Maybe simply "errors" ? The
other ones are not errors, just plain valid status codes after all.

(...)
> >I'm just wondering about those options. Are we supposed to use only some
> >of them without the other ones ?
> 
> Yes, because each option has its default value, so both:
>  observe http-response onerror failcheck errors-limit 10
> and
>  observe http-response
> are identical - after 10 consecutive errors haproxy simulates a failed check.

OK.
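
The equivalence of the two lines can be illustrated with a short sketch. It is only an illustration of the described defaults, not the patch's parser: `parse_observe` and `ErrorCounter` are hypothetical names, the option spellings are copied from the quoted example, and resetting the counter on a success and after the limit is reached are assumptions drawn from the word "consecutive".

```python
# Defaults as described in the thread: simulate a failed check after 10 errors.
DEFAULTS = {"onerror": "failcheck", "errors-limit": 10}


def parse_observe(line: str) -> dict:
    """Parse 'observe <what> [option value]...' and fill in the defaults."""
    words = line.split()
    if len(words) < 2 or words[0] != "observe":
        raise ValueError("expected: observe <what> [option value]...")
    rest = words[2:]
    if len(rest) % 2:
        raise ValueError("option without a value")
    cfg = {"observe": words[1], **DEFAULTS}
    for key, value in zip(rest[0::2], rest[1::2]):
        if key not in DEFAULTS:
            raise ValueError(f"unknown option: {key}")
        cfg[key] = int(value) if key == "errors-limit" else value
    return cfg


class ErrorCounter:
    """Counts consecutive errors and reports when the limit is reached."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.consecutive = 0

    def record(self, is_error: bool) -> bool:
        """Return True when this event is the limit-th consecutive error."""
        if not is_error:
            self.consecutive = 0  # assumption: a success resets the series
            return False
        self.consecutive += 1
        if self.consecutive >= self.limit:
            self.consecutive = 0  # assumption: start a new series afterwards
            return True
        return False
```

Parsing the long and the short form yields the same dictionary, and a counter built from its `errors-limit` value fires on the tenth error in a row.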

> >I mean, maybe we could have sort of an error-react prefix
> >with its few parameters afterwards. Maybe something in that spirit :
> >
> >   error-react to <event> by <action> after <limit>
> > 
> >It's just an idea, not necessarily something to follow.
> 
> Something like "error-react to http-response by failcheck after 10"?

yes, precisely. Another advantage would be that we could also
allow the statement on regular backend config (even defaults)
when it's supposed to be the same for all servers. It would
then be handled just like the "source" keyword : per-server,
then per-backend.
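
A rough sketch of the proposed syntax and of the "per-server, then per-backend" lookup could look like the following. Nothing here exists in haproxy: `ErrorReact`, `parse_error_react` and `effective_rule` are hypothetical names, and the grammar is just the one-line form suggested above.

```python
from typing import NamedTuple, Optional


class ErrorReact(NamedTuple):
    event: str
    action: str
    limit: int


def parse_error_react(line: str) -> ErrorReact:
    """Parse 'error-react to <event> by <action> after <limit>'."""
    words = line.split()
    if (len(words) != 7 or words[0] != "error-react"
            or words[1::2] != ["to", "by", "after"]):
        raise ValueError("expected: error-react to <event> by <action> after <limit>")
    return ErrorReact(words[2], words[4], int(words[6]))


def effective_rule(server_rule: Optional[ErrorReact],
                   backend_rule: Optional[ErrorReact]) -> Optional[ErrorReact]:
    """Per-server setting wins, otherwise fall back to the backend's."""
    return server_rule if server_rule is not None else backend_rule
```

A server without its own statement then simply inherits whatever the backend (or defaults) section declared.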

(...)
> >Are these three isolated changes on purpose, are they a mistake, or are
> >they a fix for something we want to backport or at least merge separately ?
> >I'm asking because at first glance it seems imbalanced with other changes.
> >There is a fourth one further about check_duration.
> 
> These three isolated changes are on purpose, because now we can call 
> server_status_printf() when we simulate a failed health check. Before 
> that we used to first start a check and set s->check_start and s->result 
> and then we called server_status_printf(), but it is no longer true.

OK.

(...)
> >Does this mean that we're now forced to at least switch to fast
> >inter in case of error or can we still use the current behaviour ?
> 
> Yes, by simply not enabling the functionality. By default it is not enabled.

OK that's fine. You know how attached I am to keeping backwards
compatibility :-)

Regards,
Willy
