Hi Willy.

On 02-01-2020 10:49, Willy Tarreau wrote:
Hi Aleks,

On Thu, Dec 26, 2019 at 12:11:31PM +0100, Aleksandar Lazic wrote:
>   - rewrite *all* health checks to internally use tcp-check sequences only
>     so that we don't have to maintain this horribly broken mess anymore and
>     can more easily implement new ones ;

Well, I think we will also need udp-check, for example for DNS, QUIC and
some other protocols.

These would then be DNS, QUIC, ping or whatever. There's no such thing
as a UDP check given that by default UDP doesn't respond, so there's no
way to know whether a generic service works or not. You cannot even
count on ICMP port unreachable, which can be rate-limited or filtered
out. In fact TCP was the particular case here since it's possible to at
least check that a port is bound without knowing what protocol is
spoken on top of it.
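
For illustration, that TCP special case is just a plain layer-4 check:
it only verifies that the port accepts a connection, nothing about the
protocol spoken on top of it (addresses below are made up):

   backend dns_tcp
       mode tcp
       # plain layer-4 check: succeeds as soon as the TCP connection is
       # accepted, regardless of what protocol the server speaks
       server ns1 192.0.2.53:53 check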

Oh yes, you are right. There should be protocol-specific checks, as we
already have for some protocols today.
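
For example, one of the protocol-specific checks that already exists
today (the user name is just an example):

   backend db
       mode tcp
       # protocol-aware check: completes a MySQL handshake instead of a
       # bare connect; "haproxy" is just an example user
       option mysql-check user haproxy
       server db1 192.0.2.10:3306 check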

>   - implement centralized check definitions that can be reused in multiple
>     backends. E.g. tcp-check sequences or htx check sequences should only
>     be defined once. This implies that some of the headers will likely need
>     to support variables so that a backend may define a host header for
>     example and the checks use it.

But we already have such a possibility IMHO: the named defaults section,
don't we?

tcp-checks cannot be put into defaults sections. Also, even with defaults
sections, it makes sense to be able to define a few favorite checks that
are used a lot. With this said, we've already talked about named defaults
that frontends/backends could explicitly designate. It could be convenient
for example to have:

   defaults tcp
       ...

   defaults http
       ...

   defaults admin
       ...

   frontend foo
       use-defaults http

and so on. This, combined with the ability to put tcp-check sequences in
the defaults sections, could actually address a huge number of
limitations. This could also work to designate which log format to use,
while I was more thinking about a named log profile. This proves that a
bit more thinking is still needed in this area.
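
For instance, assuming tcp-check rules were allowed in a defaults
section and combined with the use-defaults idea above (neither exists
today), that could give:

   defaults redis-check
       mode tcp
       option tcp-check
       # hypothetical: tcp-check rules accepted in a named defaults section
       tcp-check send PING\r\n
       tcp-check expect string +PONG

   backend cache1
       use-defaults redis-check       # hypothetical directive from above
       server r1 192.0.2.20:6379 check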

Okay, I think my example with 'defaults' wasn't the right one. I thought
of having check-app sections, similar to the fcgi-app and cache sections.

check-app tcp01
  ...

check-app tcp02
  ...

check-app http01
  ...

check-app http02
  ...

backend foo
  use-check http01

I think this makes more sense, right?
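
For illustration, such a check-app body could simply carry a send/expect
sequence built from today's tcp-check directives (check-app and
use-check being the hypothetical keywords proposed above):

   check-app smtp01
       # ordinary send/expect sequence, reusing today's tcp-check directives
       tcp-check connect
       tcp-check expect rstring ^220
       tcp-check send QUIT\r\n
       tcp-check expect rstring ^221

   backend mail
       use-check smtp01
       server m1 192.0.2.30:25 check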


It would be nice to be able to reuse the (tcp|http)-(request|response)
features for the checks.

Maybe, maybe not, I really don't know to be honest, because it can
also add a lot of confusion compared to a send/expect model for
example.

Well, when every check-app has its own parts, then maybe it is more
usable.

And you know there will be some distributed setups where the status of a
backend should be shared with different haproxy instances, maybe via the
peers protocol; this will maybe only be possible in the commercial
version ;-).

This was something that I intended many years ago already, even before
we had the peers protocol, and that we've even discussed during the
protocol's design to be sure to leave provisions for such an extension,
and v2 of the protocol with shared connections made a giant step forward
in this direction. I even wrote down on paper a distributed check
algorithm that will take less time to re-design than to figure out on
which sheet of paper it was :-)

;-)

But in the meantime the definition of a "server" has changed a lot. For
the LB algos, it's a position in the list of servers. For stickiness it
used to be a numeric ID (defaulting to the position in the list) and is a
name now. For those who used to rely on SNMP-based monitoring it also
was this numeric ID. For some people it's the server's name (hence the
recent change). Now with service meshes it tends to move to IP:port
where the server's name is irrelevant. Users are able to adapt their
monitoring, stats and reporting to all these conditions, but when it
comes to defining what to exchange over the wire and what is
authoritative it's a completely different story!

In addition, in such new environments, it's common to see a central
service decide what server is up or down and advertise it either via
the API or the DNS, in which case checks just become very basic again.
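
For example, haproxy can already consume such DNS-advertised state
through SRV-based service discovery, in which case the per-server check
stays a basic one (names and addresses below are made up):

   resolvers mydns
       nameserver ns1 192.0.2.2:53

   backend app
       # servers are discovered from SRV records; the check itself can
       # stay a plain layer-4 one
       server-template srv 10 _app._tcp.example.local resolvers mydns check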

Last but not least, in highly distributed environments you really do
not want your neighbor to tell you what server is up when it uses a
different path than you use.

So while I was initially really fond of the idea and wanted to see it
done at least for the beauty of the design, I must confess that I'm
far less impatient nowadays because I predict that it will require lots
of tunables that most users will consider unwelcome. And I really doubt
it will provide that much value in modern environments in the end.

Sounds fairly reasonable, it was just an idea.

Just my two cents,
Willy

Regards
Aleks
