Hi Alex,
On Sat, Mar 26, 2022 at 08:30:56PM +0100, Aleksandar Lazic wrote:
> I fully agree that "using DNS for service discovery is a disaster", even
> though DNS was the easiest way to do service discovery in the past.
>
> A possible solution could be a registration API in HAProxy which uses the
> dynamic server feature so that servers can add themselves to an HAProxy
> backend/listener.
That's one of the conclusions we've started to come to. But in addition
we're seeing requirements about being able to restart in order to apply
certain changes, and at this point it appears clear that this needs to be
managed by an external process. Given that the existing dataplane API
already deals with all of that, it makes more sense to distribute the
effort this way:
- make haproxy's API more friendly to the dataplane API
- make the dataplane API able to communicate with fast-moving
registries.
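For reference, the registration part already maps quite well onto the
dynamic server commands of the runtime API. A rough sketch (the backend and
server names and the address below are made up, and the exact syntax may
vary slightly between versions):

    $ echo "add server be_app/app3 192.0.2.30:8080" | \
          socat stdio /var/run/haproxy.sock
    $ echo "enable server be_app/app3" | \
          socat stdio /var/run/haproxy.sock

A registration API would essentially let a server (or the dataplane API on
its behalf) trigger the equivalent of these two steps by itself.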
> There should be a shared secret to protect the HAProxy API against attacks,
> and it should only be usable over TLS.
In fact I would like us to have a new applet ("service" as exposed in the
config) for this, which could either be called via "use-service blah", or
be used by default on the existing CLI when HTTP is detected there. The
current CLI's language has zero intersection with HTTP so it should be
easy to let it adapt by itself. That would be cool because the CLI already
supports permissions etc., and it makes sense that the same set of controls
and permissions is used to perform the same actions using two different
languages.
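To give a rough idea of what I mean, exposing such an applet could look
like this in the config ("config-api" is a purely hypothetical service
name, only the use-service mechanism exists today):

    frontend api
        mode http
        bind 127.0.0.1:8443 ssl crt /etc/haproxy/api.pem
        # hypothetical name for the new applet
        http-request use-service config-api

while the same applet would answer on the existing stats socket whenever
an HTTP request is detected there, reusing the CLI's access levels.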
> I would suggest JSON as it is more or less the standard for API interaction.
> Another benefit of JSON is that no additional CLI would need to be maintained
> just for that feature.
We came to that conclusion as well. The current CLI format is a big problem
to deal with for external components. For example, Rémi recently added some
commands to show OCSP responses, and we noticed by accident that there were
line breaks in them, coming straight from the openssl output. So he had to
reprocess openssl's output to eliminate them, because on the current CLI
they act as delimiters. The CLI was designed for humans and is best used
with "socat readline /path/socket". A program ought not to have to read
messages made for humans, nor deal with such a syntax.
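As a small illustration of the direction, a few commands already offer a
typed output precisely for this reason, e.g. (socket path is just an
example):

    $ echo "show info json" | socat stdio /var/run/haproxy.sock
    $ echo "show stat json" | socat stdio /var/run/haproxy.sock

A programmatic API would generalize that, so external components never have
to parse the human-oriented format at all.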
> I'm not sure if it's known that Envoy uses the "xDS REST and gRPC protocol"
> for their endpoint config. I'm not sure if "xDS REST" brings any benefit to
> HAProxy, but maybe we can get some ideas about how the problem is solved there.
>
> https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol
>
> The terminology helps to understand some of the xDS parts:
> https://www.envoyproxy.io/docs/envoy/latest/intro/life_of_a_request#terminology
It's different but covers more aspects (frontends, rules, etc). It's yet
another reason for placing that in an external component that would speak
natively with haproxy. This way more protocols could be adopted with less
effort (and sometimes libs are also provided in high-level languages).
> If we zoom out a little from "add backend servers to HAProxy" and think more
> like "add a bunch of backends to a cluster of HAProxy instances", we could
> consider using something like the Raft protocol ( https://raft.github.io/ ).
Don't know, maybe. I had never heard about it before.
> Because most companies out there run not just one HAProxy instance but at
> least two, a solution is required which can work with more than one HAProxy
> instance.
For sure! That's by the way another big problem posed by DNS: you cannot
even keep your LBs consistent because they all receive different responses!
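For example, two LBs resolving the same name independently will easily see
the records in a different order, or even different subsets when responses
are truncated or limited, and they refresh at different moments. Roughly
(addresses made up):

    lb1$ dig +short app.internal.example
    10.0.0.1
    10.0.0.2
    10.0.0.3

    lb2$ dig +short app.internal.example
    10.0.0.3
    10.0.0.1
    10.0.0.2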
> Adding the Raft protocol would of course increase the complexity of HAProxy,
> but it would offer the handling of join/removal of backends, and the dynamic
> server feature could then be used in HAProxy to add the new backend to the
> backend section.
Well, be careful, it's important to think in terms of layers. HAProxy is a
dataplane which deals with its traffic and its own servers. It doesn't deal
with other haproxy nodes. However it totally makes sense to stack layers on
top of that to control multiple haproxy nodes.
> The benefit from my point of view is to have an underlying algorithm which
> offers consensus-based handling of the join/removal of servers/endpoints.
One of the problems with performing such an approach at too low a layer is
that at some point you have to speak to one node and hope that it spreads
the info to the other ones. That's bad in terms of high availability because
it means that you trust one node a bit too much at one critical instant.
Also there are plenty of multi-site architectures in which some central
management components have access to all LBs but LBs cannot see each
other, or only within the same LAN+site.
> Maybe the peers protocol could also be used for that part, as it is already
> part of HAProxy.
That's exactly the type of thing we wanted to do long ago and that I'm now
convinced we must *not* do, precisely for the reasons above. The peers
protocol must definitely improve to be more efficient in terms of network
and CPU processing for the much richer messages it transports nowadays, but
it should remain limited to sharing *activity* information, and in no way
*config* that one node would appear as authoritative on.
Really, when you see all the mess that the DNS stuff has revealed for those
trying to put that in place in multi-LB clusters, you quickly figure that
no single node should dictate anything to others. That's a role for an
upper layer component. And I suspect that one reason we've been late in
following the modern approaches to architectures is that we used to have
a lot of features inside that didn't require an external controller, so
when others had to develop new features in their controller, we only had
to slightly extend what we already had (DNS, support remapping of server
IDs over peers protocol, etc). Except that at some point it appeared clear
that external controllers can also force a fleet of LBs to act in a
consistent manner, while all the intelligence you put in each of them
will never suffice to make them autonomously do that without such a
controller. So when other components were developing from scratch some
stuff we already had, at the same time we didn't immediately see that it
also offered them a better opportunity for horizontal scalability and that
this would become the norm over time. That's why we must not persist too
long down this dead end with the DNS stuff.
> That's my 2c, I'm just brainstorming some options which come to my mind. :-)
Thanks for sharing and for the Raft link.
Willy