On Thu, May 13, 2021 at 10:58 PM Andrew Rodland <andr...@vimeo.com> wrote:

> At Vimeo we have a custom tool since 2015 that monitors the membership of
> clusters of servers, templates out a config with servers assigned to
> backends, and manages reloading haproxy. We're looking into replacing this
> with something a bit more off-the-shelf, and one of the options is
> HAProxy's own DNS service discovery support.
>
> We're also using URI-based load balancing with consistent hashing, and the
> stability of that mapping is important to us. Temporary disagreements while
> membership is changing are inevitable, but we want the portion of the hash
> space that a backend server sees to change as little as possible during its
> lifetime, and for multiple haproxies running the same config, against the
> same cluster, to converge on the same mapping. Our existing tool assigns a
> persistent ID to each server, which is mapped to an "id" option in the
> server line, which has worked quite well.
>
> From what we've seen in testing so far, using "server-template" with DNS
> *doesn't* give us the behavior we want — the assignment of servers to slots
> seems inconsistent, maybe depending on some combination of the order of
> answers in the DNS packet or the order that new server appearances are
> observed by haproxy.
>
> Long story short:
>
> 1. Is my interpretation right?
>
> 2. Would you be open to a patch to change that? I'm thinking of something
> like setting puid from a hash of the SRV name or the A address, "open
> addressing" style, with who goes first in case of a collision determined by
> lexicographic order — but I'm quite open to guidance.
>
> Or should I just look somewhere other than the DNS service discovery?
>
> Thanks,
>
> Andrew Rodland
>
> (Please CC, I'm not on the list.)
>
>
Hi Andrew,

The inconsistent server order you are seeing comes from the DNS server
implementation. HAProxy simply processes the records in the order they
appear in the DNS response and assigns servers in that order.
DNS servers round-robin the answer (AN) records so that clients on the
internet are themselves spread across the servers. The reason is simple:
each client typically uses the first AN record found in the payload, so
rotating the order matters.
So first, if HAProxy (or your internal infrastructure) is the only client
of this DNS server, check whether it has an option to disable
round-robining of the AN records. That should do the trick.
Internally, HAProxy simply takes the first available server slot when a
new AN record is discovered.
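
To illustrate why record order matters, here is a small Python sketch of
the "first available slot" behavior described above. It is a simplified
model for discussion, not HAProxy's actual code: the function name
assign_slots and the sample addresses are made up, and record removal is
not modeled.

```python
def assign_slots(slots, answers):
    """Toy model: put each address not yet present into the first free slot."""
    slots = list(slots)  # work on a copy so the caller's list is untouched
    for addr in answers:
        if addr in slots:
            continue  # this record already occupies a slot
        try:
            free = slots.index(None)  # first empty slot, if any
        except ValueError:
            break  # no free slot left, remaining records are ignored
        slots[free] = addr
    return slots


# Two haproxies see the same three records, in a different (rotated) order.
first = assign_slots([None] * 4, ["10.0.0.1", "10.0.0.2", "10.0.0.3"])
second = assign_slots([None] * 4, ["10.0.0.3", "10.0.0.1", "10.0.0.2"])
```

Both results contain the same three addresses, but in different slots, so
two instances fed rotated responses would not agree on the mapping.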

You could also influence this behavior in HAProxy itself. The function
resolv_validate_dns_response() (in resolvers.c) is where we turn the
buffer payload into an internal DNS structure. If you can influence the
ordering there, it should help solve this issue. I would also recommend
using server-state so that the ordering and parameters are carried
across reloads.
You also have some other options, such as:
- use a third-party tool, outside of HAProxy, to perform the DNS
resolution, sort the result in a consistent way, and push it into HAProxy
(you can use the Go client library
https://github.com/haproxytech/client-native)
- implement your consistent hash in Lua and apply it to a use-server
directive in your backend (this might impact performance)
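
As a rough illustration of the first option, here is a Python sketch of
the "sort consistently, then push" step. It assumes the addresses have
already been resolved elsewhere, and it borrows the hash-plus-open-
addressing idea from your mail. The names stable_slots and
runtime_commands, the backend "be_app" and the prefix "srv" are all
hypothetical. The emitted lines follow the Runtime API "set server"
syntax as I understand it, so please check them against the management
guide for your version.

```python
import hashlib


def stable_slots(addresses, n_slots):
    """Map addresses to slots independently of the order they arrive in."""
    unique = sorted(set(addresses))  # lexicographic order settles collisions
    if len(unique) > n_slots:
        raise ValueError("more addresses than server slots")
    slots = [None] * n_slots
    for addr in unique:
        digest = hashlib.sha256(addr.encode("ascii")).digest()
        idx = int.from_bytes(digest[:8], "big") % n_slots
        while slots[idx] is not None:  # linear probing on collision
            idx = (idx + 1) % n_slots
        slots[idx] = addr
    return slots


def runtime_commands(backend, prefix, slots):
    """Build one batch of 'set server' lines; slots are numbered from 1."""
    cmds = []
    for i, addr in enumerate(slots, start=1):
        name = f"{backend}/{prefix}{i}"
        if addr is None:
            cmds.append(f"set server {name} state maint")
        else:
            cmds.append(f"set server {name} addr {addr}")
            cmds.append(f"set server {name} state ready")
    return cmds


slots = stable_slots(["10.0.0.3", "10.0.0.1", "10.0.0.2"], 8)
commands = runtime_commands("be_app", "srv", slots)
```

Every instance that sees the same set of addresses computes the same
slots, whatever order the DNS answers came in. One caveat: when two
addresses collide, a membership change can still move the one that was
probed to a later slot, so this narrows the instability rather than
removing it.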

Baptiste
