Yes, that is indeed what question 1) is about.
Basically, I tested with a local custom nameserver that was "out of sync", and I observed some inconsistent results in the backend servers table. That led to this question.

Most of the time I saw only the state from the local nameserver. However, sometimes I saw a "merged" state in which the replies from all nameservers were present in the table together.
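
To make the two outcomes concrete, here is a toy model of "override" versus "merge". It is only a sketch and not haproxy's resolver code: the function name, the server names and the arrival order are assumptions chosen for illustration.

```python
# Toy model only; this is not haproxy's resolver code.

def replay(answers, merge):
    """Apply SRV answers in arrival order and return the final target set."""
    table = set()
    for targets in answers:
        if merge:
            table |= set(targets)   # keep targets seen in any answer
        else:
            table = set(targets)    # latest answer replaces the table
    return table

# Hypothetical out-of-sync answers; the arrival order is arbitrary.
ns1_answer = ["server1", "server2"]
ns2_answer = ["server2", "server3"]
arrivals = [ns1_answer, ns2_answer]

overridden = replay(arrivals, merge=False)
merged = replay(arrivals, merge=True)

print(sorted(overridden))  # ['server2', 'server3']
print(sorted(merged))      # ['server1', 'server2', 'server3']
```

With "override" the final table depends on which answer is processed last; with "merge" it does not depend on the order at all.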

I also observed that haproxy sends the same number of requests to every nameserver, even though the local one normally replies faster.
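
A count like this can be taken from tcpdump text output with a small script. The sketch below assumes one packet per line in the same format as the capture in the original message quoted below; the regular expression and the function name are my own and may need adjusting for other tcpdump versions or options.

```python
import re
from collections import Counter

# Matches SRV query lines such as:
#   "... 10.10.10.50.24050 > 10.10.10.10.53: 34307+ [1au] SRV? name. (60)"
QUERY_RE = re.compile(
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.\d+ > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.53: .*SRV\? (?P<name>\S+)"
)

def count_srv_queries(lines):
    """Count SRV queries per destination nameserver IP."""
    counts = Counter()
    for line in lines:
        match = QUERY_RE.search(line)
        if match:  # responses do not contain "SRV?" and are skipped
            counts[match.group("dst")] += 1
    return counts

sample = [
    "11:06:17.587798 eth2  Out ifindex 4 aa:aa:aa:aa:aa:aa ethertype IPv4 "
    "(0x0800), length 108: 10.10.10.50.24050 > 10.10.10.10.53: "
    "34307+ [1au] SRV? _application._tcp.consul. (60)",
    "11:06:17.587802 eth2  Out ifindex 4 aa:aa:aa:aa:aa:aa ethertype IPv4 "
    "(0x0800), length 108: 10.10.10.50.63155 > 10.10.10.11.53: "
    "34307+ [1au] SRV? _application._tcp.consul. (60)",
    "11:06:17.588097 eth2  In  ifindex 4 ff:ff:ff:ff:ff:ff ethertype IPv4 "
    "(0x0800), length 205: 10.10.10.10.53 > 10.10.10.50.24050: "
    "2194 2/0/1 SRV 0a5099e5.addr.consul.:25340 1 1, SRV "
    "0a509934.addr.consul.:26010 1 1 (157)",
]

counts = count_srv_queries(sample)
print(dict(counts))
```

For real captures, the lines can be read from a file or from standard input instead of the `sample` list.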

And sorry, I forgot to mention that we are running haproxy version 2.8.7.
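
For question 2) in the original message quoted below, the "hashed ids" idea can be sketched as follows. This is a hypothetical model and not how haproxy assigns se_id today: the SHA-256 hash, the linear probing on collisions and all function names are assumptions. The slot count of 10 comes from "server-template application 10".

```python
import hashlib

SLOTS = 10  # from "server-template application 10"

def hashed_slot(record, occupied):
    """Pick a slot derived from the record name (hypothetical scheme).

    `occupied` holds zero-based slot indexes already in use. On a
    collision the next free slot is probed linearly.
    """
    digest = hashlib.sha256(record.encode()).hexdigest()
    start = int(digest, 16) % SLOTS
    for step in range(SLOTS):
        slot = (start + step) % SLOTS
        if slot not in occupied:
            return slot + 1  # se_id values start at 1
    raise RuntimeError("no free slot left")

table = {}  # se_id -> record name

def register(record):
    """Place a discovered record into the table and return its se_id."""
    occupied = {se_id - 1 for se_id in table}
    se_id = hashed_slot(record, occupied)
    table[se_id] = record
    return se_id

first = register("0a5099e5.addr.consul.")
other = register("0a509934.addr.consul.")
del table[first]                            # instance is de-registered
again = register("0a5099e5.addr.consul.")   # instance comes back

print(first == again)  # True
```

In this model a returning record lands in its previous slot as long as that slot is still free, so a MAINT flag set on the slot would keep applying to the same instance. If another record took the slot in the meantime, probing moves the returning record elsewhere, so the binding is only best effort.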

On 08/04/2024 10:31, Илья Шипицин wrote:
and in particular, is your question "does HAProxy merge all responses, pick the first one, or use some other approach"?

Mon, 8 Apr 2024 at 10:23, Andrii Ustymenko <andrii.ustyme...@adyen.com>:

    Indeed, I guess it is not about consul-template specifically,
    but more about rendered templates and how haproxy maintains them.

    On 06/04/2024 20:15, Илья Шипицин wrote:
    Consul template is something done by consul itself, after that
    haproxy.conf is rendered

    Do you mean "how haproxy deals with rendered template"?

    On Fri, Apr 5, 2024, 15:02 Andrii Ustymenko
    <andrii.ustyme...@adyen.com> wrote:

        Dear list!

        My name is Andrii. I work for Adyen. We are using haproxy as
        our main software load balancer at quite a large scale.
        As of now, our main use case is backend routing based on
        server-template and dynamic DNS from consul as service
        discovery. Below is an example of a simple backend configuration:

        ```
        backend example
           balance roundrobin
           server-template application 10 _application._tcp.consul resolvers someresolvers init-addr last,libc,none resolve-opts allow-dup-ip resolve-prefer ipv4 check ssl verify none
        ```

        and in global configuration

        ```
        resolvers someresolvers
           nameserver ns1 10.10.10.10:53
           nameserver ns2 10.10.10.11:53
        ```

        As we can see, haproxy will create an internal table for backends
        with some be_id and be_name=application and allocate 10 records,
        one per server, with se_id from 1 to 10. Those records then get
        populated and updated with the data from the resolvers.
        I would like to understand a couple of things about this
        structure and how it works, which I could not figure out myself
        from the source code:

        1) In a tcpdump of the DNS queries we see that haproxy polls
        all the nameservers simultaneously and asynchronously. For instance:

        ```
        11:06:17.587798 eth2  Out ifindex 4 aa:aa:aa:aa:aa:aa ethertype IPv4 (0x0800), length 108: 10.10.10.50.24050 > 10.10.10.10.53: 34307+ [1au] SRV? _application._tcp.consul. (60)
        11:06:17.587802 eth2  Out ifindex 4 aa:aa:aa:aa:aa:aa ethertype IPv4 (0x0800), length 108: 10.10.10.50.63155 > 10.10.10.11.53: 34307+ [1au] SRV? _application._tcp.consul. (60)
        11:06:17.588097 eth2  In  ifindex 4 ff:ff:ff:ff:ff:ff ethertype IPv4 (0x0800), length 205: 10.10.10.10.53 > 10.10.10.50.24050: 2194 2/0/1 SRV 0a5099e5.addr.consul.:25340 1 1, SRV 0a509934.addr.consul.:26010 1 1 (157)
        11:06:17.588097 eth2  In  ifindex 4 ff:ff:ff:ff:ff:ff ethertype IPv4 (0x0800), length 205: 10.10.10.11.53 > 10.10.10.50.63155: 2194 2/0/1 SRV 0a5099e5.addr.consul.:25340 1 1, SRV 0a509934.addr.consul.:26010 1 1 (157)
        ```

        Both nameservers reply with the same response. But what if
        they are out of sync? Let's say one says "server1, server2" and
        the second one says "server2, server3". So far, testing this
        locally, I see that sometimes the reply overrides the table, but
        sometimes it seems to just get merged with the rest.

        2) Each entry from the SRV reply is placed into the table under
        a specific se_id. Most of the time that placement won't
        change. So, for the example above, 0a5099e5.addr.consul. and
        0a509934.addr.consul. will most likely have se_id 1 and 2
        respectively. However, sometimes we have the following scenario:

        1. We administratively disable the server (drain traffic) with
        the following command:

        ```
        echo "set server example/application1 state maint" | nc -U /var/lib/haproxy/stats
        ```

        The MAINT flag is then added to the record with se_id 1.

        2. The application instance goes down and gets de-registered
        from consul, so it is also evicted from the SRV replies and drops
        out of haproxy's discovery.

        3. The application instance comes back up, gets registered by
        consul and is discovered by haproxy, but haproxy allocates a
        different se_id for it. Haproxy health checks will control the
        traffic to it in this case.

        4. We still have se_id 1 with the MAINT flag, while the
        application instance's DNS record is placed into a different se_id.

        The problem is that any newly discovered record which gets
        placed into se_id 1 will never become active until either the
        command:

        ```
        echo "set server example/application1 state ready" | nc -U /var/lib/haproxy/stats
        ```

        is executed or haproxy gets reloaded without a state file. With
        this pattern we basically end up with persistent "record
        pollution" caused by operations made directly through the
        control socket.

        I am not sure whether there is anything to be done about this.
        Maybe haproxy could cache not only the state of the se_id but
        also the record associated with it, and re-schedule health checks
        if that record changes. Or, instead of integer ids, it could use
        hashed ids based on the DNS names/IP addresses of the discovered
        records; in that case a record would always bind to exactly the
        same slot.

        Thanks in advance!

        --

        Best regards,

        Andrii Ustymenko


--
    Andrii Ustymenko
    Platform Reliability Engineer

    office +31 20 240 12 40

    Adyen Headquarters
    Simon Carmiggeltstraat 6-50, 5th floor
    1011 DJ Amsterdam, The Netherlands




    Adyen <https://www.adyen.com>

