On Thu, Feb 22, 2018 at 2:04 AM, Lukas Tribus <lu...@ltri.eu> wrote:

> Hello Baptiste,
>
>
>
> On 21 February 2018 at 19:59, Lukas Tribus <lu...@ltri.eu> wrote:
> > Baptiste, I don't think you'd find the symptoms I have in mind
> > acceptable on a load-balancer, so there has to be a misunderstanding
> > here. I would like to do some tests, maybe I can come up with a simple
> > testcase that shows the behavior and then we can review the situation
> > based on that testcase; I will probably need a few days for this
> > though.
>
> So this is what I did: I pulled current haproxy master (5e64286bab)
> and applied your patch on top of it. I also added "hold obsolete 30s"
> to the configuration in all those tests.
>
>
> Two things that I noticed:
> - GoogleDNS and recent Bind instances (and probably many others) don't
> return a partially truncated answer: when they set TC they don't add
> any A records to the response at all, so the TC response is not
> incomplete but completely empty (repro: use testcase vs 8.8.8.8, max
> payload 1280)
> - OpenDNS (208.67.222.222) actually truncates the response (just like
> old Bind instances), however haproxy is unable to parse that response,
> so a TC response from OpenDNS is always rejected (repro: use testcase
> vs 208.67.222.222, max payload 1280; both behaviors can also be seen
> directly with dig, see below)
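>
> In case you want to check this without haproxy: dig should show the
> same thing when you limit the advertised EDNS buffer size to 1280 and
> keep the truncated UDP answer instead of retrying over TCP (the record
> name is the one from my testcase further down):
>
>     dig @8.8.8.8 +ignore +bufsize=1280 A 100_pointing_to.localhost.ltri.eu
>     dig @208.67.222.222 +ignore +bufsize=1280 A 100_pointing_to.localhost.ltri.eu
>
> Against 8.8.8.8 I would expect a reply with the tc flag set and an
> empty answer section, against 208.67.222.222 a reply with tc set and
> a partial answer section.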
>
> So surprisingly enough, in both of those cases the "auto-downgrade"
> does not merely reduce the number of servers in the backend, it kills
> the backend completely. This is with your patch applied and with
> "hold obsolete 30s" set, of course.
>
> What I was actually looking for is a testcase that reduces the number
> of servers in the backend, but I guess that would require a DNS server
> that truncates the reply "old-style" and at the same time does not
> cause haproxy to reject the response; I don't know yet what haproxy
> dislikes about the OpenDNS TC response.
>
>
> Back to the original testcase though:
> - testcase config attached (a rough sketch of the relevant parts
> follows below)
> - "100_pointing_to.localhost.ltri.eu" returns 100 A records in the
> localhost range; it requires approx. 1600 bytes of payload
> - we can trigger the "auto-downgrade" very easily by briefly
> interrupting DNS traffic via an iptables rule (iptables -A INPUT -i
> eth0 -s 8.8.8.8 -j DROP && sleep 10 && iptables -D INPUT -i eth0 -s
> 8.8.8.8 -j DROP)
> - after the auto-downgrade has been triggered, haproxy does not
> recover and no backend servers remain alive until we reload
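>
> The relevant parts of the attached config are roughly along these
> lines (the exact names, values and backend layout are simplified
> here, the attachment is authoritative):
>
>     resolvers mydns
>         nameserver google 8.8.8.8:53
>         accepted_payload_size 8192
>         hold obsolete 30s
>
>     backend be_test
>         server-template srv 100 100_pointing_to.localhost.ltri.eu:80 resolvers mydns check
>
> The iptables one-liner simply drops the responses from 8.8.8.8 for 10
> seconds, which is enough to make resolution fail and let the
> auto-downgrade kick in.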
>
>
> The auto-downgrade behaves exactly as anticipated in our previous
> conversation. The exact end result depends on the behavior of the DNS
> server, but none of those cases is desirable:
>
> Case 1 (Testcase Google DNS, recent Bind):
> - when the auto-downgrade fires, the response will be TC without any
> records; haproxy will disable all servers and the entire backend will
> be down (fix: restart haproxy)
>
> Case 2 (Testcase OpenDNS):
> - when the auto-downgrade fires, the response will be TC, which
> haproxy is unable to parse; haproxy will disable all servers and the
> entire backend will be down (fix: restart haproxy)
>
> Case 3 (assumption based on what ASA on discourse reports, likely old
> Bind):
> - when the auto-downgrade fires and the response is TC, the TC flag is
> ignored, which means the reply is still processed, reducing the number
> of servers in the backend to whatever fit into the 1280-byte reply,
> which will most likely overload the remaining backend servers (after
> all, there is probably a reason a certain number of servers is in the
> DNS)
>
>
> "hold obsolete" can only help if haproxy is able to recover; but the
> auto-downgrade makes sure no future DNS requests works as expected so
> whatever value "hold obsolete" is set to, once "hold obsolete" is
> over, the problem will show up.
>
>
> Let's talk about the likelihood of an admin configuring a payload size
> above 1280: I think it's safe to assume that this is configured based
> on actual needs, so an admin would hit one of the 3 cases above,
> unless I'm missing something. I completely fail to see the benefit of
> this feature in haproxy.
>
>
> So based on these tests and cases, I would ask you again to consider
> removing this feature altogether.
>
>
>
> cheers,
> lukas
>


Hi Lukas,

(I was off last week with limited mail access)

First, thanks a lot for all your testing!

Your use case is valid, I understand it perfectly and it makes sense to
me.
That said, in my use case I was using (and had in mind) SRV records,
with consul / kubernetes providing the backend servers.
What I saw is that when the response is too big, the server sends only
the SRV records with the TC flag set and no "ADDITIONAL" section. So in
such a case the response was still acceptable for HAProxy.
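
For reference, the kind of setup I have in mind looks roughly like the
sketch below (the names, addresses and sizes are only illustrative, not
my actual configuration):

    resolvers consul
        nameserver consul1 127.0.0.1:8600
        accepted_payload_size 8192
        hold obsolete 30s

    backend be_app
        server-template app 10 _app._tcp.service.consul resolvers consul check

With that kind of response, a TC answer that still carries the SRV
records (just without the ADDITIONAL section) still lets HAProxy keep
its server list, which is why I did not see the breakage you describe.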

In your opinion, what would be the best option:
- entirely remove this "feature" and assume that the admins know what
they are doing (including the copy/paste admins who simply copy content
from different blog articles until "it works")
- keep the feature, but only for a single "retry" of the same query
(which would happen after we did a retry with a different query type)?

If the 2nd, we could make it optional, i.e. the admin would have to
allow the downgrade explicitly (so no downgrade by default).
Also in that case, we could add a new counter to the resolvers' stats
to count the number of queries sent with the downgraded value, so
admins can use this information when troubleshooting.
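
To make the 2nd option more concrete, it could look something like the
sketch below; the directive and counter names are purely hypothetical,
just to illustrate the idea:

    resolvers mydns
        nameserver ns1 192.168.0.1:53
        accepted_payload_size 8192
        # hypothetical opt-in: without this line, never downgrade the
        # advertised payload size to 1280 after failures
        allow-payload-downgrade

and "show resolvers" on the CLI would then report one additional
counter per nameserver (for example "downgraded"), counting the queries
sent with the downgraded value.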

Baptiste
