Hello Baptiste,

I'm sorry if my comments are blunt, but I think this discussion is
important and I do not want my messages to be ambiguous. I do
appreciate all the work you are doing in the DNS subsystem.

On 21 February 2018 at 18:05, Baptiste <bed...@gmail.com> wrote:
>> However in Haproxy the administrator *explicitly* configures a higher
>> payload size, because this higher payload size is probably actually
>> *required* to get all backend servers. Silently downgrading the
>> payload size is harmful in my opinion here.
> Maybe there is a misunderstood here.
> HAProxy already downgrade "silently" the accepted payload size. I mean, this
> "feature" already exists.
> This patch simply ensure the downgrade happens in timeout cases only.

I am aware that this patch is merely addressing a corner case of this
already existing feature, I'm not saying the patch doesn't fix that
corner case. I am saying that I have a strong opinion about this
feature in the first place and I responded to this thread merely
because I was not aware of this feature previously.

>> Ok, but you can see how ignoring the TC flag, combined with automatic
>> payload size downgrade will make the backend servers number fluctuate
>> with a little bit of packet loss? So with 40 servers in DNS and the
>> loss of a single packet we will downgrade the entire backend to
>> whatever fitted in the 1280 byte downgraded response.
>> I would call this behavior highly undesirable.
> You can play with "hold obsolete <timer>"  to ensure that unseen records are
> kept for <timer> time.

I don't see how this is supposed to address the problem. The payload
size is downgraded permanently (as I've found out below), not just for
the retried request, so we will use 1280 forever, which will not
contain the complete server set (after all, that's why the admin
raised the payload size in the first place). We will therefore be
missing - possibly a lot of - backend servers until we reload haproxy,
and that only helps until the next DNS packet is lost, at which point
haproxy degrades again.

>> Failing over to TCP on TC responses is the proper solution to this
>> all, but in the meantime I would argue that we should make the
>> behavior as predictable and stable as we can.
>> Let me know what you think,
> I think that until we can do DNS over TCP, it is safer to downgrade the
> announced accepted payload size in case of a timeout and ensure we'll still
> receive answer than never ever downgrade and stop receiving answers.

I disagree (strongly). We can't do anything about TC and DNS over TCP
in haproxy 1.8. But it is my opinion that auto-downgrading the
accepted payload size, while ignoring TC, is problematic in *a lot* of
situations, with a very questionable benefit.

When is the auto downgrade supposed to help? When we have fluctuating
PathMTU issues in a datacenter, I think we should fail hard and fast,
not hide the problem. Other than hiding PathMTU problems, it will
kneecap the load balancer whenever a single IP fragment of a DNS
response is lost, by reducing the number of available servers.
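To make the discussion concrete: the accepted payload size we are
arguing about is the value a requestor advertises in the CLASS field
of the EDNS0 OPT pseudo-record (RFC 6891). Here is a minimal sketch
(function name and defaults are mine, not haproxy code) of a DNS query
carrying that field - this is the number the auto downgrade silently
shrinks back toward 1280:

```python
import struct

def build_query_with_edns0(qname: str, payload_size: int = 8192) -> bytes:
    """Build a minimal DNS A query with an EDNS0 OPT record.

    Per RFC 6891, the OPT record's CLASS field carries the requestor's
    maximum UDP payload size.
    """
    header = struct.pack("!HHHHHH",
                         0x1234,  # ID
                         0x0100,  # flags: RD set
                         1,       # QDCOUNT
                         0,       # ANCOUNT
                         0,       # NSCOUNT
                         1)       # ARCOUNT: the OPT pseudo-record
    question = b"".join(
        bytes([len(label)]) + label.encode() for label in qname.split(".")
    ) + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    # OPT: root name, TYPE=41, CLASS=payload size, TTL=ext. flags, RDLEN=0
    opt = b"\x00" + struct.pack("!HHIH", 41, payload_size, 0, 0)
    return header + question + opt
```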

Sure, if we were writing lookup code for a recursive DNS server we
should implement this fallback in every case. But we would also have
proper TC handling in that case, so we would not work with truncated
responses, and we would certainly not permanently downgrade our
ability to get large responses.
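The TC handling I mean is trivial to detect: the truncation flag is a
single bit in the DNS header, and a resolver that honors it retries the
same query over TCP instead of working with a partial answer set. A
minimal sketch (function name is mine):

```python
import struct

TC_BIT = 0x0200  # TC is bit 9 of the flags word in the DNS header

def is_truncated(response: bytes) -> bool:
    """Return True when a DNS response has the TC (truncation) flag set.

    A resolver that honors TC retries the query over TCP rather than
    using the truncated answer set.
    """
    if len(response) < 12:
        raise ValueError("short DNS message")
    flags = struct.unpack("!H", response[2:4])[0]
    return bool(flags & TC_BIT)
```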

> As I explained above, this "feature" (downgrade) already exists in HAProxy,
> but is applied to too many cases.
> This patch simply ensure that it is applied to timeouts only.
> So I don't see what's wrong with this patch.

I don't disagree with the patch, I disagree with the feature; I am
fully aware that it already exist, I was not aware of the feature
before you send the patch.

You mentioned an RFC suggestion, and I believe RFC6891 #6.2.5 may be
what you are talking about:

It does indeed suggest falling back to a lower payload size, however:

- it is a MAY: "A requestor MAY choose to implement a fallback"
- it is kind of implied (when reading the entire paragraph) that this
is only relevant when the *default* payload size is above 1280 (not
relevant to haproxy: we don't default above 1280, we don't even enable
edns0 by default); imo it is irrelevant when the admin explicitly
configures the payload size
- it's obvious that the TCP fallback on TC responses has to work
- imo it also implies a fallback for the next retry-request, not a
fallback for the entire runtime of the application
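To illustrate my reading of the last point, here is a sketch of a
per-request fallback (all names are hypothetical; `send_query` stands
in for whatever actually transmits the query): the smaller size is
used for the retry only, and the configured size is untouched for
every later resolution.

```python
def resolve_with_retry(send_query, configured_size: int,
                       fallback_size: int = 1280) -> bytes:
    """Per-request payload-size fallback, as I read RFC 6891 #6.2.5.

    `send_query(payload_size)` is assumed to return a response or raise
    TimeoutError. On timeout we retry once with the smaller size, but
    nothing is downgraded for the remaining runtime of the process.
    """
    try:
        return send_query(configured_size)
    except TimeoutError:
        # fall back for this retry only; configured_size stays as-is
        return send_query(fallback_size)
```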

> - create a new command on the CLI to set the accepted payload to a value
> decided by the admin (so he can perform an upgrade in case a downgrade
> happened)

I don't understand; are you saying the feature currently downgrades
the payload size for the resolver *permanently*, not just for the next
retry? Therefore, when haproxy downgrades *currently*, we have to
actually reload haproxy to restore the accepted payload size? Well,
that at least fully explains what the user reported here:
> Now, I agree the "silent" application of the downgrade could be problematic
> and we could imagine some solutions to fix this:

Number 1 on this list should be to document this behavior in the doc
for the accepted_payload_size keyword (and while we're there, we
should also mention that we do not fall back to TCP on TC responses).
Logging is fine. I completely disagree on the CLI keywords: rather
than making the admin jump through hoops to work around the harm this
feature causes externally (via scripting on the admin socket), we
should make the feature itself configurable.
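For reference, this is the kind of resolvers section I have in mind
when I say the admin *explicitly* raised the payload size (addresses
and timer values are made up for illustration):

```
resolvers mydns
    nameserver ns1 192.0.2.53:53
    # explicitly raised because the full server set does not fit in 1280
    accepted_payload_size 8192
    # keep records missing from a (possibly truncated) answer for 30s
    hold obsolete 30s
```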

I still think we should remove it entirely though.

Baptiste, I don't think you'd find the symptoms I have in mind
acceptable on a load balancer, so there has to be a misunderstanding
here. I would like to do some tests; maybe I can come up with a simple
testcase that shows the behavior, and then we can review the situation
based on that testcase. I will probably need a few days for this.

Thanks and again, I really appreciate all the work on the DNS resolver
you are doing.
