Dear Folks,
I would never have expected that a combination of these two apparently
innocuous configuration values could cause a massive outage.
This appears to be a very serious bug in Unbound. Does anyone think
this behaviour (described below) is in any way expected?
On 30/10/18 16:50 +1100, Nick Urbanik via Unbound-users wrote:
On 29/10/18 10:14 -0400, Marc Branchaud via Unbound-users wrote:
On 2018-10-28 3:20 p.m., Nick Urbanik via Unbound-users wrote:
On 25/10/18 18:10 +1100, Nick Urbanik via Unbound-users wrote:
I am puzzled by the behaviour of our multi-level DNS system which
answered many queries for names having shorter TTLs with SERVFAIL.
I mean that SERVFAILs went up to 50% of replies, and current names
with TTLs of around 300 failed to be fetched by the resolver, the last
DNS servers in the chain.  What I mean is that adding these two
configuration options (serve-expired: "yes" and cache-min-ttl: 30)
caused an outage.  I am trying to understand why.
Any ideas in understanding the mechanism would be very welcome.
We use 1.6.8 with both those settings, and observed prolonged SERVFAIL
periods.
In our case, the upstream server became inaccessible for a period of
time, but when contact resumed the SERVFAILs persisted.
This behaviour was quite catastrophic, and to me, unexpected.
Do you have any idea of the mechanism behind this failure?
Is there a way to deal better with zero TTL names?
We reduced the infra-host-ttl value to compensate.
Did that bring your system to a functioning condition?
(Why is infra-host-ttl's default 900 seconds? That seems like a long
time to wait to retry the upstream server.)
M.
By multilevel, I mean clients talk to one server, which forwards to
another, and for some clients, there is a third level of caching.
So it was unwise to add:
serve-expired: "yes"
cache-min-ttl: 30
to the server section of these DNS servers running unbound 1.6.8 on
up to date RHEL 7?  Please could anyone cast some light on why this
was so?  I will be spending some time examining the cause.
If you need more information, please let me know.
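For reference, the server section in question looks roughly like the
sketch below.  Only the two active option lines come from the
configuration described above; the commented infra-host-ttl line is
illustrative, since 900 seconds is the default mentioned in this thread
and the reduced value that was tried was not stated.

```
server:
    # Serve expired cache entries instead of waiting for a fresh lookup.
    serve-expired: "yes"
    # Cache RRsets and messages for at least 30 seconds,
    # even when their TTL is lower (including zero).
    cache-min-ttl: 30
    # Lifetime of host cache entries (round-trip timing, lameness,
    # EDNS support).  Default is 900 seconds; the lowered value used
    # in this thread is not known, so the line is left commented out.
    # infra-host-ttl: 900
```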
--
Nick Urbanik http://nicku.org [email protected]
GPG: 7FFA CDC7 5A77 0558 DC7A 790A 16DF EC5B BB9D 2C24 ID: BB9D2C24