(with apologies for breakfast/iPad MIME crime that surely follows)
> On Feb 8, 2018, at 01:02, Paul Wouters <p...@nohats.ca> wrote:
>> On Wed, 7 Feb 2018, Robert Story wrote:
>>> On Wed 2018-02-07 10:43:16-0500 Paul wrote:
>>> How about using this query to also encode an
>>> uptime-processstartedtime value? Maybe with accuracy reduced to
>>> minutes. I think that would return valuable data.
>> -1 for feature creep and the technical reasons Joe mentioned.
> We have a giant hole in our understanding of why there are updated
> nameservers running the latest software with the older keys. We
> need to gain understanding and we know we need more data.
I don't disagree with the need for more data, but I think the hole you mention
is not so giant. As far as I can tell it's a result of:
1. RFC 5011 support not being turned on in nameservers that have been upgraded
but whose older, DNSSEC-validating configuration has been preserved across
updates (most cases), and
2. RFC 5011 support exercising a code path that requires a writable, persistent
filesystem to store an updated trust anchor, which turns out not to be
available (fewer, but some cases).
These are both BIND9 problems, which I mean as a compliment since they are
indicative of (a) widespread use, (b) early implementation of DNSSEC and (c) a
high degree of backwards compatibility in configuration.
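
To make the two cases above concrete, here is a rough Python sketch of how one
might tell them apart on a given host. It is an illustration only, not a
description of how BIND9 itself behaves internally. It assumes the 2018-era
configuration syntax, where a `trusted-keys` statement holds a static anchor
and `managed-keys` (or `dnssec-validation auto`) enables RFC 5011 tracking.
The function names, the returned labels and the naive comment stripping are
all hypothetical, and `key_dir` stands in for whatever directory the server
would write its managed keys to.

```python
import os
import re


def strip_comments(conf_text):
    """Naively drop /* */, // and # comments from named.conf-style text.

    Assumption: good enough for spotting statement names; a real parser
    would also have to respect quoted strings.
    """
    conf_text = re.sub(r"/\*.*?\*/", "", conf_text, flags=re.S)
    return re.sub(r"(//|#).*", "", conf_text)


def diagnose(conf_text, key_dir):
    """Classify a configuration into one of four hypothetical labels."""
    text = strip_comments(conf_text)

    # Static anchor carried over from an older configuration.
    has_static = re.search(r"\btrusted-keys\s*\{", text) is not None

    # RFC 5011 tracking, either explicit or via the built-in anchor.
    has_managed = (
        re.search(r"\bmanaged-keys\s*\{", text) is not None
        or re.search(r"\bdnssec-validation\s+auto\s*;", text) is not None
    )

    if has_static and not has_managed:
        return "static-anchor"            # case 1 above
    if has_managed and not os.access(key_dir, os.W_OK):
        return "managed-but-unwritable"   # case 2 above
    if has_managed:
        return "managed-ok"
    return "no-anchor-found"
```

Calling `diagnose(open(path).read(), key_dir)` with a real configuration file
and key directory would return one of the four labels. Note that `os.access`
only reports whether the current user could write to the directory right now;
it says nothing about whether the filesystem persists across restarts, which
is the other half of case 2.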
The larger question of whether RFC 5011 is a practical or sufficient mechanism
given this experience is a reasonable one. You may recall I have been a serial
advocate for adding standardised bootstrap mechanisms that include fetching a
trust anchor out-of-band, for example, which I still think would be a practical
remedy even if a slightly inelegant one; unbound-anchor and its use in package
and system start scripts is, I think, a key reason why the two problems
described above don't show up in unbound.
My sense from the recent KSK rollover/RFC 8145 data collection experience is
that the actual impact on end-users from validators dependent on the outgoing
KSK is very small. This is hard to quantify with precision, however, because we
are not able to measure the state of most resolvers (e.g. those not reporting
via RFC 8145 or not validating), nor assess their operational impact (e.g. size
of end-user population and impact of validation failures upon them) with any
degree of accuracy.
I think that the sentinel approach of measuring end-user impact from the
end-user perspective gets us much closer to useful data in general. However,
it's not clear to me how even a trusted, accurate sense of uptime across all
resolvers would help with those questions.
DNSOP mailing list