TL;DR: I've pushed a new (final before IETF LC?) copy of the document.
Viktor,
Thanks for the excellent and thorough review. I'm very glad you were
willing to take a look at it, since it's a better document because of
it. I failed to add you to the acknowledgment list for -12, but the
change is queued for the future -13.
All changes accepted with a few exceptions/notes as below. '+' signs
indicate the start of my response; all other text was functionally yours
or the document's.
6.1.6. timingSafetyMargin
Mentally, it is easy to assume that the period of time required for
s/Mentally/Naively/
+ NOTE: not changing this one; I wanted the emphasis on thinking,
not assuming.
6.1.6.1. activeRefreshOffset
Security analysis of the timing associated with the query rate of
s/Security analysis/An analysis/
+ NOTE: changed to "A security analysis"
numResolvers: The number of client RFC5011 Resolvers
With the successRate and numResolvers values selected and the
definition of retryTime from RFC5011, one method for determining how
many retryTime intervals to wait in order to reduce the set of
uncompleted servers to 0 assuming normal probability is thus:
Here the text needs to be considerably more clear. The first
observation is that "uncompleted servers" is not defined. It
seems this is the expected number of resolvers that failed to
acquire the new trust anchor. If so, this should be stated
clearly.
It is also rather unclear what is normally distributed,
and why such a distribution is reasonable to assume.
+ good point; changed to: ...to wait in order to reduce the set of
resolvers that have not accepted the new trust anchor to 0
is thus:
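As a rough illustration of the kind of calculation under discussion (a sketch only, not the draft's text): if each resolver independently succeeds with probability successRate in any one retryTime interval, the expected number of resolvers still lacking the new trust anchor after k intervals is numResolvers * (1 - successRate)^k. The hypothetical helper below finds the smallest k that pushes that expectation below one resolver. The function name, the "below 1" threshold, and the independence assumption are all assumptions made for this example.

```python
def retry_intervals_needed(success_rate: float, num_resolvers: int) -> int:
    """Smallest k such that num_resolvers * (1 - success_rate)**k < 1.

    Assumes (for illustration only) that every resolver fails
    independently with probability 1 - success_rate per interval.
    """
    if not 0.0 < success_rate < 1.0:
        raise ValueError("success_rate must be strictly between 0 and 1")
    if num_resolvers < 1:
        raise ValueError("num_resolvers must be at least 1")

    failure_rate = 1.0 - success_rate
    remaining = float(num_resolvers)  # expected resolvers not yet updated
    intervals = 0
    while remaining >= 1.0:
        remaining *= failure_rate
        intervals += 1
    return intervals


if __name__ == "__main__":
    # Example: half of the queries succeed, 1000 resolvers.
    print(retry_intervals_needed(0.5, 1000))
```

The result is a count of retryTime intervals; multiplying it by retryTime gives a wait duration under these assumptions. It says nothing about correlated outages, which is the concern raised in the review comment that follows.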
* REJECTED:
It seems to me that tuning to achieve zero failure cases for
each size of the resolver population is (to put it kindly) not
necessarily sound.
Instead, one might want to achieve an acceptably low probability
of any chosen resolver failing due to random packet loss, and to
handle non-random "short-term" outages (which may last days if
say hypothetically that the US East-coast grid goes down for 3
days, or much more likely some home computer is shut down for
a few weeks, while the administrator is on vacation).
So here, the zone administrator needs to stretch the retry interval
by a fudge factor and/or to a minimum time they're comfortable with,
but I don't see much relevance of "numResolvers" or plausibility of
any sort of "normal distribution" model. When an ISP has an outage,
or power is lost, or there's a DDoS attack, lost queries are highly
correlated.
Therefore, I would discard the table, and just recommend a sensible
fudge factor that is likely to work well enough in practice. Say
stretch the retry time by a factor of 5 to account for transient
connectivity loss, software restarts, ... and also set a minimum
time for less random "short-term" outages one is willing to tolerate.
No finite time can protect a resolver or set of resolvers subject to
a sustained long-term DoS, and these will need to be manually rekeyed
once reliable connectivity is restored.
+ Response: There is no perfect solution. In fact, I didn't plan on
including any network loss factors in the document at all, since I
originally intended it to be just an analysis of the math behind the
model and not real-world scenarios. However, consensus was that we
should include at least some real-world outage estimates. Mike
StJohns came up with this model and text. He put a fair amount of
analysis into it, and included a Java-based simulation in one mail
message that showed the model works as expected. Thus, though I
think an alternative and potentially simpler model would also work,
we'll stick with his model since the text is already written and
complete and the model has been shown to work in simulations.
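For anyone who wants to experiment, a toy Monte Carlo check of this style of model might look like the following. This is not Mike's Java simulation; it is a minimal sketch with hypothetical names, and it assumes each resolver's attempt in each retryTime interval succeeds independently with probability successRate. That independence is exactly the assumption questioned in the review, so the output only shows what the model predicts, not what a correlated real-world outage would do.

```python
import random


def simulate_intervals(success_rate, num_resolvers, seed=None):
    """Count retry intervals until every resolver has succeeded once.

    Each pending resolver succeeds independently with probability
    success_rate per interval (an assumption made for this sketch).
    """
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    rng = random.Random(seed)
    pending = num_resolvers
    intervals = 0
    while pending > 0:
        intervals += 1
        # A resolver stays pending when its random draw is not a success.
        pending = sum(1 for _ in range(pending)
                      if rng.random() >= success_rate)
    return intervals


if __name__ == "__main__":
    # Repeat the experiment to see the spread of outcomes.
    runs = [simulate_intervals(0.5, 1000, seed=s) for s in range(20)]
    print("min:", min(runs), "max:", max(runs))
```

Running several seeds gives a feel for how far individual runs stray from the expected-value calculation.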
In the end, as stated earlier in the document, I really think a
much more extensive guide is needed for properly using 5011 in all
kinds of situations. Maybe someone will author such a guide in the
future and a different model for calculating potential losses can
be completed and that document can UPDATE this one.
--
Wes Hardaker
USC/ISI
_______________________________________________
DNSOP mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/dnsop