On 12/12/2017 4:03 PM, Wes Hardaker wrote:
Michael StJohns <[email protected]> writes:
A "perfect" system will behave the way you've described - but adding a
safety factor while ignoring the phase shift brought on by retransmits
within the addHoldDown interval will not characterize the actual
system.
Ah ha! So, you do actually agree that my description of the perfect
case is true, which means we really have been in violent agreement about
the "perfect" line in the sand. [as I've said: I absolutely understand
the point you're making about real-world issues, such as timing drifts]
This is what you got out of all of that? What I said was that doing
any analysis starting from the "perfect" model would not lead you to the
right place. Your description of the perfect case is correct in its
behavior and totally irrelevant to figuring out the final answer.
And as I said in the last message, I agreed that a delay/phase-shift
made sense and I'd be happy to produce a new term for it. Ideally I'd
like to add that into the safetyMargin because it reflects real-world
conditions,
No - it doesn't. Please look at the diagrams I provided and think about
them for a day before responding. I think you're still confusing
system behavior with client behavior.
and I was trying to contain that to just one particular
term. But I understand you don't want it buried in there, so I'll
create a new term to deal with timing phase shifts and add that before
the existing safetyMargin.
No - please don't. Please use the math I gave you. It is correct.
Here's the whole diagram including retransmits in all the possible
places and assuming a best case start compared against a "perfect" model
with the same best case start:
The only effect of the retransmissions within the addHoldDown time is to
shift when the final query happens. In a pool of 10K clients, you'll
get a percentage of those clients taking an entire activeRefresh
interval after THEIR addHoldDown expires.
In no case does any calculation that involves an "activeRefreshOffset"
provide a worst case analysis.
In no case does a "phaseShift" term provide any value over assuming that
the addHoldDown expires just after the last query in the
addHoldDownInterval.
So again:
sigExpiration + activeRefresh + addHoldDown + activeRefresh +
retransmission slop (aka safetyFactor).
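As a rough illustration of that sum, here is a minimal Python sketch. The function name and the sample durations are assumptions chosen only for the arithmetic; they are not values taken from this thread or recommended for deployment.

```python
from datetime import timedelta


def min_wait_time(sig_expiration, active_refresh, add_hold_down, safety_factor):
    """Sum the terms listed above (hypothetical helper, not from any library).

    activeRefresh appears twice: once before the addHoldDown timer starts
    and once after it expires.
    """
    return (sig_expiration
            + active_refresh
            + add_hold_down
            + active_refresh
            + safety_factor)


# Illustrative values only (assumptions, not recommendations).
wait = min_wait_time(
    sig_expiration=timedelta(days=10),
    active_refresh=timedelta(days=5),
    add_hold_down=timedelta(days=30),
    safety_factor=timedelta(days=5),
)
print(wait.days)  # 10 + 5 + 30 + 5 + 5 = 55
```

Keeping the two activeRefresh terms separate in the code mirrors the formula, so it stays obvious that neither one is standing in for the safetyFactor.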
Can we stop now?
I think so as I believe I can add words that we'll both agree to. But
I've said that before, so...
The only remaining questions, just to be doubly sure:
1) "is one activeRefresh period long enough to account for the slop
associated with time clock drifts?"
I'd argue yes, as any clock that drifted longer than an activeRefresh
(min = 1hr) is a seriously broken clock. I think you agree.
I don't know why you think this matters. Assume that drift +
retransmissions is enough to get the phase shifted to the worst case
(e.g. just before the end of the addHoldDown period for at least one
client) and we're done with this analysis.
2) "is one activeRefresh period long enough to account for network
delays and other elements, aside from 'retries and missing queries'?"
I think you and I agree on this too, that one should be sufficient to
cover network delays too.
Again, I don't know why you think this matters. If this is the safety
factor, then no - you're asking the wrong question. If this is just
about accounting for the actual time to accomplish a query then you only
have to account for the round trip delays for the query before the
addHoldDown and the round trip delays for the query after. Any delays
during the addHoldDown time only cause the phase to shift. Those delays
are more than covered by a single fastQuery interval and can be ignored
with any reasonable safetyFactor.
For the safetyFactor, you want to consider how large the pool of queriers
is and how often the queries fail, in order to figure out how big to make
the safety factor. This ends up looking like a cumulative distribution function
and you're looking to pick a cutoff far enough along the curve that you
get most (tm) of the clients having completed the installation before
moving on.
https://en.wikipedia.org/wiki/Binomial_distribution#/media/File:Binomial_distribution_cdf.svg
for example.
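To make the cutoff idea concrete, here is a small Python sketch under a deliberately simplified model: every query fails independently with the same probability, so the fraction of clients that have succeeded after k attempts is 1 - p^k. That is a geometric-style curve rather than the binomial plot linked above, and the function names, failure rate, and target fraction are all assumptions for illustration.

```python
def attempts_needed(failure_rate, target_fraction):
    """Smallest number of attempts k such that 1 - failure_rate**k
    reaches target_fraction (assumes independent, identical failures)."""
    if not 0.0 <= failure_rate < 1.0:
        raise ValueError("failure_rate must be in [0, 1)")
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be in (0, 1)")
    k = 1
    still_failing = failure_rate  # probability a client has failed every attempt so far
    while 1.0 - still_failing < target_fraction:
        k += 1
        still_failing *= failure_rate
    return k


def expected_stragglers(pool_size, failure_rate, attempts):
    """Expected number of clients that have not yet succeeded."""
    return pool_size * failure_rate ** attempts


# Illustrative numbers only: 10% of queries fail, and we want 99.5% of clients done.
k = attempts_needed(failure_rate=0.1, target_fraction=0.995)
print(k)
print(expected_stragglers(10_000, 0.1, k))
```

Under these assumptions the safety factor would be roughly k retry intervals; where to put the cutoff on the curve is a policy choice, not something the model decides for you.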
Later, Mike
_______________________________________________
DNSOP mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/dnsop