On 12/12/2017 4:03 PM, Wes Hardaker wrote:
Michael StJohns <[email protected]> writes:

A "perfect" system will behave the way you've described - but adding a
safety factor while ignoring the phase shift brought on by retransmits
within the addHoldDown interval will not characterize the actual
system.
Ah ha!  So, you do actually agree that my description of the perfect
case is true, which means we really have been in violent agreement about
the "perfect" line in the sand.  [as I've said: I absolutely understand
the point you're making about real-world issues, such as timing drifts]

This is what you got out of all of that?   What I said was that doing any analysis starting from the "perfect" model would not lead you to the right place.  Your description of the perfect case is correct in its behavior and totally irrelevant to figuring out the final answer.

And as I said in the last message, I agreed that a delay/phase-shift
made sense and I'd be happy to produce a new term for it.  Ideally I'd
like to add that into the safetyMargin because it reflects real-world
conditions,
No - it doesn't.  Please look at the diagrams I provided and think about them for a day before responding.    I think you're still confusing system behavior with client behavior.


and I was trying to contain that to just one particular
term.  But I understand you don't want it buried in there, so I'll
create a new term to deal with timing phase shifts and add that before
the existing safetyMargin.
No - please don't. Please use the math I gave you.  It is correct.

Here's the whole diagram including retransmits in all the possible places and assuming a best case start compared against a "perfect" model with the same best case start:

[diagram not preserved in this copy]

The only effect of the retransmissions within the addHoldDown time is to shift when the final query happens.  In a pool of 10K clients, some percentage of those clients will not make that final query until an entire activeRefresh interval after THEIR addHoldDown expires.

In no case does any calculation that involves an "activeRefreshOffset" provide a worst case analysis. In no case does a "phaseShift" term provide any value over assuming that the addHoldDown expires just after the last query in the addHoldDownInterval.

So again:

sigExpiration + activeRefresh + addHoldDown + activeRefresh + retransmission slop (aka safetyFactor).
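
As a rough illustration of the sum above, here is a minimal Python sketch. The function name and the example durations are assumptions chosen for illustration only; they are not values proposed anywhere in this thread.

```python
from datetime import timedelta


def worst_case_wait(sig_expiration, active_refresh, add_hold_down, safety_factor):
    """Add up the terms of the worst-case expression, in the order given above.

    activeRefresh appears twice: once before the addHoldDown period and once
    after it, for a client whose last query fell just before the addHoldDown
    period expired.
    """
    return (
        sig_expiration
        + active_refresh
        + add_hold_down
        + active_refresh
        + safety_factor
    )


# Hypothetical inputs, for illustration only.
example = worst_case_wait(
    sig_expiration=timedelta(days=10),
    active_refresh=timedelta(hours=12),
    add_hold_down=timedelta(days=30),
    safety_factor=timedelta(days=1),
)
print(example)
```

With these made-up inputs the total comes to 42 days; the point is only that both activeRefresh terms and the safetyFactor are counted on top of sigExpiration and addHoldDown.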

Can we stop now?
I think so as I believe I can add words that we'll both agree to.  But
I've said that before, so...


The only remaining questions, just to be doubly sure:

1) "is one activeRefresh period long enough to account for the slop
    associated with time clock drifts?"

    I'd argue yes, as any clock that drifted longer than an activeRefresh
    (min = 1hr) is a seriously broken clock.  I think you agree.
I don't know why you think this matters.  Assume that drift plus retransmissions is enough to shift the phase to the worst case (i.e. the last query lands just before the end of the addHoldDown period for at least one client) and we're done with this analysis.


2) "is one activeRefresh period long enough to account for network
    delays and other elements, aside from 'retries and missing queries'?"

    I think you and I agree on this too, that one should be sufficient to
    cover network delays too.

Again, I don't know why you think this matters.   If this is the safety factor, then no - you're asking the wrong question.  If this is just about accounting for the actual time to accomplish a query then you only have to account for the round trip delays for the query before the addHoldDown and the round trip delays for the query after.  Any delays during the addHoldDown time only cause the phase to shift.  Those delays are more than covered by a single fastQuery interval and can be ignored with any reasonable safetyFactor.

To figure out how big to make the safetyFactor, you want to consider how large the pool of queriers is and how often the queries fail.  This ends up looking like a cumulative distribution function, and you're looking to pick a cutoff far enough along the curve that most (tm) of the clients have completed the installation before you move on.  See https://en.wikipedia.org/wiki/Binomial_distribution#/media/File:Binomial_distribution_cdf.svg for example.
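
One way to make the cutoff idea concrete is the small Python sketch below. It assumes every query attempt fails independently with the same fixed probability, which is a geometric simplification rather than the binomial curve linked above. The function name, the 50% failure rate, and the 99% target are hypothetical; the 10K pool size simply reuses the figure mentioned earlier in this message.

```python
def attempts_needed(failure_rate, target_fraction):
    """Smallest number of attempts k such that the share of clients with at
    least one successful query, 1 - failure_rate**k, reaches target_fraction.

    Assumes attempts are independent with a constant failure rate.
    """
    if not (0 < failure_rate < 1) or not (0 < target_fraction < 1):
        raise ValueError("both arguments must be strictly between 0 and 1")
    k = 1
    still_failing = failure_rate  # probability a client has failed all k attempts
    while 1 - still_failing < target_fraction:
        k += 1
        still_failing *= failure_rate
    return k


# Hypothetical numbers, for illustration only.
pool_size = 10_000
failure_rate = 0.5
attempts = attempts_needed(failure_rate, target_fraction=0.99)
expected_stragglers = pool_size * failure_rate ** attempts
print(attempts, expected_stragglers)
```

Under these assumptions the safetyFactor would need to cover roughly that many query attempts, and the expected number of stragglers shows how many clients in the pool would still be unfinished at the chosen cutoff.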

Later, Mike



_______________________________________________
DNSOP mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/dnsop
