I wanted to add to the conversation.

>Harlan wrote:
>> From: Harlan Stenn <[email protected]>
>> Sent: Wednesday, December 7, 2011 6:56 PM
>> 
>> OK, and exactly why do you need "the time offset in the ballpark before ntpd 
>> starts"?
>   <snip>
>> I believe we're gonna have both "ntp-wait" and "ntpd --wait-sync" behave as 
>> before, but we will also have a way to say
>> "sync means XXX" as well, because some folks are fine with sync meaning "We 
>> do not expect the clock to step backwards,
>> but we're still homing in on the steady-state" and others want this 
>> steady-state.  There may be more choices...
>
>
>Bruce wrote:
>> From: Bruce Lilly <[email protected]>
>> To: [email protected]
>> Sent: Wednesday, December 7, 2011 6:03 PM
>> Also, I have not seen machines "fully sync'd" in anywhere near as quickly as 
>> 11 seconds, but my interpretation of "fully sync'd" may
>> differ from yours; I consider a machine "fully sync'd" only when time 
>> offset, frequency offset, and jitter (as reported in
>> loopstats) have all stabilized, which may take a few hours.


I have another perspective, presented generically, that I think relates to 
those points.  I'm not trying to answer Harlan's specific question (nor do I 
think I know how to), but it sparked my train of thought about relevant 
issues.


There are environments where uptime is critical.  A premium is placed on 
recovering down systems and restoring them to production in the least time 
possible.  Various applications, databases, and audit/security logging 
infrastructures (especially for post-analysis) require accurate time 
throughout the enterprise.  Emphasis is also placed on standing up new systems 
(with improved baselines, apps, etc.) as soon as possible.


Setting aside for the moment all the other system building and/or 
troubleshooting activity that occurs: when a boot/reboot is called for, one of 
the essential elements is establishing accurate time and ensuring it remains 
accurate through to the end of the life of the current boot.  The resulting 
environment, whatever it is, needs to become functional (and thus productive) 
as soon as possible.
-- 'accurate' around here is generally agreed to mean 'correct to within 
10-15 ms', and certainly off by no more than 1 second
-- 'life of the current boot' can be upwards of 6-9 months
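
To make that definition concrete, here is a minimal sketch of how I'd express 
the two thresholds.  The function name, the labels, and the choice of 15 ms 
(the upper end of the 10-15 ms range) are my own assumptions for illustration; 
nothing here comes from ntp itself.

```python
def classify_offset(offset_ms, target_ms=15.0, limit_ms=1000.0):
    """Classify a clock offset (in milliseconds, either sign).

    target_ms and limit_ms are assumed values taken from the rule of thumb
    above: within 10-15 ms is 'accurate', and more than 1 second is never OK.
    """
    magnitude = abs(offset_ms)
    if magnitude <= target_ms:
        return "accurate"
    if magnitude <= limit_ms:
        return "tolerable"
    return "unacceptable"
```

So an offset of -12 ms counts as accurate, 250 ms is tolerable only for a 
short while, and 1.5 seconds is unacceptable.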


Given the latencies involved in getting the system time accurate (similar to 
the 'few hours' predicament Bruce refers to above), it simply isn't acceptable 
to have systems wait 'a few hours' for time to stabilize before bringing their 
applications/databases online.  The end need is for a system to boot, get its 
time set accurately, or accurately enough (and quickly!), and then come online 
for production use.  ntp is expected to keep the time accurate (or accurate 
enough) from then on.
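
The boot sequence I have in mind can be sketched roughly as follows.  This is 
only an illustration of the gating logic, not how ntpd or ntp-wait actually 
work: read_offset_ms is a hypothetical callable that returns the current 
offset in milliseconds (or None if it isn't known yet), and the tolerance, 
timeout, and polling interval are assumed values.

```python
import time


def wait_for_accurate_time(read_offset_ms, tolerance_ms=15.0, timeout_s=300.0,
                           poll_s=2.0, clock=time.monotonic, sleep=time.sleep):
    """Return True once the offset is within tolerance, False on timeout.

    read_offset_ms is a hypothetical hook supplied by the caller.  The
    deadline uses a monotonic clock so that stepping the system time while
    we wait does not shorten or lengthen the timeout.
    """
    deadline = clock() + timeout_s
    while True:
        offset = read_offset_ms()
        if offset is not None and abs(offset) <= tolerance_ms:
            return True   # accurate enough: bring applications online
        if clock() >= deadline:
            return False  # gave up: the caller decides what to do next
        sleep(poll_s)
```

The point of the timeout is that the system never waits indefinitely; whether 
it then comes up anyway or raises an alarm is a local policy decision.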

We've struggled with this issue numerous times, especially when a lack of time 
accuracy has bitten us.  ntp is something I'd like to better understand, if 
not master.  However...

Borrowing a relevant post from a recent, though different, thread:
>> From: Richard B. Gilbert <[email protected]>
>> Sent: Tuesday, November 29, 2011 5:26 PM
>> Subject: Re: [ntp:questions] Ginormous offset and slow convergence
>> 
>> On 11/29/2011 1:42 PM, Pete Ashdown wrote:
>>>  Is there anything I can do to decrease the convergence time?
>> 
>> Little or nothing!  NTPD can, and sometimes does, take ten hours to reach 
>> "steady state".  It needs about thirty minutes to find a 
>> reasonable facsimile of the correct time.  For the next nine hours and 
>> thirty minutes, it will refine that value until it's as good as it's 
>> going to get.

An IT tech in my world would be fired, probably on the spot, for giving the 
above answer as justification for why getting a system operational takes so 
long (and we don't have nearly enough people as it is).

I am certainly not an ntp guru; most of us where I work know a little bit, and 
some don't know that much (about ntp  :) ).  I've tried to keep up with the ntp 
website and mailing list in order to learn something.  I work on NTP when I 
have to, as one of many duties.  The essence of what Pete asked is really 
relevant.  I've learned enough to see that there is truth behind what 
Mr. Gilbert said, but not enough to know why, much less what to practically do 
about it.  There is a compelling desire to have ntp get the time set 
accurately and promptly, with little to no fuss and without a steep learning 
curve.  I think I can handle ntpdate itself going away, as long as I have a 
reliable mechanism that I can learn, use well, and apply across numerous 
different environments with consistent results.

So, my intent isn't to fan flames; I'm characterizing one situation for 
further consideration, documentation and/or exploration.

R,
-Joe Wulf
_______________________________________________
questions mailing list
[email protected]
http://lists.ntp.org/listinfo/questions
