Re: [ntp:questions] How should an NTP server fail?

David L. Mills Wed, 09 Jun 2010 21:07:34 -0700

David,

On closer examination of the code, the scenario I suggested in my previous message is not possible. In other words, there is no possibility that the reach register is zero and the tally code is other than space. The reason for this is that the select algorithm that determines the system peer and lights the tally codes is called not only when a new update is received from any server, but also after four poll intervals when no samples have been received from a server. This means not only that the indicated dispersion increases rapidly, which would greatly reduce its chances of becoming the system peer if other sources were present, but also that it prevents the race condition between the time a poll is sent and the time the next update is received.
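
As an illustration only, the behaviour described above can be modelled with a toy sketch. This is not the ntpd source: the names Source, select, poll_sent and update_received, the missed counter, and the rule "first reachable source becomes system peer" are all assumptions made for the example. The 8-bit reach shift register and the rule that selection is rerun after four polls with no sample are the only parts taken from the description.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    reach: int = 0    # 8-bit reachability shift register
    missed: int = 0   # consecutive unanswered polls (hypothetical counter)
    tally: str = " "  # tally code as shown by the pe command


def select(sources):
    """Toy stand-in for the select algorithm (assumption, not ntpd's logic)."""
    peer_chosen = False
    for s in sources:
        if s.reach == 0:
            s.tally = " "          # unreachable: tally is blank
        elif not peer_chosen:
            s.tally = "*"          # first reachable source: system peer
            peer_chosen = True
        else:
            s.tally = "+"          # other reachable sources: candidates


def poll_sent(source, sources):
    """A poll goes out: shift the reach register left by one bit."""
    source.reach = (source.reach << 1) & 0xFF
    source.missed += 1
    # Selection is rerun once four polls have passed with no sample.
    if source.missed >= 4:
        select(sources)


def update_received(source, sources):
    """A sample arrives: set the low bit and rerun selection."""
    source.reach |= 1
    source.missed = 0
    select(sources)
```

In this model, a source that stops answering keeps its tally code for a few polls, but by the time its reach register has shifted down to zero the selection has already been rerun, so a zero reach register never appears alongside a non-blank tally code.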

The sysadmins of the world have had almost thirty years to develop uses for the monitoring facilities first designed by Dennis Fergusson circa 1983, with only minor changes since then. When I implemented the tally codes circa 1992, the intent was that the sysadmin needs only the pe command and the tally codes to assess the general health, and the rv command only as a diagnostic aid.


Dave

David Woolley wrote:

David L. Mills wrote:
Miroslav,
You might be confusing the server role with the client role. The server has one or more upstream sources and downstream clients. The tally code for each source is displayed by the pe command separately at the server and the client. Each time an update is received from a source at either the server or the client, the tally codes for all sources are redetermined. If a source is considered invalid or unreachable, or the maximum error statistic exceeds the select threshold, the tally indicator surely will be blank. If a source is marked as the system peer, it surely is valid and reachable.
This is not the behaviour that the person who started the thread is complaining about. He is complaining that the system peer and selected markers are not cleared on the server when it loses reachability to the respective upstream servers. My previous article was on the basis that you were not challenging that aspect of his report.
In the real world, most administrators judge whether a server is synchronized by doing ntpq peers and looking for these flags, not by doing a client request and looking at the error statistics. In fact, relatively few people realise that you need to use rv on the associations to properly diagnose a failure to select.
In the case you present the server has lost all sources, but remains a viable choice even beyond that, as long as the maximum error does not exceed the select threshold. The user can set this to whatever value is appropriate, with default 1.5 s. The point I emphasize is that the server, even if it has lost all sources, remains conformant to the formal specification. Thus, the time provider does not judge the quality which the receiver requires; this is specified by the receiver.
_______________________________________________
questions mailing list
[email protected]
http://lists.ntp.org/listinfo/questions


