David,
On closer examination of the code, the scenario I suggested in my
previous message is not possible. In other words, there is no possibility
that the reach register is zero and the tally code is other than space.
The reason is that the select algorithm, which determines the system
peer and lights the tally codes, is called not only when a new update is
received from any server, but also after four poll intervals in which no
samples have been received from a server. As a result, not only does the
indicated dispersion increase rapidly, which would greatly reduce the
source's chances of becoming the system peer if other sources were
present, but the race condition between the time a poll is sent and the
time the next update is received is also avoided.
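The argument above can be sketched as a toy model in Python. This is
only an illustration under stated assumptions, not the ntpd code: the
names Source, select, poll and run_silent are hypothetical, the reach
register is modelled as an 8-bit shift register, "no samples in four
poll intervals" is modelled as the low four bits of reach being zero,
and the selection rule is reduced to "blank if unreachable". The
dispersion increase is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Source:
    reach: int = 0xFF  # 8-bit reachability shift register (assumed model)
    tally: str = "*"   # "*" = system peer, " " = not selected


def select(src: Source) -> None:
    """Toy selection: an unreachable source always gets a blank tally."""
    src.tally = "*" if src.reach != 0 else " "


def poll(src: Source, got_update: bool, rerun_on_silence: bool = True) -> None:
    """Advance one poll interval and rerun selection when appropriate."""
    src.reach = (src.reach << 1) & 0xFF
    if got_update:
        src.reach |= 1
        select(src)  # rerun on every new update
    elif rerun_on_silence and (src.reach & 0x0F) == 0:
        select(src)  # rerun after four poll intervals with no sample


def run_silent(polls: int, rerun_on_silence: bool) -> Source:
    """Poll a source that never answers, for the given number of intervals."""
    src = Source()
    for _ in range(polls):
        poll(src, got_update=False, rerun_on_silence=rerun_on_silence)
    return src
```

In this model, run_silent(8, True) ends with reach zero and a blank
tally, while run_silent(8, False), which reruns selection only on
updates, ends with reach zero and a stale "*". The second case is the
scenario that the extra call rules out.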
The sysadmins of the world have had almost thirty years to develop uses
for the monitoring facilities first designed by Dennis Fergusson circa
1983, which have seen only minor changes since then. When I implemented
the tally codes circa 1992, the intent was that the sysadmin would need
only the pe command and the tally codes to assess general health, and
the rv command only as a diagnostic aid.
Dave
David Woolley wrote:
David L. Mills wrote:
Miroslav,
You might be confusing the server role with the client role. The
server has one or more upstream sources and downstream clients. The
tally code for each source is displayed by the pe command separately
at the server and the client. Each time an update is received from a
source at either the server or the client the tally codes for all
sources are redetermined. If a source is considered invalid or
unreachable, or its maximum error statistic exceeds the select
threshold, the tally indicator surely will be blank. If a source is
marked as the system peer, it surely is valid and reachable.
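The rule stated above can be written down as a small Python sketch. It
is a simplification under stated assumptions, not the ntpd
implementation: the Association type and the tally function are
hypothetical, only the "*" (system peer) and "+" (candidate) codes are
modelled, and the other codes ntpq can display are ignored.

```python
from dataclasses import dataclass


@dataclass
class Association:
    valid: bool                # source passed the sanity checks
    reach: int                 # reach register; zero means unreachable
    max_error: float           # maximum error statistic, in seconds
    system_peer: bool = False  # chosen as system peer by selection


def tally(a: Association, threshold: float) -> str:
    """Return the tally character for one source (toy model)."""
    if not a.valid or a.reach == 0 or a.max_error > threshold:
        return " "  # invalid, unreachable, or error above the threshold
    return "*" if a.system_peer else "+"
```

With this model a source can show "*" only when it is valid, reachable
and within the threshold, which is the second half of the statement
above.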
This is not the behaviour that the person who started the thread is
complaining about. He is complaining that the system peer and
selected markers are not cleared on the server when it loses
reachability to the respective upstream servers. My previous article
was on the basis that you were not challenging that aspect of his report.
In the real world, most administrators judge whether a server is
synchronized by doing ntpq peers and looking for these flags, not by
doing a client request and looking at the error statistics. In fact,
relatively few people realise that you need to use rv on the
associations to properly diagnose a failure to select.
In the case you present the server has lost all sources, but remains
a viable choice even beyond that, as long as the maximum error does
not exceed the select threshold. The user can set this to whatever
value is appropriate, with default 1.5 s. The point I emphasize is
that the server, even if it has lost all sources, remains conformant
to the formal specification. Thus, the time provider does not judge
the quality which the receiver requires; this is specified by the
receiver.
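The receiver-side decision described above can be sketched in Python.
The sketch rests on assumptions that are not in this message: the
maximum error is taken as half the root delay plus the root
dispersion, and the error of a server that has lost its sources is
assumed to grow at a fixed rate of 15 ppm. The function names are
hypothetical; only the 1.5 s default comes from the text.

```python
PHI = 15e-6  # assumed dispersion growth rate, seconds per second


def max_error(root_delay: float, root_dispersion: float) -> float:
    """Maximum error in seconds (assumed: half the delay plus dispersion)."""
    return root_delay / 2.0 + root_dispersion


def acceptable(root_delay: float, root_dispersion: float,
               threshold: float = 1.5) -> bool:
    """The receiver, not the server, decides whether the error is tolerable."""
    return max_error(root_delay, root_dispersion) <= threshold


def seconds_until_rejected(root_delay: float, root_dispersion: float,
                           threshold: float = 1.5, phi: float = PHI) -> float:
    """Time until a server with no sources exceeds the receiver's threshold."""
    remaining = threshold - max_error(root_delay, root_dispersion)
    return max(0.0, remaining / phi)
```

Under these assumptions, a server that starts with a small error stays
selectable for about a day at the default threshold, and a receiver
that needs tighter bounds can simply pass a smaller threshold.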
_______________________________________________
questions mailing list
[email protected]
http://lists.ntp.org/listinfo/questions