Joseph Harvell wrote:
Richard B. Gilbert wrote:
How about designing your NTP subnet in such a way as to prevent these failures
in the first place?
Since you say, elsewhere, that you are more concerned that time be strictly
monotonically increasing than that it be accurate, perhaps you don't need NTP at
all; set your local clock from your wrist watch once a week while the
application is not running.
Your original problem, IIRC, resulted from an extremely poor design of your NTP
subnet; two servers each serving its unsynchronized local clock and drifting
apart.
If you really do need NTP the easiest configuration is for your client to use
from four to seven servers. Those servers should be stratum 2 internet servers
(rules of engagement prohibit use of public stratum 1 servers unless you are
serving 100 or more clients). This requires that you study the list of public
stratum two servers at http://ntp.isc.org/bin/view/Servers/StratumTwoTimeServers
to find four to seven servers within, say, 300 miles of your site and add
these servers to your ntp.conf file. It also requires a connection to the
internet that allows port 123 in both directions. If you specify the numeric
IP address of each server, you need not open any other port in the firewall.
If you wish to use domain names, then you will have to open the port(s)
necessary to allow DNS to work (don't know which ones offhand).
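
As a rough illustration of the client configuration described above, an
ntp.conf along these lines might be a starting point. The addresses are
documentation placeholders, not real servers, and the driftfile path and the
iburst option are assumptions; substitute the numeric addresses of the
stratum 2 servers you actually pick from the list.

```
# ntp.conf sketch: four stratum 2 servers given by numeric address,
# so only NTP (port 123) has to pass the firewall.
# All addresses below are placeholders from the documentation ranges.
driftfile /var/lib/ntp/ntp.drift   # assumed path; varies by system

server 192.0.2.10    iburst
server 192.0.2.20    iburst
server 198.51.100.30 iburst
server 203.0.113.40  iburst
```

If host names are used instead of addresses, DNS normally needs port 53 open
as well.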
The simplest configuration is to make the machine running the application a
stratum 1 server by installing ntpd and a GPS timing receiver as a hardware
reference clock. The weakness of this configuration is that the GPS receiver
becomes a single point of failure; if it dies, you rapidly lose any claim to
accuracy. Since you don't insist on accuracy, perhaps this would not be a
problem. Actually, ntpd would continue to discipline the clock using the last
known frequency correction, so you would have several hours of "hold over"
before your clock drifted significantly (assuming a controlled temperature in
your data center).
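
To put rough numbers on the "hold over" idea, the sketch below computes how
far a clock wanders if the last frequency correction is off by a given
residual error. The residual values (0.1, 1 and 10 ppm) and the six hour
window are purely hypothetical; the real figure depends on the oscillator
and on how stable the temperature is.

```python
def holdover_drift_ms(residual_ppm: float, hours: float) -> float:
    """Offset in milliseconds accumulated by a clock whose frequency
    is wrong by residual_ppm parts per million for the given time."""
    seconds = hours * 3600.0
    return residual_ppm * 1e-6 * seconds * 1000.0


if __name__ == "__main__":
    # Hypothetical residual frequency errors after the reference is lost.
    for ppm in (0.1, 1.0, 10.0):
        drift = holdover_drift_ms(ppm, 6)
        print(f"{ppm:5.1f} ppm residual error: {drift:8.2f} ms after 6 h")
```

A residual error of 1 ppm works out to 3.6 ms per hour, so whether "several
hours" is acceptable depends on how much offset the application can tolerate.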
You can increase the reliability by using four GPS timing receivers to
synchronize four NTP servers and configuring your client to use those four
servers.
Richard:
I really appreciate the advice. I think you are getting the wrong idea
about my approach to handling the problem since I don't seem concerned
about the glaring problems in my configuration. The reason for this is
that the original problem manifested in a testbed for one of our products. I
am concurrently tracking this down internally to determine whether the
two servers are actually synced to a stratum 1 clock (or whether they
are part of the same synchronization subnet at all). And I plan to
correct the problem.
Also, I completely agree that we should configure 4+ peers for each NTP
client to avoid this failure scenario altogether.
But keep in mind that it may not be practical for our customers to have
4+ NTP servers in their synchronization subnet. And arguably, they
deserve what they get if they fail to follow our recommendation to have
more servers.
Nevertheless, I am still very interested in preventing step corrections
in these scenarios. And I think this is a legitimate concern. So I
would really appreciate it if you could also address the questions in my
post.
Thanks
---
Joe Harvell
I lack the expertise to answer your question as now put! I've never
done such a thing or needed to.
The tinker keyword is, IMHO, well named! My understanding is that it is
intended for "tinkering" rather than for production use. It lets you
experiment without having to modify the code and rebuild each time.
It's, AFAIK, unsupported; if NTP malfunctions while you are using tinker
and you report it, the reply is likely to be "then don't do that!"
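
For anyone who does want to experiment, the fragment below is a sketch of
what such tinkering might look like, based on my reading of the ntpd
documentation rather than on experience. As documented, a step threshold of
zero means step adjustments never occur, and it also disables the kernel
time discipline. Treat it as a testbed experiment, not a recommendation.

```
# ntp.conf fragment, experimental use only.
# "tinker step" sets the step threshold in seconds (default 0.128).
# A value of 0 is documented to disable step corrections entirely,
# so large offsets are slewed out slowly instead.
tinker step 0
```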
IF your customers use NTP, it's THEIR responsibility to design and
operate their synchronization subnet properly. It's YOUR responsibility
to warn your customers that horrible things may happen if time is
stepped while your application is running.
Also note that ntpd is not the only method of managing computer clocks;
there is SNTP, "Open" NTP, the "daytime" protocol, rdate (Unix/Linux),
etc, etc. Some people use ntpdate in a cron job and that WILL, by
default, step the time. NTP is probably the best if you need/want
accurate time but the other means hang on for various reasons.
_______________________________________________
questions mailing list
[email protected]
https://lists.ntp.isc.org/mailman/listinfo/questions