Ted Beatie wrote:
Tom, thank you for the in-depth reply.


If ntpd finds an offset of more than 1000 seconds, it will
terminate itself unless "-g" is present on the ntpd command
line. In that case, it will make one such adjustment and will
terminate itself if a second such adjustment is required.


I'm confused then;

  server-04:~# ps aux|grep ntpd
  root     389  0.0  0.1  2328 2320 ?      SL   Jun13   0:05 /usr/sbin/ntpd -g

This instance of ntpd has been running since Jun13.  Shouldn't it have
terminated by now?  (not that I want it to terminate..  what I want is
for it to say, "ok upstream server, I think you're on crack, but I'm
going to believe you anyway.")


A look at the code seems to indicate that ntpd exits when the offset exceeds the panic threshold (unless -g is used), but only at the point where it actually sets the clock. If the clock never gets set, the condition is never triggered and ntpd keeps running. I think your ntpd never synchronized.
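
To make that reading concrete, here is a minimal sketch of the decision as described in this thread. It is a model of the behaviour discussed above, not ntpd's actual source; the function name, its parameters, and the return strings are made up for illustration.

```python
# Model of the panic-threshold behaviour as described in this thread.
# This is NOT ntpd's code; all names here are hypothetical.

PANIC_THRESHOLD_S = 1000.0  # the 1000-second limit quoted above


def panic_decision(offset_s, g_flag, big_steps_done, about_to_set_clock):
    """Return 'no-op', 'adjust', or 'exit' for one clock update.

    offset_s            measured offset in seconds
    g_flag              True if ntpd was started with -g
    big_steps_done      number of over-threshold adjustments already made
    about_to_set_clock  False if ntpd never got far enough to set the clock
    """
    if not about_to_set_clock:
        # Never synchronized: the check is never reached, ntpd keeps running.
        return "no-op"
    if abs(offset_s) <= PANIC_THRESHOLD_S:
        return "adjust"
    if g_flag and big_steps_done == 0:
        # -g permits one over-threshold adjustment.
        return "adjust"
    return "exit"


if __name__ == "__main__":
    # A daemon that never synchronizes never hits the exit path.
    print(panic_decision(5000.0, True, 0, about_to_set_clock=False))
```

Under this model, a long-running "ntpd -g" with a huge offset is consistent with a daemon that has never selected a server to synchronize to.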


Also make sure that each ntpd instance has at least 4
reliable, consistent, and _working_ lower-stratum servers
configured before you even start ntpd for the first time.


What if we can't guarantee that?  The docs certainly seem to say that
the more the better, but that only one is actually required.  Some of
our deployments have public internet access, in which case, we can
populate the ntp.conf file on the gateway machines with as many servers
as we'd like.  But a fair number of our deployments don't have outbound
internet access, and have only one internal NTP server, if even that.
And yet we still want, at a minimum, for all of our machines to be in sync.


You can guarantee that if you have control of your own machines.


In any case, you should make no judgments about whether
ntpd is working properly or not until it has been running
for several hours, sometimes 2 days or more on a previously
uninitialized system.


This is also a problem.  Given our situation, the gateways and servers
all get powered on at roughly the same time.  Ideally, what we would
like is for the servers to sync up to the gateways, no matter what they
think of the accuracy of them, just so that they are all in sync.  Then,
if the gateways themselves get more reliable information, from internal
or external upstream servers, the whole system adjusts accordingly.


Ted, I think that you should reverse the design of your NTP server layout.

Start with your own systems that you have total control over and set them up to point to stratum 1 servers that you own. Setting up your own refclock is fairly straightforward. If you cannot do that for some reason then set them up to point to a bunch of internet stratum 2 servers and get them synchronized. Then set up your gateways and portals to point to these servers and make sure that these are synchronized to your servers. This shouldn't be too hard. Then have the storage servers point to your gateways. Don't bother with the customers' own NTP servers since you can't depend on what they are or how they're set up.
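
As a rough illustration of that layering, the sketch below renders the "server" lines for each tier's ntp.conf. Every hostname is a placeholder I made up, and the tier names are only labels for this example; "server" and "iburst" are the usual ntp.conf keywords.

```python
# Sketch of the three-tier layout: each tier lists only the tier above it.
# All hostnames are hypothetical placeholders.

TIERS = {
    # Your own systems, pointing at stratum 1 servers that you own.
    "internal": ["refclock1.example.net", "refclock2.example.net"],
    # Gateways and portals, pointing at your own internal servers.
    "gateway": ["ntp1.example.net", "ntp2.example.net"],
    # Storage servers, pointing at the gateways.
    "storage": ["gw1.example.net", "gw2.example.net"],
}


def render_ntp_conf(upstreams):
    """Return the 'server' lines of an ntp.conf for the given upstream hosts."""
    lines = [f"server {host} iburst" for host in upstreams]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for tier, upstreams in TIERS.items():
        print(f"# ntp.conf fragment for tier: {tier}")
        print(render_ntp_conf(upstreams))
```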

Do all this one step at a time and make sure that each piece is working before you take the next step.
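
One way to check each step is to look for the peer that ntpq marks with a leading "*" in its peer listing, which is the server ntpd has currently selected to synchronize to. The sketch below parses text in the shape of "ntpq -pn" output; the sample listing and its addresses are invented for illustration, and the function name is hypothetical.

```python
# Sketch: find the selected sync peer in "ntpq -pn"-style output.
# SAMPLE is made-up data in the usual column layout, not a real capture.

SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*192.0.2.10      .GPS.            1 u   33   64  377    0.412    0.018   0.006
+192.0.2.11      .GPS.            1 u   40   64  377    0.398   -0.022   0.009
"""


def selected_peer(ntpq_output):
    """Return the address on the '*' line, or None if nothing is selected."""
    for line in ntpq_output.splitlines():
        if line.startswith("*"):
            # Drop the tally character and keep the first column.
            return line[1:].split()[0]
    return None


if __name__ == "__main__":
    peer = selected_peer(SAMPLE)
    print("synchronized to", peer if peer else "nothing yet")
```

If this returns None on a tier after ntpd has had time to settle, fix that tier before pointing the next one at it.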

Danny
                  --ted

--
Ted Beatie                         Permabit, Inc.             [EMAIL PROTECTED]
Sr. Systems Engineer       One Kendall Sq, Cambridge, MA       +1-617-995-9317

_______________________________________________
questions mailing list
[email protected]
https://lists.ntp.isc.org/mailman/listinfo/questions

