[ntp:questions] Re: NTP clients not syncing up to servers?

Richard B. Gilbert Tue, 11 Oct 2005 19:30:07 -0700

Ted Beatie wrote:

  server <one or more servers, external or internal>
  server <one or more other gateways, using the back-end addresses>

Add iburst to the end of each server line. This speeds up synchronization.


To all of the server lines, or just the internal-to-our-system servers?

  server <two or more gateways, using the back-end addresses>
Three servers are an absolute minimum because 2 means it has no way of knowing which is providing better information. Let's leave aside the question of the meaning of the word "better"; it's a very complicated subject.


As I mentioned to Tom, what if we can't guarantee that?  As near as I
can tell, whereas more is better, the only actual requirement is for one
server.  In some cases, we're lucky if we get even one, so we either
need to believe that one, or we need to set the time manually.

Based on the above, the internal NTP server has a stratum of 2 and will almost always be used over a stratum of 4. Is that internal NTP server getting its data from a stratum 1 server, and is it internal or external?


It is internal, and looks like it gets its time from other internal machines;

portal-01:~# ntptrace -n
127.0.0.1: stratum 3, offset 0.000006, synch distance 15.20248
10.16.4.1: stratum 2, offset -2.558634, synch distance 1.00000
10.16.4.100: stratum 2, offset -2.571121, synch distance 1.00000
10.16.100.2: stratum 2, offset -2.520537, synch distance 0.04373
132.163.4.101:  *Timeout*

By obfuscating the addresses it's hard to know if you've also removed the Tally Codes, which indicate what gateway1 thinks of the servers. Since you are using the private address space for this, it really doesn't matter if they're seen. If you don't want to show the names, just add a -n and it won't translate the IP addresses.


As I mentioned in the post, the tally codes were spaces.

portal-01:~# ntpq -nc pe localhost
    remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
10.16.4.1       10.16.4.100      2 u   40   64  377    0.280  -2558.0   4.447
10.123.123.2    10.123.123.1     4 u  810 1024  377    0.172  -1849.0   2.014
10.123.123.3    0.0.0.0         16 u  679 1024    0    0.000    0.000 4000.00
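
A minimal sketch of reading the billboard above mechanically, assuming the usual ntpq peers layout: an optional one-character tally code in the first column, then ten whitespace-separated fields. The names Peer, parse_peers and answered_polls are made up for illustration. The reach column is an 8-bit shift register printed in octal, so 377 means the last eight polls were all answered and 0 means none were.

```python
from typing import List, NamedTuple

# Tally characters ntpq can print in column one; a space (or nothing,
# as in the paste above) means the peer has no tally code.
TALLY_MARKS = "*+-#xo."


class Peer(NamedTuple):
    tally: str
    remote: str
    refid: str
    stratum: int
    reach: int        # 8-bit shift register, already converted from octal
    offset_ms: float  # ntpq prints offset in milliseconds


def parse_peers(billboard: str) -> List[Peer]:
    """Parse the rows of an `ntpq -p` / `ntpq -c pe` billboard."""
    peers = []
    for line in billboard.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(("remote", "=")):
            continue  # blank line, column header, or ruler
        # Assumption: numeric (-n) remotes never begin with a tally mark.
        if line[0] in TALLY_MARKS:
            tally, fields = line[0], line[1:].split()
        else:
            tally, fields = " ", line.split()
        if len(fields) < 10:
            continue  # not a peer row
        peers.append(Peer(tally, fields[0], fields[1], int(fields[2]),
                          int(fields[6], 8), float(fields[8])))
    return peers


def answered_polls(peer: Peer) -> int:
    """Count how many of the last eight polls got a reply."""
    return bin(peer.reach & 0xFF).count("1")


SAMPLE = """\
    remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
10.16.4.1       10.16.4.100      2 u   40   64  377    0.280  -2558.0   4.447
10.123.123.2    10.123.123.1     4 u  810 1024  377    0.172  -1849.0   2.014
10.123.123.3    0.0.0.0         16 u  679 1024    0    0.000    0.000 4000.00
"""

if __name__ == "__main__":
    for p in parse_peers(SAMPLE):
        print(repr(p.tally), p.remote, "stratum", p.stratum,
              "answered", answered_polls(p), "of 8, offset", p.offset_ms, "ms")
```

Run against the sample, every row comes back with a blank tally, which matches the observation that none of the peers has been selected.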

This only has two servers and you need at least 3. As it is, gateway1 and gateway2 are at two different stratum levels. However, you need to fix the problem first on the gateways.


Despite the spec, that seems to be a consistent interpretation.  If
everything internal is fully meshed, and there is only one external time
source, will everything sync up to that external source, no matter the skew?

Looking at the debugging techniques, and seeing that the tally code is
a space, and delving deeper, I see;

  gateway1:~# ntpq -c as localhost
  ind assID status  conf reach auth condition  last_event cnt
  ===========================================================
  1 47900  9014   yes   yes  none    reject   reachable  1
  2 47901  9014   yes   yes  none    reject   reachable  1
  3 47902  8000   yes   yes  none    reject

  storage-node2:~# ntpq -c as localhost
  ind assID status  conf reach auth condition  last_event cnt
  ===========================================================
  1 16076  9064   yes   yes  none    reject   reachable  6
  2 16077  9064   yes   yes  none    reject   reachable  6

Usually you will see these kinds of results when the server you are looking at has just started. You really need to give it time to synchronize.


Not in this case;

portal-01:~# ps aux|grep ntp;for i in 2 51 52 53 54; do ssh -1 10.123.123.$i ps aux; done | grep ntp
root   11283  0.0  0.1  2328 2320 ?        SL   Sep30   0:05 /usr/sbin/ntpd
root   17856  0.0  0.1  2328 2320 ?        SL   Sep30   0:04 /usr/sbin/ntpd
root     382  0.0  0.1  2328 2320 ?        SL   Jun13   0:04 /usr/sbin/ntpd -g
root     382  0.0  0.1  2328 2320 ?        SL   Jun13   0:04 /usr/sbin/ntpd -g
root     383  0.0  0.1  2328 2320 ?        SL   Jun13   0:04 /usr/sbin/ntpd -g
root     389  0.0  0.1  2328 2320 ?        SL   Jun13   0:05 /usr/sbin/ntpd -g

(the Sep30 processes are on the two gateways, the Jun13 processes are on
the servers.  I had recently manually stopped ntpd, resync'd the times,
and restarted ntpd on the gateways)

This appears to indicate it received just one packet, which is not enough to synchronize anything. How long did you wait after the server was started before interrogating it? You need to wait at least 15-20 minutes when you don't use iburst.
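
For reference, the hexadecimal status column in the "ntpq -c as" listings can be unpacked by hand. The sketch below assumes the peer status word layout described in the ntpq documentation (flag bits and a 3-bit selection code in the high byte, then a 4-bit event count and a 4-bit event code); the function name and dictionary keys are made up for illustration.

```python
def decode_peer_status(word: str) -> dict:
    """Split an ntpq association status word (hex string) into its fields."""
    value = int(word, 16)
    high = (value >> 8) & 0xFF
    return {
        "configured": bool(high & 0x80),   # association comes from ntp.conf
        "reachable": bool(high & 0x10),    # peer has answered recently
        "select": high & 0x07,             # 0 is shown as "reject" by ntpq
        "event_count": (value >> 4) & 0x0F,
        "event_code": value & 0x0F,        # 4 is shown as "reachable"
    }


if __name__ == "__main__":
    for status in ("9014", "9064"):
        print(status, decode_peer_status(status))
```

Under that layout, 9014 and 9064 both decode to configured, reachable, selection 0 (reject) and event code 4, with event counts of 1 and 6, which lines up with the conf, reach, condition, last_event and cnt columns in the listings.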


How long would it take with iburst set?  How can we deal with the fact
that the gateways and servers all generally come up at the same time?

            --ted

--
Ted Beatie                         Permabit, Inc.             [EMAIL PROTECTED]
Sr. Systems Engineer       One Kendall Sq, Cambridge, MA       +1-617-995-9317

_______________________________________________
questions mailing list
[email protected]
https://lists.ntp.isc.org/mailman/listinfo/questions

Like it or not, you have a dependency tree. Ntpd on the clients is not going to work until there is at least one server running and synchronized. The server is not going to synchronize with an external server until the network is up.

With iburst specified in the server statements, you can be synchronized in a couple of minutes. After the first five replies are received from the upstream server(s), ntpd has enough information to START synchronizing your clock. That's ten or twelve seconds after ntpd starts. If it's a "warm start" (you have a drift file) and the power has not been off for very long, synchronization can be very fast. If it's a cold start, e.g. you have no drift file and/or power has been off long enough for the internal temperature of the machine to change substantially, ntpd can bring your clock within twenty or thirty milliseconds in the first two or three minutes. Getting as good as it can get will require several hours.

You will need to bring up the network, your routers and switches, first. Then bring up your ntp server(s). Then start your clients. Yes, it's probably going to take five to ten minutes to get all the clocks in rough synchronization (within twenty or thirty milliseconds of the correct time) this way. Most sites minimize the problem by minimizing shutdowns. If the site is only used three times a month and powered down between uses, you are just going to have to be patient and wait for things to synch up well enough to satisfy you.
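
One way to enforce that ordering in a boot script is to poll the server until it reports a system peer (the "*" tally code) before starting the clients. What follows is only a sketch under stated assumptions: ntpq is on the PATH, the server answers ntpq queries from the machine running the script, and the function names, the ten-minute timeout and the fifteen-second interval are all illustrative choices.

```python
import subprocess
import time
from typing import Callable


def run_ntpq(host: str = "localhost") -> str:
    """Return the peers billboard from `ntpq -pn host` (empty on failure)."""
    result = subprocess.run(["ntpq", "-pn", host], capture_output=True,
                            text=True, timeout=10, check=False)
    return result.stdout


def has_system_peer(billboard: str) -> bool:
    """True if any row carries the '*' tally code, i.e. ntpd has chosen a peer."""
    return any(line.startswith("*") for line in billboard.splitlines())


def wait_for_sync(query: Callable[[], str] = run_ntpq,
                  timeout_s: float = 600.0,
                  interval_s: float = 15.0,
                  sleep: Callable[[float], None] = time.sleep,
                  clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll until the queried ntpd has a system peer or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if has_system_peer(query()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # 10.16.4.1 is one of the internal servers named earlier in this thread.
    if wait_for_sync(lambda: run_ntpq("10.16.4.1")):
        print("server is synchronized; safe to start the clients")
    else:
        print("server still unsynchronized after the timeout")
```

The query, sleep and clock arguments exist only so the loop can be exercised without a live ntpd; in a real script the defaults would normally do.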

