server <one or more servers, external or internal>
server <one or more other gateways, using the back-end addresses>
Add iburst to the end of each server line. This speeds up synchronization.
To all of the server lines, or just the internal-to-our-system servers?
server <two or more gateways, using the back-end addresses>
Three servers are an absolute minimum: with only two, ntpd has no way of
knowing which one is providing better information. Let's leave aside the
question of what "better" means; it's a very complicated subject.
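
The "two can't break a tie" point can be illustrated with a small sketch.
This is not ntpd's actual selection algorithm, only a simplified majority
check over offset intervals; the function names and the sample offsets are
hypothetical.

```python
def largest_agreeing_group(intervals):
    """Return the largest number of sources whose intervals share a point."""
    # Sweep over interval endpoints. Opening edges sort before closing edges
    # at the same coordinate, so touching intervals count as overlapping.
    events = []
    for low, high in intervals:
        events.append((low, 0))   # 0 = interval opens
        events.append((high, 1))  # 1 = interval closes
    events.sort()
    best = current = 0
    for _, kind in events:
        if kind == 0:
            current += 1
            best = max(best, current)
        else:
            current -= 1
    return best


def has_majority(intervals):
    """True when strictly more than half of the sources agree."""
    return 2 * largest_agreeing_group(intervals) > len(intervals)


# Hypothetical offsets in seconds, each with +/- 0.05 s of uncertainty.
two_sources = [(-0.05, 0.05), (2.45, 2.55)]
three_sources = [(-0.05, 0.05), (-0.03, 0.07), (2.45, 2.55)]

print(has_majority(two_sources))    # False: the two disagree, no tie-breaker
print(has_majority(three_sources))  # True: two agree and outvote the third
```

A single source trivially passes this check, so one server can still be
followed; it just cannot be cross-checked against anything.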
As I mentioned to Tom, what if we can't guarantee that? As near as I
can tell, while more is better, the only actual requirement is for one
server. In some cases, we're lucky if we get even one, so we either
need to believe that one, or we need to set the time manually.
Based on the above, the internal NTP server has a stratum of 2 and will
almost always be used over a stratum of 4. Is that internal NTP server
getting its data from a stratum 1 server and is it internal or external?
It is internal, and looks like it gets its time from other internal machines:
portal-01:~# ntptrace -n
127.0.0.1: stratum 3, offset 0.000006, synch distance 15.20248
10.16.4.1: stratum 2, offset -2.558634, synch distance 1.00000
10.16.4.100: stratum 2, offset -2.571121, synch distance 1.00000
10.16.100.2: stratum 2, offset -2.520537, synch distance 0.04373
132.163.4.101: *Timeout*
Because you obfuscated the addresses, it's hard to know if you've also
removed the tally codes, which indicate what gateway1 thinks of the servers.
Since you are using the private address space for this it really doesn't
matter if they're seen. If you don't want to show the names, just add a
-n and it won't translate the IP addresses.
As I mentioned in the post, the tally codes were spaces.
portal-01:~# ntpq -nc pe localhost
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 10.16.4.1       10.16.4.100      2 u   40   64  377    0.280  -2558.0   4.447
 10.123.123.2    10.123.123.1     4 u  810 1024  377    0.172  -1849.0   2.014
 10.123.123.3    0.0.0.0         16 u  679 1024    0    0.000    0.000 4000.00
This only has two servers and you need at least 3. As it is, gateway1
and gateway2 are at two different stratum levels. However, you need to
fix the problem on the gateways first.
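
For reading the peer listing above, a minimal Python sketch follows. It
assumes numeric output (ntpq -n), the usual ten-column billboard, an octal
reach register, and offsets in milliseconds; parse_peer_line and the field
names are hypothetical helpers, not part of any NTP tool.

```python
TALLY_CHARS = set(" *+-#ox.")

# The three data rows from the listing above (tally column was blank).
SAMPLE_ROWS = [
    "10.16.4.1 10.16.4.100 2 u 40 64 377 0.280 -2558.0 4.447",
    "10.123.123.2 10.123.123.1 4 u 810 1024 377 0.172 -1849.0 2.014",
    "10.123.123.3 0.0.0.0 16 u 679 1024 0 0.000 0.000 4000.00",
]


def parse_peer_line(line):
    """Split one data row of the peer billboard into a dict."""
    tally = " "
    # Assumes numeric addresses (-n), so a leading tally character can never
    # be confused with the first character of the remote address.
    if line and line[0] in TALLY_CHARS:
        tally, line = line[0], line[1:]
    fields = line.split()
    reach = int(fields[6], 8)  # reach is printed in octal
    return {
        "tally": tally,
        "remote": fields[0],
        "stratum": int(fields[2]),
        "polls_answered": bin(reach).count("1"),  # out of the last 8 polls
        "offset_s": float(fields[8]) / 1000.0,    # milliseconds to seconds
    }


peers = [parse_peer_line(row) for row in SAMPLE_ROWS]
selected = [p for p in peers if p["tally"] == "*"]

for p in peers:
    print(p["remote"], "stratum", p["stratum"],
          "answered", p["polls_answered"], "of 8,",
          "offset", p["offset_s"], "s")
print("system peer:", selected[0]["remote"] if selected else "none")
```

On this data, two peers answered all of their last eight polls (reach 377)
yet neither is marked as system peer, and both offsets are well over a
second, so the problem is selection rather than reachability.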
Despite the spec, that seems to be a consistent interpretation. If
everything internal is fully meshed, and there is only one external time
source, will everything sync up to that external source, no matter the skew?
Looking at the debugging techniques, seeing that the tally code is
a space, and delving deeper, I see:
gateway1:~# ntpq -c as localhost
ind assID status  conf reach auth condition  last_event cnt
===========================================================
  1 47900  9014   yes   yes  none    reject   reachable  1
  2 47901  9014   yes   yes  none    reject   reachable  1
  3 47902  8000   yes   yes  none    reject
storage-node2:~# ntpq -c as localhost
ind assID status  conf reach auth condition  last_event cnt
===========================================================
  1 16076  9064   yes   yes  none    reject   reachable  6
  2 16077  9064   yes   yes  none    reject   reachable  6
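
The status column in these listings is a hex word, and the other columns
are derived from it. The sketch below decodes it, assuming the peer status
word layout described in the ntpq documentation (five flag bits, three
selection bits, a four-bit event count, a four-bit event code). The name
tables are deliberately partial because the labels differ between ntpd
versions, and decode_peer_status is a hypothetical helper.

```python
# Partial name tables; other codes exist and their labels vary by version.
SELECT_NAMES = {0: "reject", 6: "sys.peer"}
EVENT_NAMES = {4: "reachable"}


def decode_peer_status(word_hex):
    """Break a 16-bit peer status word (hex string) into its fields."""
    word = int(word_hex, 16)
    select = (word >> 8) & 0x7
    event_code = word & 0xF
    return {
        "configured": bool(word & 0x8000),
        "reachable": bool(word & 0x1000),
        "select": SELECT_NAMES.get(select, str(select)),
        "event_count": (word >> 4) & 0xF,
        "last_event": EVENT_NAMES.get(event_code, str(event_code)),
    }


# Status words taken from the two listings above.
for status in ("9014", "9064", "8000"):
    print(status, decode_peer_status(status))
```

Under this layout 9014 and 9064 are configured, reachable, rejected peers
whose last event was "reachable", seen 1 and 6 times respectively, which
agrees with the condition, last_event and cnt columns. For 8000 the event
fields are zero, matching the blank columns; the reach flag also decodes as
clear there, so check that row against your ntpq version's documentation.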
Usually you will see these kinds of results when the server you are
looking at has just started. You really need to give it time to synchronize.
Not in this case;
portal-01:~# ps aux|grep ntp;for i in 2 51 52 53 54; do ssh -1 10.123.123.$i ps aux; done | grep ntp
root 11283 0.0 0.1 2328 2320 ? SL Sep30 0:05 /usr/sbin/ntpd
root 17856 0.0 0.1 2328 2320 ? SL Sep30 0:04 /usr/sbin/ntpd
root 382 0.0 0.1 2328 2320 ? SL Jun13 0:04 /usr/sbin/ntpd -g
root 382 0.0 0.1 2328 2320 ? SL Jun13 0:04 /usr/sbin/ntpd -g
root 383 0.0 0.1 2328 2320 ? SL Jun13 0:04 /usr/sbin/ntpd -g
root 389 0.0 0.1 2328 2320 ? SL Jun13 0:05 /usr/sbin/ntpd -g
(the Sep30 processes are on the two gateways, the Jun13 processes are on
the servers. I had recently manually stopped ntpd, resync'd the times,
and restarted ntpd on the gateways)
This appears to indicate it received just one packet, which is not enough
to synchronize anything. How long after the server was started did you
wait before interrogating it? You need to wait at least 15-20 minutes
when you don't use iburst.
How long would it take with iburst set? How can we deal with the fact
that the gateways and servers all generally come up at the same time?
--ted
--
Ted Beatie Permabit, Inc. [EMAIL PROTECTED]
Sr. Systems Engineer One Kendall Sq, Cambridge, MA +1-617-995-9317