Hello and thank you for your reply.
Unfortunately, the solution does not appear to be so simple, and I could
still use some assistance troubleshooting this.

I've adjusted the config file, replacing the server's own address on
each network with that network's broadcast address.
I've also listed the NTP servers to be queried in an order such that, if
their shared primary time source becomes unavailable, each can sync from
the other over either of the networks connecting them.
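
The broadcast address of each network follows from its subnet and mask.
A minimal sketch of that derivation in Python, assuming /24
(255.255.255.0) masks as in the restrict lines of the configs further
down; the helper name broadcast_address is only for illustration:

```python
import ipaddress

# A few of the subnets from the restrict lines, written in CIDR form.
# All of them are assumed to be /24 (mask 255.255.255.0).
SUBNETS = ["172.27.100.0/24", "10.15.60.0/24", "10.15.90.0/24"]


def broadcast_address(cidr: str) -> str:
    """Return the directed broadcast address of a subnet given as CIDR."""
    return str(ipaddress.ip_network(cidr).broadcast_address)


for cidr in SUBNETS:
    print(cidr, "->", broadcast_address(cidr))
```

With a different mask the result changes, so the addresses on the
broadcast lines should be checked against each network's actual mask.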

I remain unable to make sense of this behavior.

Until now I've always relied on internet-provided sources.
Occasionally there's drift, which I hope to eliminate by shifting
queries to the most central of all servers.
Am I understanding correctly that, by adding the broadcast addresses to
the list, these 2 servers can be queried just as the stratum 1 time
source server out on the internet is?


-


For what it's worth, my intention is for all these networks to sync from
server A 'Bascule', with source redundancy provided by server B
'Planck'.
Server A will take its time from the external stratum 1 time source
first, or from server B as a fallback.
Server B will take its time from the external stratum 1 time source
first, or from server A as a fallback.
Both server A & B will have a cron job to update their hardware clock
every other month, interleaved.


-


Their connectivity status is currently as follows.

Server A can reach the internet through one of its networks, but has no
DNS.
Server B can reach the internet through one of its networks, and can
resolve DNS.
Neither can reach the other through the first network connecting them
(san.sss.local), due to switching issues.
Both can reach each other through the second network that connects them
directly without a switch (ssan.sss.local).

In this scenario I expect to retain synchronized time, because server B
can synchronize from the external stratum 1 source and provide that time
to server A through the second network that connects them, which remains
functional when the first fails.


-


1. In this example I attempt to sync the time of server A, 'Bascule'.

1a. It skips the primary time source server entirely, which is not
expected.

1b. It fails to resolve the second, which is expected since that network
is currently offline.

1c. It appears to resolve the third entry incorrectly, showing the IP
address of the primary server, which was skipped, instead of
Planck.ssan.sss.local's IP address as defined by the associated
/etc/hosts entry. This is not expected.


  Bascule:~# ntpq -p
       remote           refid      st t when poll reach   delay   offset  jitter
  ==============================================================================
   Planck.san.sss. .INIT.          16 u    - 1024    0    0.000    0.000   0.000
   Planck.ssan.sss 209.51.161.238   2 u  417 1024    0    0.000    0.000   0.000
   172.27.100.255  .BCST.          16 u    -   64    0    0.000    0.000   0.001
   172.27.101.255  .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.15.30.255    .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.15.60.255    .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.15.90.255    .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.15.120.255   .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.15.150.255   .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.15.180.255   .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.15.210.255   .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.90.20.255    .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.90.40.255    .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.90.60.255    .BCST.          16 u    -   64    0    0.000    0.000   0.001


-


2. Afterward I run a few pings to verify address resolution.

2a. The primary time source server is confirmed to be unresolvable,
which is expected since external DNS resolution is currently unavailable
to this machine.

  Bascule:~# ping clock.nyc.he.net
  ping: unknown host clock.nyc.he.net



2b. The secondary time server acting as the first backup time source is
resolved correctly by means of its entry in /etc/hosts, and a reply is
not received since that network is offline, which is expected.

  Bascule:~# ping Planck.san.sss.local
  PING Planck.san.sss.local (10.15.60.15) 56(84) bytes of data.
  From Bascule.san.sss.local (10.15.60.10) icmp_seq=1 Destination Host Unreachable
  From Bascule.san.sss.local (10.15.60.10) icmp_seq=2 Destination Host Unreachable
  From Bascule.san.sss.local (10.15.60.10) icmp_seq=3 Destination Host Unreachable
  ^C
  --- Planck.san.sss.local ping statistics ---
  5 packets transmitted, 0 received, +3 errors, 100% packet loss, time 4008ms
  , pipe 3



2c. The secondary time server acting as the second backup time source is
resolved correctly by means of its entry in /etc/hosts, and a reply is
received, which is expected. I expected the time to sync from here. I do
not understand why it did not.

  Bascule:~# ping Planck.ssan.sss.local
  PING Planck.ssan.sss.local (10.15.90.15) 56(84) bytes of data.
  64 bytes from Planck.ssan.sss.local (10.15.90.15): icmp_seq=1 ttl=64 time=0.177 ms
  64 bytes from Planck.ssan.sss.local (10.15.90.15): icmp_seq=2 ttl=64 time=0.161 ms
  ^C
  --- Planck.ssan.sss.local ping statistics ---
  2 packets transmitted, 2 received, 0% packet loss, time 999ms
  rtt min/avg/max/mdev = 0.161/0.169/0.177/0.008 ms
  Bascule:~#
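
A ping reply only shows that the host is reachable, not that its ntpd
answers on UDP port 123, so a direct NTP query may be a more telling
check. A minimal sketch in Python, assuming Python 3 is available on
Bascule and that Planck's ntpd answers client-mode queries; the function
names build_request, parse_reply and query are made up for illustration:

```python
import socket
import struct

NTP_PORT = 123
# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_DELTA = 2208988800


def build_request() -> bytes:
    """Build a 48-byte client request: LI=0, VN=3, Mode=3, rest zero."""
    return bytes([0x1B]) + bytes(47)


def parse_reply(packet: bytes):
    """Return (stratum, transmit time in Unix seconds) from a reply."""
    if len(packet) < 48:
        raise ValueError("short NTP packet")
    stratum = packet[1]
    # The transmit timestamp occupies bytes 40..47: seconds and fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return stratum, seconds - NTP_UNIX_DELTA + fraction / 2**32


def query(host: str, timeout: float = 2.0):
    """Send one request to host and parse whatever comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (host, NTP_PORT))
        packet, _ = sock.recvfrom(512)
    return parse_reply(packet)
```

Calling query("Planck.ssan.sss.local") from Bascule would either return
a stratum and timestamp or raise a timeout, which separates "the host is
up" from "the host is serving time".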


-


3. Next I attempt to sync the time of server B, 'Planck'.

3a. It doesn't skip the primary time source server, and syncs properly
from it. This is expected since external DNS is available to this
machine.

3b. While the refid column entries for the second & third time source
servers match each other, neither matches what server A listed for this
server's entries. I expected each local server to show the same. I do
not understand why they don't match.

  r...@planck:~# ntpq -p
       remote           refid      st t when poll reach   delay   offset  jitter
  ==============================================================================
  *clock.nyc.he.ne .CDMA.           1 u   71  128  377   53.900   -1.183   1.072
   Bascule.san.sss .STEP.          16 u    - 1024    0    0.000    0.000   0.000
   Bascule.ssan.ss .STEP.          16 u   36 1024    0    0.000    0.000   0.000
   10.15.60.255    .BCST.          16 u    -   64    0    0.000    0.000   0.000
   10.15.90.255    .BCST.          16 u    -   64    0    0.000    0.000   0.000
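
When reading the reach column in output like the above, it may help to
recall that ntpq prints it in octal as an 8-bit shift register of recent
polls: 377 means the last eight polls were all answered, and 0 means
none were. A minimal sketch of decoding it in Python; decode_reach is a
hypothetical helper name, not part of ntpq:

```python
from typing import List


def decode_reach(reach_octal: str) -> List[bool]:
    """Decode ntpq's octal reach value into eight poll results.

    Index 0 is the most recent poll (the least significant bit);
    True means a reply was received for that poll.
    """
    value = int(reach_octal, 8) & 0xFF
    return [bool((value >> bit) & 1) for bit in range(8)]


for reach in ("377", "0"):
    answered = sum(decode_reach(reach))
    print(reach, "->", answered, "of the last 8 polls answered")
```

By that reading, every line with reach 0 above has not yet produced a
single usable reply, whatever its refid column shows.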


4. When I attempt to sync manually from the secondary time server acting
as the first backup time source, it fails to do so since that network is
offline, which is expected.

  r...@planck:~# ntpq -p Bascule.san.sss.local
  Bascule.san.sss.local: timed out, nothing received
  ***Request timed out


5. When I attempt to sync manually from the secondary time server acting
as the second backup time source, it appears to query itself while also
resolving itself incorrectly, showing that it has synchronized by means
of the primary server's IP address instead of Bascule.ssan.sss.local's
IP address as defined by the associated /etc/hosts entry. I do not
understand this. I expected it to sync from Bascule.ssan.sss.local, as
was explicitly specified in the command.

  r...@planck:~# ntpq -p Bascule.ssan.sss.local
       remote           refid      st t when poll reach   delay   offset  jitter
  ==============================================================================
   10.15.60.15     .STEP.          16 u    -  512    0    0.000    0.000   0.000
  *Planck.ssan.sss 209.51.161.238   2 u   49   64  377    0.154   15.342   3.566
   172.27.100.255  .BCST.          16 u    -   64    0    0.000    0.000   0.001
   172.27.101.255  .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.15.30.255    .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.15.60.255    .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.15.90.255    .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.15.120.255   .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.15.150.255   .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.15.180.255   .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.15.210.255   .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.90.20.255    .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.90.40.255    .BCST.          16 u    -   64    0    0.000    0.000   0.001
   10.90.60.255    .BCST.          16 u    -   64    0    0.000    0.000   0.001


-


Here are the pertinent sections of each config.


Server A 'Bascule':

  # You do need to talk to an NTP server or two (or three).
  #server ntp.your-provider.example
  server clock.nyc.he.net dynamic
  server Planck.san.sss.local dynamic
  server Planck.ssan.sss.local dynamic

  # By default, exchange time with everybody, but don't allow configuration.
  #restrict -4 default kod notrap nomodify nopeer noquery
  #restrict -6 default kod notrap nomodify nopeer noquery
  restrict 172.27.100.0 mask 255.255.255.0 nomodify notrap
  restrict 172.27.101.0 mask 255.255.255.0 nomodify notrap
  restrict 10.15.30.0 mask 255.255.255.0 nomodify notrap
  restrict 10.15.60.0 mask 255.255.255.0 nomodify notrap
  restrict 10.15.90.0 mask 255.255.255.0 nomodify notrap
  restrict 10.15.120.0 mask 255.255.255.0 nomodify notrap
  restrict 10.15.150.0 mask 255.255.255.0 nomodify notrap
  restrict 10.15.180.0 mask 255.255.255.0 nomodify notrap
  restrict 10.15.210.0 mask 255.255.255.0 nomodify notrap
  restrict 10.90.20.0 mask 255.255.255.0 nomodify notrap
  restrict 10.90.40.0 mask 255.255.255.0 nomodify notrap
  restrict 10.90.60.0 mask 255.255.255.0 nomodify notrap

  # If you want to provide time to your local subnet, change the next line.
  # (Again, the address is an example only.)
  broadcast  172.27.100.255
  broadcast  172.27.101.255
  broadcast  10.15.30.255
  broadcast  10.15.60.255
  broadcast  10.15.90.255
  broadcast  10.15.120.255
  broadcast  10.15.150.255
  broadcast  10.15.180.255
  broadcast  10.15.210.255
  broadcast  10.90.20.255
  broadcast  10.90.40.255
  broadcast  10.90.60.255


Server B 'Planck':

  # You do need to talk to an NTP server or two (or three).
  #server ntp.your-provider.example
  server clock.nyc.he.net dynamic
  server Bascule.san.sss.local dynamic
  server Bascule.ssan.sss.local dynamic

  # By default, exchange time with everybody, but don't allow configuration.
  #restrict -4 default kod notrap nomodify nopeer noquery
  #restrict -6 default kod notrap nomodify nopeer noquery
  restrict 10.15.60.0 mask 255.255.255.0 nomodify notrap
  restrict 10.15.90.0 mask 255.255.255.0 nomodify notrap

  # If you want to provide time to your local subnet, change the next line.
  # (Again, the address is an example only.)
  broadcast 10.15.60.255
  broadcast 10.15.90.255



I hope I've provided sufficient information to determine what's wrong.
It seems like a lot to me, at least.

Thanks again for taking the time to look it over.


-C

On Sun, 21 Nov 2010 17:59 +0000, "David Woolley"
<[email protected]> wrote:
> [email protected] wrote:
> 
> > # (Again, the address is an example only.)
> > broadcast  172.27.100.10
> 
> These are not valid broadcast addresses.
> 
> Whilst I suspect the broadcast mechanism might be usable with unicast 
> addresses, it would seem a strange thing to do.
> 
> Also, I believe it is not uncommon to block routed traffic to the 
> sub-net broadcast address.
> 
> _______________________________________________
> questions mailing list
> [email protected]
> http://lists.ntp.org/listinfo/questions
> 