Re: [ntp:questions] Number of Stratum 1 & Stratum 2 Peers

Mike Cook Fri, 12 Dec 2014 01:58:33 -0800

To close this parenthesis, I ran the test where the leap second is propagated 
by only 1 of three servers. Bill’s hypothesis is confirmed, with a couple of 
clarifications that I would like to share, as this might well be a real-life case.


a) To start off, in my test all three servers for my one client are sync’d to 
the same time. One of them has a leap file modified for my test. As UTC is 
defined WITH leap seconds, although all servers are sync’d, this is the ONLY 
one serving UTC. It correctly advertises the upcoming leap.
b) When the leap occurs, the server with the leap file correctly inserts the 
leap, as does the client. The client’s NTP correctly detects the step and, after 
a few polls, flags the UTC server as a falseticker, as the majority 
consistently disagree with the now updated clock.
Thu Jan 1 01:06:05 CET 2015
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*192.168.1.15    .GPS1.           1 u   42   64  377    0.495  999.894 534.506
+192.168.1.17    .GPS1.           1 u   39   64  377    0.564  999.899 654.645
x192.168.1.18    .GPS1.           1 u   66   64  377    0.575   -0.066   0.029
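
The majority vote behind the x tally in this listing can be illustrated with a small interval-intersection sketch. This is not ntpd's actual selection code: the function name classify and the 10 ms half-width, which stands in for each server's root distance, are assumptions made for illustration. The offsets are the three from the listing.

```python
def classify(offsets_ms, half_width_ms=10.0):
    """Split servers into (truechimers, falsetickers) by majority overlap.

    Simplified sketch: each server claims the interval
    [offset - half_width, offset + half_width]. half_width_ms is an
    ASSUMED stand-in for the root distance ntpd would really use.
    """
    intervals = {name: (off - half_width_ms, off + half_width_ms)
                 for name, off in offsets_ms.items()}

    # Sweep over interval endpoints; starts (0) sort before ends (1).
    events = []
    for lo, hi in intervals.values():
        events.append((lo, 0))
        events.append((hi, 1))
    events.sort()

    best = count = 0
    region = None
    for i, (x, kind) in enumerate(events):
        if kind == 0:
            count += 1
            if count > best:
                # New deepest overlap: it runs until the next endpoint.
                best = count
                region = (x, events[i + 1][0])
        else:
            count -= 1

    # Without a strict majority, no server can be trusted.
    if best * 2 <= len(intervals):
        return set(), set(intervals)

    good = {name for name, (lo, hi) in intervals.items()
            if lo <= region[0] and hi >= region[1]}
    return good, set(intervals) - good


offsets = {"192.168.1.15": 999.894,
           "192.168.1.17": 999.899,
           "192.168.1.18": -0.066}
good, bad = classify(offsets)
print(sorted(bad))  # ['192.168.1.18']
```

The two servers that did not leap overlap each other, so they form the majority and the one server that is actually on UTC falls outside. With only two servers that disagree, the same function returns no truechimers at all.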
Now we have the full story and the « good » clock has been declared a falseticker 
because it is not part of the majority, but the story doesn't end there. A bit later the 
client's clock, which at that time is on UTC with the leap second, gets stepped 
forward 1 sec to agree with the majority. This is expected, but we 
now have a client which does not have good time. 
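
Why a step rather than a slew? A minimal sketch, assuming ntpd's default thresholds: 128 ms for a step, and 1000 s for a panic (the figure that appears in the panic_stop message in the quoted message further down). The function name clock_action is hypothetical, and the sketch only mirrors the threshold comparison, not the real clock discipline.

```python
STEP_THRESHOLD_S = 0.128    # ntpd default step threshold (assumed unchanged)
PANIC_THRESHOLD_S = 1000.0  # ntpd default panic threshold (assumed unchanged)


def clock_action(offset_s):
    """Return 'slew', 'step' or 'panic' for a given offset in seconds.

    Simplification: ntpd also waits out a stepout interval before it
    steps, and the -g option relaxes the panic check at startup.
    Neither is modelled here.
    """
    magnitude = abs(offset_s)
    if magnitude > PANIC_THRESHOLD_S:
        return "panic"
    if magnitude > STEP_THRESHOLD_S:
        return "step"
    return "slew"


for offset in (0.000040, 0.999894, -255600.0):
    print(offset, clock_action(offset))
```

The roughly 1 s disagreement after the leap is well above the step threshold and far below the panic threshold, which fits the step seen here.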
Thu Jan 1 01:11:27 CET 2015
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 192.168.1.15    .GPS1.           1 u   14   16    3    0.488   -0.039   0.038
*192.168.1.17    .GPS1.           1 u   22   64    1    0.516   -0.044   0.031
 192.168.1.18    .GPS1.           1 u   14   16    3    0.566  -999.99   0.052
Thu Jan 1 01:12:31 CET 2015
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
Final status, with the UTC server again declared a falseticker.
Thu Jan 1 01:15:38 CET 2015
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+192.168.1.15    .GPS1.           1 u   46   64   77    0.488   -0.039   0.047
*192.168.1.17    .GPS1.           1 u   17   64   37    0.520   -0.054   0.032
x192.168.1.18    .GPS1.           1 u   47   64   77    0.575  -999.99   0.053
This test was meant to verify a worst-case scenario, but it shows that when 
administrators prepare for a leap, they need to make sure that a majority 
of their servers will make the leap and propagate that info. This is not always 
easy to check, as some internet servers routinely block query commands.
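
One possible way around blocked query commands: the leap indicator sits in the top two bits of the first byte of every ordinary NTP time reply, so a plain client request is enough to see whether a server announces the leap. A minimal sketch follows; the function names are hypothetical, it is IPv4 only, and it does no validation beyond a length check.

```python
import socket

LEAP_NAMES = {0: "no warning", 1: "insert second",
              2: "delete second", 3: "unsynchronized"}


def leap_indicator(packet):
    """Return the 2-bit leap indicator from the first byte of an NTP packet."""
    if len(packet) < 48:
        raise ValueError("short NTP packet")
    return packet[0] >> 6


def query_leap(host, timeout=2.0):
    """Send a client-mode (mode 3) NTPv4 request and return the server's LI."""
    request = bytes([0x23]) + bytes(47)  # LI=0, VN=4, Mode=3
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (host, 123))
        reply, _ = sock.recvfrom(512)
    return leap_indicator(reply)


def majority_announces_insert(indicators):
    """True if more than half of the servers advertise an inserted leap second."""
    return sum(1 for li in indicators if li == 1) * 2 > len(indicators)


# Offline demo with the situation in this test: one server of three
# announces the leap. On a live network the list would come from
# [query_leap(h) for h in servers].
print(majority_announces_insert([0, 0, 1]))  # False
```

A result of False before a scheduled leap would be the warning sign: the servers that do leap risk being voted out, as happened here.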
Note:
There is a possible bug, or an RFE is required somewhere, as the clock variable tai is 
not correctly set on the client.
On the server that has the leap file we have the correct update from 35 to 36:
mike@raspB4 ~ $ ntpq -c "rv 0 tai"
tai=36
But on the client, which has no leap file (and probably because of this), tai has 
been set to 1. So I think what is happening is that the server's notion of 
tai is not propagated to clients.
mike@cubieez2:~$ ntpq -c "rv 0 tai"
tai=1
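
A check for this is easy to script. A minimal sketch, assuming the tai=NN output format shown above; the helper names and the plausibility floor of 10 s (the TAI-UTC offset when leap seconds began in 1972) are illustrative choices, not anything ntpq provides.

```python
import re

MIN_PLAUSIBLE_TAI = 10  # TAI-UTC was 10 s at the start of 1972


def parse_tai(rv_output):
    """Extract the integer after 'tai=' from ntpq 'rv 0 tai' output, or None."""
    match = re.search(r"\btai=(\d+)", rv_output)
    return int(match.group(1)) if match else None


def tai_looks_valid(rv_output):
    """True if a tai value is present and at least the 1972 offset."""
    tai = parse_tai(rv_output)
    return tai is not None and tai >= MIN_PLAUSIBLE_TAI


print(tai_looks_valid("tai=36"))  # server with the leap file: True
print(tai_looks_valid("tai=1"))   # client without a leap file: False
```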
There will most likely be a leap declared for the end of June 30 2015 or, at the latest, 
December 31 2015, so we have a bit of time yet to clean up the installed base.



> On 9 Dec 2014, at 14:20, Mike Cook <michael.c...@sfr.fr> wrote:
> 
> <snip>
>> 
>> 
>>> 
>>>> Three are fine, as long as only one dies or goes nuts.
>>> 
>>> Again, define "goes nuts". You don't seem to like the term 
>>> "falseticker", so how do you define "goes nuts"? If one "goes nuts" or 
>>> even goes offline, if the remaining two do not agree then it is like 
>>> having no server at all.
>> 
>> No, it is like having two, with one being out. 
>> falseticker is a term with a very specific internal definition. Thus a
>> server whose time is right on UTC could be a falseticker, because the
>> other two servers were both exactly 3 days out, with tiny jitter estimates. 
>> I would say then that you had two servers going nuts, and one good, even
>> though ntpd would say there were two good and one false ticker.
> 
> In fact this does not happen. I just tested the hypothesis.
> What happens depends on how the two wayward servers get their exaggerated offset:
> a) someone or something resets the date:
>   result: ntp on both those servers crashes due to the panic_stop limit.
> 
> So in this case the client has only one reference and continues using that. 
> It is not flagged as a falseticker.
> That is normal.
> 
> b) someone restarts ntp on the servers with the wrong date. Here the servers' 
> ntpd has no way of knowing that it has bad time and so continues serving 
> normally. 
>   On the client, the running ntp immediately sees a huge offset and huge 
> jitter.
> 
> Tue Dec  9 13:15:04 CET 2014
>    remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *192.168.1.15    .GPS1.           1 u  320   64  360    0.549    0.040   0.037
> +192.168.1.16    .GPS2.           1 u   37   64  377    0.606    0.006   0.028
> +192.168.1.17    .GPS1.           1 u  309   64  360    0.576    0.027   0.025
> Tue Dec  9 13:16:08 CET 2014
>    remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> 192.168.1.15    .GPS1.           1 u   55   64  341    0.565    0.042 9660780
> *192.168.1.16    .GPS2.           1 u   37   64  377    0.606    0.006   0.024
> 192.168.1.17    .GPS1.           1 u   42   64  341    0.579    0.041 9660773
> 
> After 5 mins the client is unable to resolve this, declares all clocks 
> falsetickers and then panics. I did not have ntpd in debug mode here, but it 
> is reasonable to assume that it panics due to the selected clock being too 
> far out and hitting the panic limit.
> 
> Tue Dec  9 13:23:37 CET 2014
>    remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> 192.168.1.15    .GPS1.           1 u   45   64  377    0.596  -255600 155.539
> *192.168.1.16    .GPS2.           1 u   25   64  377    0.614    0.024   0.008
> 192.168.1.17    .GPS1.           1 u   30   64  377    0.583  -255600  52.806
> Tue Dec  9 13:24:41 CET 2014
>    remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> x192.168.1.15    .GPS1.           1 u   43   64  377    0.596  -255600 179.609
> x192.168.1.16    .GPS2.           1 u   23   64  377    0.614    0.024   0.008
> x192.168.1.17    .GPS1.           1 u   27   64  377    0.618  -255599   6.009
> /usr/local/bin/ntpq: read: Connection refused
> Tue Dec  9 13:25:45 CET 2014
> /usr/local/bin/ntpq: read: Connection refused
> 
> This is exactly what happens if the client is restarted.
> 
> clock_filter: n 1 off -255599.997967 del 0.000662 dsp 7.937502 jit 0.000002
> select: endpoint -1 -255600.000806
> select: endpoint  1 -255599.995128
> select: survivor 192.168.1.17 0.002839
> select: combine offset -255599.997967134 jitter 0.000000000
> event at 1 192.168.1.17 903a 8a sys_peer
> clock_update: at 1 sample 1 associd 18641
> event at 1 0.0.0.0 c617 07 panic_stop -255600 s; set clock manually within 
> 1000 s.
> event at 1 0.0.0.0 c61d 0d kern kernel time sync disabled
> 
> So ntp does NOT continue in your test case. Your case may fare better if the 
> time difference is less than the panic limit, say if the two servers do not 
> insert a leap second but the « correct » one does. I’ll try that for my own 
> satisfaction if I can figure out how to do it.
>> 
>> 
> 
>>> 
>>> 
>>> Brian Utterback
>> 
>> _______________________________________________
>> questions mailing list
>> questions@lists.ntp.org
>> http://lists.ntp.org/listinfo/questions
