> -----Original Message-----
> From: questions [mailto:questions-bounces+nuno.pereira=g9telecom...@lists.ntp.org] On behalf of Brian Inglis
> Sent: Wednesday, 1 July 2015 21:53
> To: questions@lists.ntp.org
> Subject: Re: [ntp:questions] NTP client with 4 servers lost sync
>
> On 2015-07-01 10:18, Nuno Pereira wrote:
> > Following last night's leap second, we had some issues with our NTP
> > servers, especially in clients with 4 servers configured, but not in
> > clients with 1 source configured.
> >
> > We have 2 types of configuration (besides the one in the NTP server):
> >
> > Config 1 (clients with access to the external network):
> > * 2 NTP servers in the LAN, configured with "iburst prefer";
> > * 2 external NTP servers, configured with "iburst".
> >
> > Config 2 (clients without access to the external network):
> > * 1 NTP server in the LAN, configured with "iburst prefer" or "iburst"
> >   (in this case, using "prefer" or not makes no difference).
> >
> > The 2 external servers had problems with the leap second, showing a
> > one-second offset after it happened, while the LAN servers had no issues
> > (they had a leap file, and reported leap_armed within the 24 hours before
> > the event).
> >
> > This led to something like this being reported by "ntpq -p" (I don't have
> > printouts):
> >
> > remote          refid           st t when poll reach   delay    offset  jitter
> > ==============================================================================
> > xlan_server_1   160.45.10.8      2 u 1013 1024   377   1.019    -0.483   0.687
> > xlan_server_2   160.45.10.8      2 u  922 1024   377   1.042    -0.499   0.665
> > xext_server_1   194.117.9.137    2 u  384 1024   377   3.360  1002.688   0.790
> > xext_server_2   194.117.9.139    2 u  388 1024   377   3.360  1001.582   0.833
> >
> > I mean, all 4 were considered falsetickers.
> >
> > Meanwhile, in the clients with no access to the external network, which
> > have only 1 server to sync to (lan_server_1), things worked with no
> > problem.
> >
> > From what I've read on this list and in the docs, the best configuration
> > is to have 4 servers, and that's the default on CentOS and Debian servers,
> > but this incident brought back the even-number-of-servers problem that can
> > arise with just 2.
> >
> > How can 4 be worse than 1?
> >
> > Do I have to go to a 5-server configuration in order to avoid this? Or go
> > for 4 servers in the LAN?
> >
> > I'm having difficulty convincing my colleagues that we must configure 4
> > servers (they think that is exaggerated); they think it is best to have
> > just one, and now I got this issue.
>
> See the select and prefer doc pages.
> To get sync, you need a majority clique, with more truechimers than
> falsetickers, so with two of each, you don't get a majority, and none are
> considered reliable.
> That is why pool servers are recommended as backup with external access, in
> case some local sources go down or false.
> At least three sources, internal or external, are preferable to allow a
> majority clique even if one source goes down or false; more if you need to
> allow for possible network issues.
>
> Also note that prefer means only that source, if it is a survivor, will be
> used for system offset and jitter stats, rather than the combine algorithm
> output.
> With more than one surviving preferred source, implementation details decide
> which wins.
> It is intended for use mainly with local device drivers, as well as to mark a
> source to provide seconds numbering for PPS sources.

The use of prefer was based on this idea: "I want to use only the local
sources, unless all of them fail, in which case I have to use an external one".

From what I have read now, I see that prefer was a bad choice.
The only option I could find to accomplish that idea (use the external
sources only if all the local ones fail) was "noselect", but that also fails,
because the external sources would then not be available when the local ones
fail.
Am I right?

> You may want to consider adding all LAN sources to all clients, add enough
> LAN sources to provide an odd number, add pool servers as backup to external
> servers, and drop prefer from LAN sources to allow the combine algorithm to
> compute stats.

There are just 2 LAN sources in total (in reality just one for the moment, as
they're the same host).

From my experience, the pool servers, if taken directly from pool.ntp.org, are
very unstable and not trustworthy, so I chose 2 fixed external servers that
are not so bad as backup. But they failed at the leap second.

But how can you explain that a client with just one source did better than a
client with 4 sources?
It's not just this leap second situation: for some months now, our clients
with just one source have been having fewer problems than those with the
4-source configuration.

4 is not odd, I know, but does that mean I have to go for a 5-source
configuration, since a 3-source configuration is reduced to 2 available
sources if one of them fails?
And as we prefer to use local sources, so that all of our clients have better
accuracy relative to each other, we would need to have 5 NTP servers!
That seems a little insane to us.

> --
> Take care. Thanks, Brian Inglis
> _______________________________________________
> questions mailing list
> questions@lists.ntp.org
> http://lists.ntp.org/listinfo/questions

Nuno Pereira
G9Telecom
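
For reference, the majority argument quoted above can be tried out with a
small sketch. This is a deliberately simplified model, not ntpd's actual
selection code: each source is treated as claiming the interval
offset +/- half_width_ms, the largest group of mutually overlapping intervals
is taken as the truechimer candidates, and that group is accepted only if it
is a strict majority. The function names and the 10 ms half-width are
assumptions made for illustration (the real daemon derives each interval from
the source's root distance); the offsets are the ones from the "ntpq -p"
output in this thread.

```python
from typing import List, Optional


def largest_agreeing_group(offsets_ms: List[float], half_width_ms: float) -> List[int]:
    """Indices of the largest set of sources whose intervals share a common point."""
    events = []
    for i, off in enumerate(offsets_ms):
        # kind 0 = interval opens, kind 1 = interval closes;
        # at equal coordinates an open sorts before a close.
        events.append((off - half_width_ms, 0, i))
        events.append((off + half_width_ms, 1, i))
    events.sort()

    active = set()
    best = set()
    for _, kind, i in events:
        if kind == 0:
            active.add(i)
            if len(active) > len(best):
                best = set(active)
        else:
            active.discard(i)
    return sorted(best)


def select(offsets_ms: List[float], half_width_ms: float = 10.0) -> Optional[List[int]]:
    """Return the agreeing group if it is a strict majority, otherwise None."""
    group = largest_agreeing_group(offsets_ms, half_width_ms)
    if 2 * len(group) > len(offsets_ms):
        return group
    return None


# Config 1 after the leap second: two LAN sources against two external ones.
print(select([-0.483, -0.499, 1002.688, 1001.582]))  # None (2 of 4 is not a majority)

# Config 2: a single source is trivially its own majority.
print(select([-0.483]))  # [0]

# Three sources with one off by a second: two of three still agree.
print(select([-0.483, -0.499, 1002.688]))  # [0, 1]
```

In this model a single source always "wins" because nothing can outvote it,
which is also why it offers no protection if that one source is wrong, while
a 2-against-2 split leaves no group large enough to be trusted.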