Re: [LEAPSECS] Leap seconds ain't broken, but most implementations are broken

Martin Burnicki Thu, 05 Jan 2017 01:01:41 -0800

Tony Finch wrote:
> Martin Burnicki wrote:
>> Tony Finch wrote:
>>>
>>> Even though NTP can represent current UTC correctly, it often gets leap
>>> seconds wrong. It does not give confidence that we will be able to
>>> reduce bugs by teaching more code about leap seconds, when NTP cares
>>> about time and gets it wrong, and most code cares much less.
>>
>> I think your statement isn't quite fair.
> 
> Probably :-) But I think there are ways to make software more easy to
> operate correctly.


I think ntpd, for example, already does a lot to make operation as safe
as possible.

> For example, it used to be the case that you had to explicitly configure
> DNS software with the addresses of the root name servers. Nowadays, the
> root hints are compiled in and maintained as part of the usual software
> update mechanism, so the operator has one less thing to worry about.

Yes, and whenever someone ships time synchronization software with
hardcoded NTP server addresses, this is considered abusive, and it's
exactly that. ;-)

I think you have to distinguish here. For DNS resolvers (at least at the
top level, ISP, etc.) it is *required* that they know the addresses of
some root server, otherwise DNS might not work at all.

> Maybe a similar scheme would work for leap second files, except that NTP
> servers tend to need much less care and feeding than DNS servers.

Please note that NTP servers do not necessarily need to be the providers
of leap second files. There are some well-known sites which provide this
file, and the NTP software package from ntp.org comes with a script
which can be used to update the file automatically.

The bad thing here is that it would probably cause considerable load on
these servers if every single NTP node regularly checked for a new leap
second file.

The potential approach with tzdist or special DNS would allow for a
distributed system. The special DNS can only provide the leap second
warning and the current TAI offset, while tzdist also provides the leap
second history and a way to update time zone rules, so it could also be
used generally to keep conversions to local time correct.

>> IMO this is similar to ntpd. If it's not provided with an updated leap
>> second file then it may have no idea that a leap second is approaching.
>> If a faulty GPS receiver passes a leap second warning to ntpd, should
>> ntpd not trust the GPS receiver since it knows there are some broken
>> receivers out there?
> 
> At this point you can insert another copy of my unfair complaint, but with
> GPS instead of NTP :-)

Agreed. :-)

Some of the 3rd party GPS firmware bugs we have seen indicate that the
software developers haven't worked carefully enough. But again, a
program like ntpd which has to rely on information e.g. from GPS
receivers can't be made absolutely safe against GPS receiver bugs.

Compare this to your DNS example: if a root server has a software bug
which lets it deliver a wrong IP address, how should your local DNS
resolver detect this?

>> Current versions of ntpd accept a leap second file if it has been
>> configured, and the file hasn't expired. If no leap second file can be
>> used then a leap second announcement from a refclock is used.
> 
> I wonder if it would be better to set the leap indicator bits to NOSYNC if
> the configured leap seconds file has expired.

Sounds good at first glance, but I think this would cause a nasty
surprise if you have a company network and suddenly all NTP clients stop
being synchronized.

Instead, current versions of ntpd emit a log message when the leap
second file has expired or is about to expire.
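To make the rule quoted above concrete, here is a minimal Python sketch.
It is not ntpd's code. It assumes the leap second file carries its
expiration date on a line starting with "#@" followed by an NTP
timestamp (seconds since 1900-01-01), as the commonly distributed
leap-seconds.list does. The function names and the 30-day warning
window are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# NTP timestamps count seconds from 1900-01-01 00:00:00 UTC.
NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def leapfile_expiry(text):
    """Return the expiry time from the '#@ <NTP seconds>' line, or None."""
    for line in text.splitlines():
        if line.startswith("#@"):
            seconds = int(line[2:].split()[0])
            return NTP_EPOCH + timedelta(seconds=seconds)
    return None


def choose_leap_source(leapfile_text, now):
    """Hypothetical helper: use the file if configured and not expired,
    otherwise fall back to the refclock's leap announcement."""
    expiry = leapfile_expiry(leapfile_text) if leapfile_text else None
    if expiry is not None and now < expiry:
        return "leapfile"
    return "refclock"


def expiry_warning(expiry, now, warn_days=30):
    """Return a log message instead of dropping to NOSYNC.
    warn_days is an assumed value, not an ntpd default."""
    if now >= expiry:
        return "leap second file has expired"
    if expiry - now <= timedelta(days=warn_days):
        return "leap second file expires soon"
    return None
```

The point of the last function is only that an expired file leads to a
log message and a fallback, not to an unsynchronized server.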

>> This tries to make operation as safe as possible, but even this doesn't
>> help in every case. Imagine your NTP daemon has a valid leap second file,
>> handles the leap second correctly, and also passes the leap second
>> warning to its clients.
>>
>> If this daemon's time sources (GPS receiver, or upstream NTP server(s))
>> don't insert the leap second at the same time then our daemon will
>> observe a sudden 1 s offset after the leap second, and even though it
>> has itself handled the leap second correctly, it will step the time a
>> few minutes later to the wrong time provided by its reference time
>> source(s).
> 
> Shouldn't it treat incorrect upstreams as false tickers? Can't it drop
> them from its list of candidate servers when it realises they got the leap
> second wrong or they will get it wrong?

The basic problem is more with a stratum-1 server which in many cases
gets its time only from a GPS receiver. If the GPS receiver provides
faulty leap second information then the NTP server can hardly detect
this. Even a current leap second file doesn't help here.

If the daemon inserts a leap second because of the leap second file, but
the GPS receiver doesn't insert it, then the daemon sees a 1 s offset
after the leap second. A few minutes later it accepts the wrong time from
the GPS receiver and steps the system time to that wrong time (this
happens immediately if you restart the daemon shortly after the leap
second).


For a pure client there should be no problem if the client has several
upstream servers configured. Before the leap second, the NTP daemon
accepts a leap second warning only if a majority of the configured
upstream servers provide this warning. At that point the time from a
faulty server is still correct; otherwise it wouldn't have been
classified as a good candidate.
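The majority rule for the warning can be sketched in a few lines of
Python. This is only an illustration of the rule as stated here, not
ntpd's implementation, and the function name is hypothetical.

```python
def accept_leap_warning(announcements):
    """announcements: one bool per configured upstream server, True if
    that server currently announces a leap second.

    Accept the warning only if strictly more than half of the
    configured servers announce it."""
    return 2 * sum(announcements) > len(announcements)
```

With three servers, one that fails to announce the leap second is
outvoted, and so is one that announces a bogus leap second.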

When the leap second occurs, all correct upstream servers as well as our
server insert the leap second, but faulty servers don't. So the faulty
servers which haven't inserted the leap second are off by 1 s afterwards
and are *then* classified as false tickers.

Of course this also doesn't work correctly if the *majority* of the
configured upstream servers get the leap second wrong, but in the past
we have seen that fortunately most public servers get it right.
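As a rough illustration of the situation after the leap second, the
sketch below flags servers whose offset is far from the median of all
offsets. This is a simplified stand-in, not the selection algorithm
ntpd actually uses, and the function name and the tolerance value are
assumptions.

```python
from statistics import median


def false_tickers(offsets, tolerance=0.128):
    """offsets: dict mapping server name to measured offset in seconds
    (server time minus local time).

    Returns the names whose offset deviates from the median by more
    than the tolerance. The tolerance is an illustrative value."""
    mid = median(offsets.values())
    return sorted(name for name, off in offsets.items()
                  if abs(off - mid) > tolerance)


# One server missed the leap second and is 1 s ahead: it is flagged.
minority_wrong = {"a": 0.001, "b": -0.002, "c": 1.0003}

# Two of three servers missed it: now the *correct* server "c" is flagged.
majority_wrong = {"a": 1.0, "b": 1.0, "c": 0.0}
```

Calling false_tickers() on the second dictionary shows the failure mode
described above: if the majority is wrong, the correct server looks
like the outlier.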

Martin

_______________________________________________
LEAPSECS mailing list
[email protected]
https://pairlist6.pair.net/mailman/listinfo/leapsecs