Hal Murray wrote:
>
> Does anybody know of a good writeup of how to fix POSIX to know about leap
> seconds and/or why POSIX hasn't done anything about it yet?

I've given a number of presentations and written whitepapers about leap seconds and the problems related to them. However, I'm not aware of an easy, good solution.

The basic problem is that the common API calls used to retrieve the system timestamp from an OS don't provide a status that could be used to distinguish between a normal second and an ongoing, inserted leap second. Without such a status, a function that converts a timestamp to calendar date and time doesn't know whether the timestamp associated with an inserted leap second should yield second '60' of the current day, or second '00' of the next day.

I think an API that provides a timestamp and an associated status in a *consistent* way would be too "expensive" with regard to execution time, because some locking mechanism would have to be implemented to make sure that no inconsistent timestamp and status information can be returned.

On the other hand, if an application already has a broken-down date and time (e.g. from an external time source, a serial string or whatever), it knows that the time is the leap second if the second number is '60'. So the '60' is the required status information.

However, if you "normalize" the time of a leap second, e.g. 23:59:60, then 60 seconds carry over to 1 minute, 60 minutes to 1 hour, and 24 hours to the next day. So when computing the associated timestamp, the effective timestamp is the same as for 00:00:00 of the next day, and the information that this second is a leap second is simply lost unless it is preserved in some other way.

On most Unix-like systems the system time is kept as POSIX time, and when a leap second is inserted, the kernel just steps the system time back by 1 second. *This* is what confuses applications that don't expect the system timestamp ever to go backward.

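The loss of information caused by normalizing 23:59:60 can be reproduced with a few lines of Python. This is only a minimal sketch: it assumes calendar.timegm() as a stand-in for any conversion routine that does plain POSIX arithmetic without a leap second table, and it uses the end of 2016, when a leap second was actually inserted, as the example date.

```python
import calendar
import time

# Broken-down UTC times around the leap second inserted at the end of 2016.
leap_second = (2016, 12, 31, 23, 59, 60)   # the inserted second, 23:59:60
next_midnight = (2017, 1, 1, 0, 0, 0)      # the second right after it

# calendar.timegm() does plain POSIX arithmetic: the 60 in the seconds
# field simply carries over into the next minute, hour and day.
ts_leap = calendar.timegm(leap_second)
ts_next = calendar.timegm(next_midnight)

# Both broken-down times map to the same POSIX timestamp.
print(ts_leap, ts_next)

# Converting back yields 00:00:00 of the next day; the '60' is gone.
print(time.gmtime(ts_leap)[:6])
```

Once the two broken-down times have collapsed into one timestamp, no conversion function can tell them apart again unless the leap second status is carried along separately.
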
Many years ago Dave Mills proposed a way to avoid this backward step: stop the system clock for 1 s, and increment the system time by the smallest unit whenever an application retrieves a system timestamp while the clock is stopped. This would be a good workaround, since the time returned from the kernel never steps back, and there are no duplicate timestamps because even during the leap second the timestamps increment by a small amount.

Quite some time ago I asked one of the Linux kernel maintainers why they don't implement the leap second handling this way, and the answer was just "because it's too expensive". Whenever an application queried the current system time, the kernel would have to check whether a leap second was just being inserted, and if so, some small amount of time would have to be added to the stopped system time. And all this just to get it "right" for 1 second in several years. Just stepping the time back by 1 second at a certain point in time is much faster, and much easier to implement.

> I assume the basic idea would be something like switch the kernel to use TAI
> rather than UTC and implement conversion in some library routines.

I think this could be a good approach, but it requires that not only leap second announcements but also the current TAI offset are supplied to the OS. Runtime libraries require the correct TAI offset to be able to convert the TAI system time to UTC.

E.g., the kernel could return a TAI timestamp, and a runtime library function like gmtime() would need to know the current TAI offset and the leap second date to be able to return a time with a 60 in the seconds field.

I know there are leap second files from IERS or NIST, and current versions of the TZ database also contain a copy of such a file, but this requires that the information is continuously updated, i.e. that new versions of the leap second files or the TZ database are supplied.

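To illustrate what such a library conversion from a TAI system time to UTC would have to do, here is a rough sketch in Python. It is not how any existing runtime library works. The function name tai_to_utc() and the table LEAP_TABLE are made up for this example, the table is deliberately truncated to the single leap second at the end of 2016 (so results for earlier dates would be wrong), and a "TAI timestamp" is assumed to be the POSIX timestamp plus the TAI-UTC offset, which is my understanding of how Linux defines CLOCK_TAI.

```python
import time

# Hypothetical, truncated leap second table, sorted by time. Each entry is
# (TAI timestamp of the inserted leap second,
#  TAI-UTC offset before it, TAI-UTC offset after it).
LEAP_TABLE = [
    (1483228836, 36, 37),   # 2016-12-31 23:59:60 UTC
]

def tai_to_utc(tai_ts):
    """Convert an integer TAI timestamp to (Y, M, D, h, m, s) in UTC.

    Unlike gmtime() on a POSIX timestamp, this can return 60 in the
    seconds field, because the table says where the leap second is.
    """
    offset = LEAP_TABLE[0][1]   # offset in effect before the first entry
    for leap_ts, before, after in LEAP_TABLE:
        if tai_ts == leap_ts:
            # We are inside the leap second: convert the second before it
            # (23:59:59) and label the result as second 60 of the same day.
            y, mo, d, h, mi, s = time.gmtime(tai_ts - 1 - before)[:6]
            return (y, mo, d, h, mi, s + 1)
        if tai_ts > leap_ts:
            offset = after      # leap second already passed
    return tuple(time.gmtime(tai_ts - offset)[:6])

# Three consecutive TAI seconds around the leap second.
for ts in (1483228835, 1483228836, 1483228837):
    print(ts, tai_to_utc(ts))
```

The three consecutive TAI timestamps come out as 23:59:59, 23:59:60 and 00:00:00 of the next day, so nothing repeats and nothing steps back. The conversion is only as good as the table, though, which is exactly the update problem.
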
The TAI approach should work (I haven't tried it) if you configure the system with one of the "right" timezones, which get the TAI information from a table in the TZ DB.

This may work with current versions of operating systems, where the OS maintainers or the admin provide the required information, but it may not work reliably for systems that are out of support, or for embedded systems that never get any update. These would use a wrong UTC time after the next leap second.

In the past, ntpd could also provide its clients with the TAI offset if autokey was configured. Since autokey is now obsolete and has been replaced by NTS, an extension field in the NTP packet could provide the TAI offset.

PTP/IEEE 1588 works based on TAI, and the protocol provides the TAI to UTC offset, so either PTP or NTP with a TAI offset extension field could be used to adjust the kernel time to TAI.

Both ntpd and PTP clients should be able to write the current TAI offset and leap second announcements to the kernel. However, AFAIK time conversion functions only retrieve the TAI information from the TZ DB, not from the kernel, so the TAI capabilities of NTP and PTP clients don't really help if the TZ DB isn't updated.

So, IMO, API calls would be required that the runtime library could use to query from the kernel the current TAI offset, the TAI timestamp of the next leap second (if one has been scheduled), and the TAI offset after that leap second. The system could then use information provided via NTP or PTP even without further updates of a leap second file or the TZ DB.

> There is a discussion on the IETF ntp list with typical S/N for this topic.

Above I tried to write a summary from my point of view, and I hope it's not considered as noise. ;-)

Martin
_______________________________________________
LEAPSECS mailing list
[email protected]
https://pairlist6.pair.net/mailman/listinfo/leapsecs
