On 12/11/2020 17:40, Matti Picus wrote:
> In a one-on-one discussion with Noam in a pre-community call (that, how
> ironically, we had time for since we both messed up the meeting
> time-zone change) we reached the conclusion that the request is to
> clarify whether NumPy's datetime64 represents TAI time [0] or POSIX
> time, with a preference for TAI time. The documentation mentions POSIX
> time[1]. As Stefano points out, there is a couple of seconds difference
> between POSIX (or Unix) time and TAI time. In practice numpy simply
> stores an int64 value to represent the datetime64, and relies on others
> to convert it. The leap-second might be getting lost in the conversions.
> So it might make sense to clarify exactly how those conversions deal
> with the leap-seconds and choose which one we mean when we use
> datetime64. Noam please correct me if I am mistaken.

Unix time is a representation of the UTC time scale that counts one-second intervals from a defined epoch. It handles a leap second either by skipping one interval (which has never happened so far) or by repeating one, so that two moments that are one second apart on the UTC time scale (for example 2016-12-31 23:59:59 and 2016-12-31 23:59:60) get the same representation. The conversion from Unix time to UTC is therefore ambiguous during that one second. This has happened 27 times since 1972 (the current TAI-UTC offset of 37 seconds also includes the initial 10-second offset fixed in 1972).

This comes with the nice property that minutes, hours and days always have the same duration (in Unix time), so converting between the Unix time representation and a date and time of day is fairly easy. The drawbacks are, as seen above, an ambiguity at leap seconds and the fact that the trivial computation of a time interval does not take leap seconds into account and may therefore come out a few seconds short (any interval spanning the leap second at the end of 2016-12-31 is off by at least one second if computed by simply subtracting Unix times).

I don't think these two drawbacks are important for NumPy (or any other general purpose library). As things stand, it is not even possible in Python, with or without NumPy, to create a datetime or datetime64 object for the time "2016-12-31 23:59:60" (neither accepts a minute with 61 seconds), so the ambiguity is not an issue in practice. The time interval issue may matter for some applications, but those affected are aware of it and have means to deal with it (the most common one being taking a day off on the days leap seconds are introduced).

I think documenting that datetime64 is a representation of fixed time intervals since a conventional epoch, neglecting leap seconds, is easy to explain and implement and allows for easy interoperability with the rest of the world. What advantage would making datetime64 explicitly a representation of TAI bring?
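
To make the points above concrete, here is a minimal sketch, assuming NumPy is importable as `np`; the variable names are only illustrative. It checks that midnight falls on a multiple of 86400 seconds, that plain subtraction across the 2016-12-31 leap second comes out one second short, and whether a timestamp with second 60 is accepted at all.

```python
import datetime

import numpy as np

# Two instants on either side of the leap second inserted at the end of 2016.
before = np.datetime64("2016-12-31T23:59:58")
after = np.datetime64("2017-01-01T00:00:01")

# With fixed-length days, midnight is always a multiple of 86400 seconds
# since the epoch.
midnight = np.datetime64("2017-01-01T00:00:00", "s").astype("int64")
midnight_is_day_aligned = bool(midnight % 86400 == 0)

# Plain subtraction ignores 23:59:60: it reports 3 s, although 4 SI seconds
# elapsed between the two instants on the UTC time scale.
naive_interval = after - before

# Try to build the leap second itself with the standard library ...
try:
    datetime.datetime(2016, 12, 31, 23, 59, 60)
    stdlib_accepts_leap_second = True
except ValueError:
    stdlib_accepts_leap_second = False

# ... and with NumPy.
try:
    np.datetime64("2016-12-31T23:59:60")
    numpy_accepts_leap_second = True
except ValueError:
    numpy_accepts_leap_second = False

print(midnight_is_day_aligned, naive_interval)
print(stdlib_accepts_leap_second, numpy_accepts_leap_second)
```

An application that needs the true elapsed time would have to add the missing leap seconds itself, for example from a table of insertion dates, which is outside what this sketch covers.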

One disadvantage would be that `np.datetime64(datetime.now())` would be harder to support, as we would be trying to match a point in time on the UTC time scale to a point in time on the TAI time scale. This is trivial for past times (we just need to apply the right offset) but impossible to do correctly for dates in the future, because we cannot predict future leap second insertions. This would, for example, make timestamp conversions not reproducible across the announcement of a leap second insertion.

Cheers,
Dan