@JimBiardCics wrote:

> For the moment, let's set aside the question of names for the calendars.

OK, though that makes it a bit hard to talk about :-)

> In a perfect world, all data producers would use a leap-second-aware function to turn their lists of time stamps into elapsed time values and all time variables would be "perfect".

Almost -- I think TAI time is also a perfectly valid system for a "perfect" world :-) But yeah, most datetime libraries do not handle leap seconds, which, somewhat ironically, means that folks are using TAI time even when they think they are using UTC :-)

> Does naive conversion of UTC time stamps into elapsed times have the potential to produce non-monotonic time coordinate variables that violate the CF conventions? Yes. Does it cause any real problems (for the vast majority of cases and instances of time) if people use this "broken" method for encoding and decoding their time stamps? No.

I'm not so sure -- I think having a time axis that is "non-metric", as you call it, can be a real problem. Yes, it could potentially be turned back into a series of correct UTC timestamps by reversing the same incorrect math used to produce it, but many use cases work with the time axis directly in time units (seconds, hours, etc.) and need it to have nice properties like being monotonic, differentiable, etc.

> we are trying to make a way for data producers to signal to data users how they should handle the values in their time variables while staying within the existing CF time framework and acknowledging CF and world history regarding the way we deal with time

Fair enough -- a worthy goal.

> There is nothing at all wrong with specifying that the epoch time stamp in the units attribute always be a correct UTC time stamp. In fact, allowing the epoch time stamp to be from a TAI or UTC clock will increase the chances that the data will be handled incorrectly. If you are sophisticated enough to care about TAI, you will have no problem dealing with a UTC time stamp.
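Before going on, here is a concrete illustration of the naive-conversion problem discussed above -- a minimal sketch, assuming Python's standard `datetime` (which knows nothing about leap seconds); the timestamps straddle the real leap second inserted at the end of 2016:

```python
from datetime import datetime, timezone

# Two correct UTC timestamps straddling the 2016-12-31 leap second.
# The true elapsed time between them is 2 SI seconds
# (23:59:59 -> 23:59:60 -> 00:00:00), but datetime, which knows
# nothing about leap seconds, reports only 1.
t0 = datetime(2016, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
t1 = datetime(2017, 1, 1, 0, 0, 0, tzinfo=timezone.utc)

naive_elapsed = (t1 - t0).total_seconds()
print(naive_elapsed)  # -> 1.0, though 2 SI seconds actually elapsed
```

This undercount is exactly what makes such a time coordinate "non-metric": the encoded values are off by one second relative to true elapsed time on one side of the leap second.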
I disagree here -- the truth is that TAI is EASIER to deal with. Most datetime libraries handle it just fine; in fact, it is all they handle correctly. So I think a calendar that is explicitly TAI is a good idea.

I think we are converging on a few decisions:

1) Due to legacy, uninformed users, poor library support, and the fact that it just doesn't matter for most use cases, we will have an "ambiguous with regard to leap seconds" calendar in CF. Probably called "gregorian", because that's what we already have, and, explicit or not, that's what it means for existing datasets. So we need some better docs here.

2) Do we need an explicit "UTC" calendar, in which leap seconds ARE taken into account? The file would only be correct if the epoch timestamp is "proper" UTC, and you would get the right (UTC) timestamps back if and only if you used a leap-second-aware time library. The values themselves would be "metric" (by Jim's definition).

3) Do we need an explicit "TAI" calendar? The file would only be correct if the epoch timestamp is "proper" TAI, and you would get the right (TAI) timestamps back if and only if you did not apply leap seconds. The values themselves would be "metric" (by Jim's definition).

Note that the only actual difference between (2) and (3) is whether the epoch timestamp is in UTC or TAI -- the two have diverged since 1958, currently by 37 seconds. In either case, the values themselves would be "proper", and you could compute differences, etc. easily and correctly.

4) Minor point: do we disallow "days" in any of these, or are we happy with 1 day == 24 hours == 86400 seconds? I'm fine with days defined this way -- it is almost always correct, and always what people expect.
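To make "metric" concrete: under calendars like (2) or (3), decoding a coordinate value is pure arithmetic on the values -- exactly what `timedelta` already does. A small sketch (the 2000-01-01 epoch and the coordinate value here are just illustrative):

```python
from datetime import datetime, timedelta, timezone

# With a "metric" time axis, decoding "seconds since <epoch>" is pure
# arithmetic: every day is 86400 s and no leap-second table is consulted.
# This is what nearly every datetime library does -- i.e. TAI-style math.
epoch = datetime(2000, 1, 1, tzinfo=timezone.utc)  # illustrative epoch
value = 366 * 86400 + 3600.0  # coordinate value: 366 days + 1 h after epoch

decoded = epoch + timedelta(seconds=value)
print(decoded.isoformat())  # -> 2001-01-01T01:00:00+00:00 (2000 is a leap year)
```

Differences between any two coordinate values are likewise directly meaningful as elapsed time, which is the property the non-metric encoding in (5) below gives up.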
(As for days, they could maybe cause issues with some datetime libs, but only those that account for leap seconds, so I doubt it.)

5) I think this is the contentious one: do we have a calendar (encoding, really) that is elapsed time since a UTC timestamp, but with the elapsed time computed from a correct-with-regard-to-leap-seconds UTC timestamp using a library that does not account for leap seconds? This would mean that the values themselves may not be "metric". I think this is what Jim is proposing. (By the way, times in the future -- for which leap seconds are unknown as of the creation of the file -- should be disallowed.)

Arguments for (Jim, you can add here :-) ):

* People are already creating time variables like this -- it would be nice to be able to explicitly declare that that's what you've done, so folks can interpret them exactly correctly.
* Since a lot of instruments, computers, etc. use UTC time with leap seconds applied, and most time processing libraries don't support leap seconds, folks will continue to produce such data, and in fact have little choice but to do so.

Arguments against:

* This is technically incorrect data -- it says "seconds since", but it isn't actually always seconds since. We should not bless incorrect data as CF compliant. Bad libraries are not CF's responsibility.
* A time axis created this way will be non-"metric" -- that is, you can't compute elapsed time correctly directly from the values. This is likely to lead to confusion, but worse, to hard-to-detect hidden bugs: code that works on almost any dataset might suddenly fail if a value happens to fall near a leap second, and you get a zero-length "second" (or even a negative one? -- is that possible?).
* (Same as above, really) -- a time variable of this sort can only be used correctly if it is first converted to UTC timestamps.
* There may be issues with processing this data with some (most?) time libraries -- in particular, the ones that don't account for leap seconds.
This is because if you convert to a UTC timestamp with leap seconds, you can get a minute containing 61 seconds, for example:

December 31, 2016 at 23:59:60 UTC

And some time libraries do not allow that. Example: Python's `datetime`:

```
In [3]: from datetime import datetime

In [4]: datetime(2016, 12, 31, 23, 59, 60)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-a8e1ba1d62e5> in <module>()
----> 1 datetime(2016, 12, 31, 23, 59, 60)

ValueError: second must be in 0..59
```

Given these trade-offs, I think CF should not support this -- but if others feel differently, fine -- just do not call it "UTC" or "TAI"! And document it carefully!

That last point is key: this entire use case is predicated on the idea that folks are working with full-on, proper, leap-second-aware UTC timestamps but processing them with a non-leap-second-aware library, and that this is a fully definable and reversible process. But at least with one commonly used datetime library (Python's built-in `datetime`), it simply will not work for every single case. It will work for almost every case -- so someone could process this data for years and never notice -- but it's not actually correct!

In fact, I suspect most computer systems can't handle December 31, 2016 at 23:59:60 UTC, and will never give you that value. Rather (IIUC), they accommodate leap seconds by resetting the internal clock so that "seconds since the epoch" gives the correct UTC time when computed without leap seconds -- but that reset happens at best one second too late (so that you won't get that invalid timestamp).

All this leads me to believe that if anyone really cares about sub-second-level precision over a period of years, then they really, really should be using TAI; and if they THINK they are getting one-second precision, they probably aren't, or they have hidden bugs waiting to happen. I don't think we should support that in CF.
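For contrast, here is roughly what a leap-second-aware elapsed-time calculation has to look like -- a sketch using a hand-maintained (and deliberately partial) table of leap-second insertion instants; a real library would need the full, regularly updated IERS list:

```python
from datetime import datetime, timezone

# Partial, hand-maintained table of recent leap-second insertions: each
# entry is the UTC second 23:59:59 immediately before an inserted :60.
LEAP_SECONDS = [
    datetime(2015, 6, 30, 23, 59, 59, tzinfo=timezone.utc),
    datetime(2016, 12, 31, 23, 59, 59, tzinfo=timezone.utc),
]

def elapsed_seconds_utc(t0, t1):
    """Naive difference plus one extra SI second per leap second in [t0, t1)."""
    extra = sum(1 for ls in LEAP_SECONDS if t0 <= ls < t1)
    return (t1 - t0).total_seconds() + extra

t0 = datetime(2016, 12, 1, tzinfo=timezone.utc)
t1 = datetime(2017, 1, 1, tzinfo=timezone.utc)
print(elapsed_seconds_utc(t0, t1))  # -> 2678401.0 (31 days plus one leap second)
```

Note the maintenance burden this implies: the table must be updated every time the IERS announces a leap second, which is exactly why so few libraries bother.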
Final point:

> When you read time out of a GPS unit, you can get a count of seconds since the GPS epoch, and I believe you can get a GPS time stamp that doesn't contain leap seconds (like TAI time stamps, but with a fixed offset from TAI), but most people get a UTC time stamp. The GPS messages carry the current leap second count and receivers apply it by default when generating time stamps.

OK -- but I suspect that yes, most people get a UTC timestamp, and most people don't understand the difference, and most people don't care about second-level accuracy over years. The "over years" part is because if you have, say, a GPS track you are trying to encode in CF, you should use a reference timestamp that is close to your data -- maybe the start of the day you took the track. So unless you happen to be collecting data when a leap second occurs, there will be no problem.

For those few people who really do care about utmost precision: they should use the TAI timestamp from their GPS -- and if it's a consumer-grade GPS that doesn't provide that, they should get a new GPS! It's probably easier to figure out how to get TAI time from a GPS than it is to find a leap-second-aware time library :-)

Side note: is anyone aware of a proper leap-second-aware time library?

Sorry for the really long note -- but I do think we are converging here, and I won't put up a stink if folks want to add the sort-of-UTC calendar, as long as it's well named and well documented.

-CHB

https://github.com/cf-convention/cf-conventions/issues/148#issuecomment-434805310
