Jonathan,

On 6/12/15 12:30 PM, Jonathan Gregory wrote:
Dear Jim

It appears that we have some irreconcilable differences in interpretation but
I am not sure that they make any difference in practice, which is fortunate
if correct.

The definition of a calendar includes (by implication) the rules for how
to convert between an elapsed time count since a reference epoch and
a timestamp as expressed in that calendar.
Yes. The elapsed time count is the time coordinate. That's what I mean by the
rules for encoding and decoding. In addition, the calendar attribute states
the calendar of the reference time. (I can't understand why taking these
together doesn't mean that the attribute implies the calendar of the decoded
timestamps too, but I don't mind if we don't say that! I've always assumed
that the calendar is regarded as a property of the time coordinate construct
as a whole, because it's an attribute of the time coordinate variable.)
I'm good with that!
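For concreteness, here is what that encode/decode round trip looks like in practice. This is just a sketch of mine using the cftime package (any udunits-style library would behave the same way):

    import cftime

    units = "days since 2015-01-01 00:00:00"
    calendar = "standard"

    # Encode a timestamp as an elapsed-time coordinate value...
    value = cftime.date2num(
        cftime.datetime(2015, 6, 12, 12, 30, calendar=calendar),
        units, calendar=calendar)

    # ...and decode it back using the same units and calendar.
    stamp = cftime.num2date(value, units, calendar=calendar)
    print(value, stamp)  # roughly 162.52, 2015-06-12 12:30:00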
Yes, I agree that you can change the calendar without modifying the time
coordinate values if you instead alter the reference time so that it refers
to the same instant of real time in the new calendar. (I was considering the
case when you wished to keep the same reference time, say 1st Jan 2015, but
alter the calendar. In that case you have to add an offset to the coordinate
values, which can be done by decoding and re-encoding or by other means, but
the method used isn't material to the definition, so there is no need to say
how this is done.)

I agree again. The means is not important in a practical sense. When attempting to define the convention, it's a useful exercise to understand what the "essential" elements are.
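To illustrate the "same instant, different description" point, here is a hypothetical sketch of mine using cftime (the 13-day Julian/Gregorian offset applies to present-day dates): Gregorian 2015-01-01 and Julian 2014-12-19 name the same instant, so shifting the reference time lets the coordinate values stand unchanged.

    import cftime

    # The same coordinate value decodes to the same real-world instant under
    # either description, because the two reference times name the same instant.
    greg = cftime.num2date(10.0, "days since 2015-01-01",
                           calendar="proleptic_gregorian")
    jul = cftime.num2date(10.0, "days since 2014-12-19", calendar="julian")
    print(greg, jul)  # 2015-01-11 (Gregorian) and 2014-12-29 (Julian): same instant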
Matching timestamps without conversion is exactly what you would
*not* want to do when comparing observations recorded using two
different real-world calendars.
That is true. But it is exactly what you do want to do when comparing real-
world and model calendars, or different model calendars, which is the common
case for a lot of CF data. In that common case, the timestamps are primary.
For example, climatological means, for which we have a CF convention, assume
that users want to deal with time in a timestamp-based way, in which months
are equivalent regardless of calendar. It is inexact in a sense, but it's what
we need to do. More generally, you could argue that CF is quite a lot concerned
with the general problem of comparing apples and quinces. CF provides metadata
which enables users to indicate that the comparison is valid, by giving the
two things the same label.
Yes. The needs of the observations (and particularly satellite) community and the modeling community are different. A big part of my motivation in all this has been to make sure we don't paint anyone into a corner as we introduce changes to the time conventions. I'd say (as you mention below) that the job of CF is to describe what is present in a file, so that users can make informed decisions about how to use the data. I think it's best if CF doesn't try to tell users how they are supposed to use the data (as you mention below as well).
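As a concrete (and entirely hypothetical) sketch of the timestamp-first comparison Jonathan describes, matching "June" between a 360_day model series and a standard-calendar observed series means matching decoded timestamp fields, not elapsed times:

    import cftime

    # Decode elapsed times in each dataset's own calendar, then match on the
    # month field -- "June is June" regardless of calendar.
    model_times = cftime.num2date([150.5, 165.5], "days since 2000-01-01",
                                  calendar="360_day")
    obs_times = cftime.num2date([152.5, 167.5], "days since 2000-01-01",
                                calendar="standard")

    model_june = [t for t in model_times if t.month == 6]
    obs_june = [t for t in obs_times if t.month == 6]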
I don't think the question of whether timestamps are primary makes a material
difference to the convention or its use. It would be fine to state that the
encoded time coordinate is an elapsed time and has no discontinuities, except
for small ones in the case of using the no-leap-second calendar to encode UTC
time. I think we have to support this case because it is almost certainly in
use, perhaps widespread use, and it's better to be explicit about it. CF does
not generally try to tell people what to do, but to enable them to describe
clearly what they have done.
There are almost certainly files "in the wild" that have encoded discontinuities and/or offsets due to unwary application of software. I am completely fine with having some verbiage in the conventions mentioning this, but I'm not so sure about providing a way to tell someone, "I played fast and loose with the time data."
I agree that seconds are always the same length. If only days were too! (But
they are in udunits, so we could warn against using days as a unit of time
in calendar="gregorian_utc" because it could be confusing, although encoding
and decoding would work correctly.)
Time is horribly messy! Of course, from the udunits perspective, all days do have the same length. The confusion crops up if people start with UTC timestamps, encode them correctly, and expect their elapsed day numbers to remain nice integers, half-integers (1.5, 2.5, etc.), or other "nice" numbers over time scales longer than the interval between leap second insertions. Or if people start by storing "nice" elapsed times and expect them all to decode to "nice" timestamps over those same long time scales.
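To make that concrete, here is a small sketch of mine using astropy (which accounts for leap seconds): the UTC day containing the 30 June 2015 leap second is 86401 SI seconds long, so correctly encoded day boundaries stop landing on multiples of 86400 afterwards.

    from astropy.time import Time

    # Elapsed SI seconds across the UTC day that contained the 2015-06-30
    # leap second: 86401 rather than 86400.
    t0 = Time("2015-06-30T00:00:00", scale="utc")
    t1 = Time("2015-07-01T00:00:00", scale="utc")
    print((t1 - t0).sec)  # 86401.0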

So it sounds like the only real open question is what, if anything, to do about cases where people use the wrong time functions to encode their elapsed times. (We can't do anything about people using the wrong functions to decode.) It seems to me that this amounts to providing a way for people to describe what algorithm was used to encode their elapsed times and what calendar their original timestamps were expressed in (assuming that they started with timestamps), independent of what calendar the reference timestamp is expressed in. I think it's pushing the calendar definition pretty hard to fold all three of those components into a single package.

As an example, the likely options I can think of involving a Gregorian calendar with GPS and UTC are shown in the table below (there are plenty more combinations, but I hope the others are highly unlikely and don't make much difference):

Reference timestamp calendar | Input timestamp calendar | Encoding algorithm | Comment
-----------------------------|--------------------------|--------------------|--------
UTC or GPS                   | UTC                      | UTC                | No errors.
UTC or GPS                   | UTC                      | GPS                | Possible errors. Input timestamps assumed leap second free.
UTC or GPS                   | GPS                      | UTC                | Possible errors (this option is pretty unlikely). Input timestamps assumed to have leap seconds.
UTC or GPS                   | GPS                      | GPS                | No errors.
UTC or GPS                   | none                     | none               | No errors. Original times are elapsed times.


The GPS timestamp-to-elapsed-time encoding algorithm is (apart from the epoch date and time) the same as the POSIX algorithm, assuming you haven't enabled POSIX leap second sensitivity (which is possible, but almost no one does). It handles Gregorian leap days, but not leap seconds.
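To be explicit about what that algorithm does, here is a minimal sketch of mine (the function name and the parameterized epoch are my own choices): it counts Gregorian calendar days, treats every day as exactly 86400 seconds, and so never sees a leap second.

    from datetime import datetime

    GPS_EPOCH = datetime(1980, 1, 6)  # GPS epoch; POSIX uses 1970-01-01

    def encode_no_leap_seconds(timestamp, epoch=GPS_EPOCH):
        """Elapsed seconds since `epoch`, assuming every day is exactly 86400 s.

        Gregorian leap days are handled by the datetime arithmetic, but leap
        seconds are invisible, as in the POSIX/GPS-style encoding described above.
        """
        delta = timestamp - epoch
        return delta.days * 86400 + delta.seconds + delta.microseconds * 1e-6

    # e.g. encode_no_leap_seconds(datetime(2015, 6, 12, 12, 30))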

Notice that a problem only arises when there are input timestamps and the encoding algorithm doesn't match their calendar. The reference timestamp doesn't get involved. Since the only thing that affects the contents of the time variable is whether or not the encoding algorithm matched the input timestamp calendar, we could compress the input calendar and encoding algorithm columns of the table like so:

Reference timestamp calendar | Properly encoded | Comment
-----------------------------|------------------|--------
UTC or GPS                   | yes              | No errors.
UTC or GPS                   | no               | Possible errors. Improper encoding algorithm used.


I'm not convinced that the best way to signal an encoding algorithm mismatch is to define calendars that mention encoding errors in their definitions. Let's run with the idea, though. If we exclude the unlikely cases (such as Gregorian+GPS with encoding errors), we get:

Calendar          | Definition
------------------|-----------
gregorian_utc     | Reference timestamp expressed in the Gregorian calendar with the UTC time system. Elapsed times are free of leap second errors.
gregorian_utc_lse | Same as gregorian_utc, but elapsed times are not necessarily free of leap second errors. There is no exact conversion between this and other calendars.
gregorian_gps     | Reference timestamp expressed in the Gregorian calendar with the GPS time system. Elapsed times are free of leap second errors.
gregorian_nls     | Reference timestamp expressed in the Gregorian calendar with a generic time system having 86400 seconds per day, based at longitude 0 degrees. There is no exact conversion between this and other calendars.


And, of course, there are the proleptic_gregorian, julian, 360_day, etc. calendars, all of which use the generic NLS time system, and none of which can be exactly converted to other calendars.

If we changed the definition of the 'gregorian' calendar to be the same as the 'gregorian_nls' calendar above and included some text warning people that resolutions finer than one minute are suspect, then we'd have a backward compatibility path. In fact, we could just use 'gregorian' to cover the 'gregorian_utc_lse' case.
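If something along these lines were adopted, a file would look no different from today's; only the calendar string would change. Here is a hypothetical sketch (the name 'gregorian_utc' is one of the proposals above, not an existing CF calendar), using the netCDF4 Python package:

    from netCDF4 import Dataset

    # Hypothetical: 'gregorian_utc' is a proposed name, not an existing CF calendar.
    with Dataset("example.nc", "w") as nc:
        nc.createDimension("time", None)
        time = nc.createVariable("time", "f8", ("time",))
        time.units = "seconds since 2000-01-01 00:00:00"
        time.calendar = "gregorian_utc"
        time[:] = [0.0, 86400.0, 172800.0]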

What do you think?
Best wishes

Jonathan
Grace and peace,

Jim
--
*Jim Biard*
*Research Scholar*
Cooperative Institute for Climate and Satellites NC <http://cicsnc.org/>
North Carolina State University <http://ncsu.edu/>
NOAA National Centers for Environmental Information <http://ncdc.noaa.gov/>
/formerly NOAA’s National Climatic Data Center/
151 Patton Ave, Asheville, NC 28801
e: [email protected] <mailto:[email protected]>
o: +1 828 271 4900


