Hi all,
After further reflection, I would like to clarify and slightly modify my
proposal. I suggest:
1) CF not try to accommodate folks using "wrong" software.
2) we relax our requirement that udunits be able to handle the time
coordinate, because udunits won't recognize and interpret "UTC" and "GPS".
3) If the units attribute indicates that "UTC" is used for reference
time, this would imply that "UTC" should also be used to record the time
coordinate values. If the units attribute indicates that "GPS" is used
for reference time, this would imply that "GPS" should also be used to
record the time coordinate values.
4) I originally proposed that there be no difference between "GPS" and
the current units attribute. I now see that this won't allow us to
handle all cases. So, I now propose that if neither "GPS" nor "UTC" is
included as part of the units attribute (as is the case in all current
CF-compliant files), then we should interpret this as meaning that the
data provider has assumed that the calendar and earth orbital period
neglect leap seconds entirely (and that leaving them out doesn't matter
to them). The "no suffix" case would apply to all models I know of and
observational data where differences in sampling time of up to 16
seconds (as of now) can be tolerated.
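
A minimal sketch of how a reader might pick the proposed suffix out of a units attribute. The regular expression and the function name `split_time_units` are hypothetical; neither the suffix nor this parser is part of udunits or of the current CF conventions.

```python
import re

# Hypothetical pattern for the proposed units syntax: "<unit> since <reference> [UTC|GPS]".
# Neither the suffix nor this parser is part of udunits or of the current CF conventions.
_UNITS_RE = re.compile(
    r"^\s*(?P<unit>\w+)\s+since\s+(?P<ref>.+?)\s*(?P<scale>UTC|GPS)?\s*$"
)


def split_time_units(units):
    """Return (unit, reference_time_text, scale); scale is None when no suffix is present."""
    match = _UNITS_RE.match(units)
    if match is None:
        raise ValueError(f"not a time units string: {units!r}")
    return match.group("unit"), match.group("ref"), match.group("scale")


print(split_time_units("days since 1990-1-1 0:0:0 UTC"))
print(split_time_units("days since 1950-1-1 0:0:0 GPS"))
print(split_time_units("days since 1950-1-1 0:0:0"))
```

A scale of `None` corresponds to the "no suffix" case described in item 4, where leap seconds are neglected entirely.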
The above means that when comparing datasets where we want to consider
the state of the system at a particular time of year (i.e., at the same
orbital longitude), we should make sure to correctly compute the
time-stamps from the elapsed time (which under CF is what is stored as a
coordinate variable). Different algorithms would have to be used
depending on whether or not the reference time stored as part of the
units attribute includes the "GPS" suffix or the "UTC" suffix. The
algorithm used to compute the "UTC" timestamp could also be used to
generate the time-stamp for the normal model case which has no leap
seconds (i.e. where the units attribute omits both "GPS" and "UTC").
Once the time-stamps are known, then one could convert these back to
elapsed time using a single approach (say, UTC) for all datasets (using
a common reference time). We would then know how the sampling times
across datasets were staggered, and the sample values could be
interpolated in time to a common set of times. Going to this much
trouble would, of course, be unwarranted in many cases. In fact I think
only if one of the datasets is "GPS" would we ever have to do this.
Even for GPS, if errors of up to 16 seconds in time could be tolerated,
one could simply take the original time-values for each dataset and
interpolate the data values to common times.
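
The comparison described above can be sketched as follows. This is an illustration only: it assumes the GPS-minus-UTC offset is the 16 seconds mentioned above (the value in effect from 2012-07-01 through 2015-06-30), that no leap second falls inside the span covered by the data, and the helper name `to_utc_timestamps` is made up for this example.

```python
from datetime import datetime, timedelta

# Assumption: GPS runs 16 s ahead of UTC, which held from 2012-07-01 through 2015-06-30.
GPS_MINUS_UTC = timedelta(seconds=16)


def to_utc_timestamps(ref, elapsed_seconds, scale):
    """Turn elapsed seconds into UTC timestamps.

    Plain calendar arithmetic is used, so this is valid only when no leap
    second falls between the reference time and the last sample.
    """
    stamps = [ref + timedelta(seconds=s) for s in elapsed_seconds]
    if scale == "GPS":
        # A GPS clock reading T corresponds to the UTC reading T - 16 s.
        stamps = [t - GPS_MINUS_UTC for t in stamps]
    return stamps


# Two datasets with the same reference string and the same coordinate values,
# one marked "GPS" and one marked "UTC".
ref = datetime(2015, 1, 1)
gps_stamps = to_utc_timestamps(ref, [0, 3600], "GPS")
utc_stamps = to_utc_timestamps(ref, [0, 3600], "UTC")

# Convert both back to elapsed time against one common UTC reference.
common_ref = datetime(2015, 1, 1)
gps_elapsed = [(t - common_ref).total_seconds() for t in gps_stamps]
utc_elapsed = [(t - common_ref).total_seconds() for t in utc_stamps]

print(gps_elapsed)  # [-16.0, 3584.0]
print(utc_elapsed)  # [0.0, 3600.0]
```

The two lists show how the sampling times are staggered; interpolating the data values to a common set of times would be the next step when that 16 second stagger matters.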
best regards,
Karl
On 6/29/15 11:36 AM, Jim Biard wrote:
Karl,
On 6/29/15 2:00 PM, Karl Taylor wrote:
Hi all,
I haven't followed all this as closely as I would have liked, but
will hazard some comments anyway:
1. I think we should require that elapsed time (recorded by the
time-coordinate in CF files) be correct no matter what the calendar.
So samples taken at a specific interval have identical
time-differences (calculated from the difference between successive
time coordinate values) whether or not a leap second was
introduced (or dropped?).
I agree, in principle. The way leap seconds worm their way into
elapsed times is by using the "wrong" software when calculating
elapsed times from timestamps. POSIX (that is, Unix/Linux) time
conversion functions do not normally account for leap seconds, so if
you use them with UTC times, you put yourself at risk in a couple of
different ways.
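
As an illustration of that risk, here is a small sketch using Python's standard `datetime` module, which, like the POSIX functions, has no notion of leap seconds. The dates are chosen to straddle the leap second at the end of 2015-06-30 UTC; the variable names are only for this example.

```python
from datetime import datetime, timezone

before = datetime(2015, 6, 30, 23, 59, 59, tzinfo=timezone.utc)
after = datetime(2015, 7, 1, 0, 0, 0, tzinfo=timezone.utc)

# Leap-second-unaware arithmetic: every day is treated as exactly 86400 s long.
naive_elapsed = (after - before).total_seconds()

# The inserted second 23:59:60 lies between these two UTC readings, so the
# physically elapsed time is one second longer than the naive difference.
true_elapsed = naive_elapsed + 1

print(naive_elapsed, true_elapsed)  # 1.0 2.0

# The leap second itself cannot even be represented as a datetime.
try:
    datetime(2015, 6, 30, 23, 59, 60, tzinfo=timezone.utc)
    leap_second_representable = True
except ValueError:
    leap_second_representable = False
print(leap_second_representable)  # False
```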
2. I think we should allow for basetime (the reference time stamp) to
be recorded either as UTC or GPS. Absence of either of these would
imply "GPS", which will therefore apply to all data written until now
under CF.
This is great for real-world acquired times, but model times are in a
non-real time system - neither GPS nor UTC. There is also the
possibility of reference times recorded in the TAI time system, but we
can handle that by adding yet another calendar (and so on for sidereal
time, etc, etc).
3. Folks *generating* CF files should use algorithms that correctly
convert their timestamps to elapsed time (which is recorded in the
files). Then users can regenerate the timestamps correctly by
looking at whether UTC or GPS (or neither) appears.
They should, but many haven't in the past, and the added complexity of
getting UTC time conversion functions that understand leap seconds
(one is going to be added at midnight July 1!) is overkill. When your
time resolution is on the order of 1 hour, none of this matters.
4. Couldn't all of the above be simply accommodated with a single
"gregorian" calendar, but with the basetime (reference time stamp in
the units attribute) including either "UTC" or "GPS" at the end (or
neither for compatibility with previous written CF data)? Examples:
"days since 1990-1-1 0:0:0 UTC" or "days since 1950-1-1 0:0:0 GPS"
(which would be equivalent to "days since 1950-1-1 0:0:0").
This would be another way to tackle it. In the past, CF also allowed
for time zone offsets, so you would need to add that as well.
5. Data would not be considered CF-compliant if the elapsed time
(recorded by the time coordinate) were incorrect because it had been
incorrectly converted by the data provider. "UTC" would indicate
that the basetime and conversion of elapsed time to timestamps should
follow the rules of UTC and include leap seconds. "GPS" (or absence
of "UTC" for backward compatibility) would indicate that the basetime
and conversion of elapsed time to timestamps should follow the rules
of GPS and *not* include leap seconds.
Note that UTC and GPS in the reference times (or in the calendar)
don't say anything about how to create timestamps. They tell you how to
read the reference time. How you produce timestamps from the elapsed
times is up to the data consumer. It is entirely correct to take a
time variable with a reference time in GPS, correctly convert the
reference time to UTC, and then produce UTC timestamps. The
information in the time variable tells me what I have in hand (what
the data producer did), not what I'm supposed to do with it.
I'm sure I have missed some important use case where the above simple
scheme would be inadequate. (Or perhaps I'm just completely out to
lunch, in which case please forgive me).
From my perspective there are two gaps. One is model time, which is
entirely non-physical (has no reference point that ties it into the
real world), and the other is the very real and likely continuing case
where the (in-)precision of the time measurements makes the whole
thing moot.
I'm thinking that Nan's suggestion of specifying an uncertainty for
the elapsed time values (which when left off could imply that
"anything goes" as far as leap seconds are concerned) may be an elegant
way through this.
We could (whether in the calendar or the units) have:
* gregorian, which would not take an uncertainty attribute and would
indicate that the time base and precision are unknown (good for
models and for backward compatibility)
* gregorian + utc, which would take an uncertainty attribute and
would indicate that the reference time stamp is expressed in UTC,
and that elapsed times have no artifacts to the level of the
uncertainty (true UTC with "wrong" conversions but only accurate
to 1 minute would validly fit in this category)
* gregorian + gps, which would be like gregorian + utc, except that
the reference time stamp is expressed in GPS.
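
A rough sketch of how the three options might be checked on a time variable's attributes. The spellings "gregorian_utc", "gregorian_gps", and "uncertainty" are placeholders, since the thread has not settled on names, and treating the uncertainty attribute as mandatory for the UTC and GPS variants is an assumption made for this example.

```python
# Hypothetical spellings: "gregorian_utc", "gregorian_gps", and "uncertainty" are
# placeholders for names the thread has not agreed on.
TAKES_UNCERTAINTY = {
    "gregorian": False,
    "gregorian_utc": True,
    "gregorian_gps": True,
}


def check_time_attrs(attrs):
    """Return a list of problems found in a time variable's attributes."""
    calendar = attrs.get("calendar", "gregorian")
    if calendar not in TAKES_UNCERTAINTY:
        return [f"unknown calendar {calendar!r}"]
    has_uncertainty = "uncertainty" in attrs
    # Assumption: the UTC and GPS variants require the attribute.
    if TAKES_UNCERTAINTY[calendar] and not has_uncertainty:
        return [f"{calendar} needs an uncertainty attribute"]
    if has_uncertainty and not TAKES_UNCERTAINTY[calendar]:
        return ["gregorian does not take an uncertainty attribute"]
    return []


print(check_time_attrs({"calendar": "gregorian"}))
print(check_time_attrs({"calendar": "gregorian_utc"}))
print(check_time_attrs({"calendar": "gregorian_gps", "uncertainty": 60.0}))
```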
Grace and peace,
Jim
best regards,
Karl
On 6/29/15 9:21 AM, Jonathan Gregory wrote:
Dear Tim and Nan
If I have understood correctly, I think your two emails suggest that we do need
a distinction of the precise and imprecise cases. As usual, I believe that CF
should not prescribe to users what they should do; its aim is to allow them to
describe what they have done. Different levels of precision are needed for
different datasets.
Following the emails that Jim and I exchanged, we could distinguish:
gregorian: Real-world times, but without specifying whether UTC or GPS
timestamps are intended, nor whether the encoding was done with or without leap
seconds. The decoded times could differ by several seconds from UTC. I think
this is Nan's use-case.
gregorian_nls: UTC timestamps were encoded without leap seconds, with a
reference UTC timestamp. I think this is Tim's use-case. This is not accurate
according to UTC but it can be decoded precisely as intended. Jim points out
that it's not a real-world calendar, but it's not far off.
Have I correctly described these as your cases?
In addition, we propose two other calendars:
gregorian_utc: The encoded and reference timestamps are UTC, and the encoding
is done with leap seconds allowed for. Hence the time coord is an accurate
elapsed time.
gregorian_gps: The encoded and reference timestamps are GPS, and the encoding
is done without leap seconds. Again, the time coord is an accurate elapsed time.
I think this is the use-case which originally started this thread!
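
The difference between the gregorian_nls and gregorian_utc encodings can be sketched as below. This is only an illustration: the leap second table is deliberately partial (two entries), and the function names `encode_nls` and `encode_utc` are made up for this example.

```python
from datetime import datetime

# Partial table, for illustration only: the UTC midnight that immediately
# follows each inserted leap second. A real implementation needs the full list.
LEAP_SECOND_ENDS = [datetime(2012, 7, 1), datetime(2015, 7, 1)]


def encode_nls(ref, stamp):
    """gregorian_nls: UTC timestamps encoded as if leap seconds did not exist."""
    return (stamp - ref).total_seconds()


def encode_utc(ref, stamp):
    """gregorian_utc: add one second for each leap second inserted after ref
    and no later than stamp, giving an accurate elapsed time."""
    leaps = sum(1 for end in LEAP_SECOND_ENDS if ref < end <= stamp)
    return encode_nls(ref, stamp) + leaps


ref = datetime(2015, 6, 30)
stamp = datetime(2015, 7, 1)

print(encode_nls(ref, stamp))  # 86400.0
print(encode_utc(ref, stamp))  # 86401.0
```

For gregorian_gps the arithmetic would be the same as in `encode_nls`, but applied to GPS timestamps, which have no leap seconds, so the result is already an accurate elapsed time.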
Best wishes
Jonathan
_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
--
*Jim Biard*
*Research Scholar*
Cooperative Institute for Climate and Satellites NC <http://cicsnc.org/>
North Carolina State University <http://ncsu.edu/>
NOAA National Centers for Environmental Information
<http://ncdc.noaa.gov/>
/formerly NOAA’s National Climatic Data Center/
151 Patton Ave, Asheville, NC 28801
e: [email protected] <mailto:[email protected]>
o: +1 828 271 4900