Dear Jonathan,

I think we will make it harder for users to write and interpret CF data if the calendar attribute is too complicated and the meaning of "gregorian" changes from what it meant in the past. In the past we got away with a single "gregorian" option, and I suspect all the CF-compliant model output and nearly all the observational data stored under CF would be correctly interpreted if the definition of "gregorian" included the following sentences:

----------------------------------
Under the "gregorian" calendar the solar day can be assumed to be exactly 86400 seconds long (i.e., there are no leap seconds). This means that for models where this assumption is almost invariably valid, conversion from elapsed time to clock time is straightforward and exact, whereas for observations, conversion to clock time may introduce errors as large as 16 seconds because it is unknown whether the UTC or GPS time system has been used in specifying the reference times (appearing in the time units attribute), and it is also unknown whether leap seconds have been properly accounted for in converting UTC clock times to elapsed time.
----------------------------------
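
As a minimal sketch of the decoding described in the proposed text above (Python is used only for illustration; the reference date, the elapsed value, and the function name decode_no_leap are hypothetical and not part of any CF tool), elapsed seconds can be turned into a clock time by assuming every day is exactly 86400 seconds long. Python's datetime arithmetic happens to make the same assumption, since it has no notion of leap seconds:

```python
from datetime import datetime, timedelta

# Fixed day length assumed by the proposed "gregorian" definition (no leap seconds).
SECONDS_PER_DAY = 86400


def decode_no_leap(elapsed_seconds, reference):
    """Convert elapsed seconds since `reference` to a clock time.

    datetime/timedelta arithmetic ignores leap seconds, so every day
    is treated as exactly 86400 seconds long.
    """
    return reference + timedelta(seconds=elapsed_seconds)


# Hypothetical time axis: units = "seconds since 2000-01-01 00:00:00".
reference = datetime(2000, 1, 1)
elapsed = 5660 * SECONDS_PER_DAY  # a whole number of 86400-second days

decoded = decode_no_leap(elapsed, reference)
print(decoded)
```

For model output this conversion is exact: a whole number of 86400-second days always lands on midnight. For observations the same arithmetic could be off by up to the 16 seconds mentioned above, because the file alone does not say which time system or leap-second treatment was used.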

Interpreted as above, the "gregorian" calendar would make it possible for users to invariably decode *model* output and not encounter the problems you discussed in your first paragraph. Of course with *observations*, they might encounter such problems, but that's because the observationalist storing the data is apparently o.k. with errors of up to 16 seconds (otherwise they would rewrite their data with one of the newly proposed calendars specifying UTC or GPS).

In the future, I think we should interpret "gregorian" the same as we have in the past, but we would also offer two new calendars (gregorian_utc and gregorian_gps) for those who need to indicate that their reference times are defined by a specific time system, and one more calendar (gregorian_utc_nls) for those who choose not to properly account for leap seconds in converting from UTC clock time to elapsed time. These new calendars would mostly be used for observations, but conceivably there might be a model initialized from observations (and subsequently compared against observations perhaps only a few seconds later), where one would want to precisely record whether the reference time (included in the units) follows the UTC or GPS time system, just as in the observational data set it is being compared with. In these (rare) cases, the calendar would be indicated as being either gregorian_utc or gregorian_gps for the model output, just as in the observational data set.

You argue that interpretation of "gregorian" depends on whether it describes observations or model output. That's true, and apparently that has always been the case. We can't change that, and why should we change it going forward?

I don't see a case for including gregorian_nls (for models), unless we decide to redefine "gregorian" to mean:

"a calendar that: 1) might or might not account for leap seconds, 2) might or might not assume the solar day is exactly 86400 seconds long, and 3) might express the reference time according to either UTC or GPS"

This definition would also be consistent with past usage of "gregorian" but would make virtually all the model data already stored under CF with calendar="gregorian" seem to be imprecise in specifying the time coordinates, even though the coordinates are in fact defined such that they can be converted to wall clock time assuming the solar day is exactly 86400 seconds long. If you want to adopt this alternative definition (rather than the one I suggest in the 2nd paragraph above), then we should probably introduce "gregorian_nls" as a calendar/time system for which the solar day is exactly 86400 seconds long. In the future, gregorian_nls would probably be used (instead of "gregorian") in all but a few model-produced datasets.

best regards,
Karl


On 7/14/15 10:48 AM, Jonathan Gregory wrote:
Dear Karl

Thank you for your useful summary, which I think is quite right. That will
provide some good text for the standard document.

You suggest merging gregorian_nls (for models, exactly 86400-second days)
into gregorian (imprecise about which calendar is used and how encoded),
distinguishing them according to whether the data is model or observational.

I'm not comfortable with that. I can't think of another case in CF where the
metadata is designed to be interpreted differently for models and observations,
and it would not be easy to do, because there's no metadata that is guaranteed
to be present in a standard form to tell you if it's model or observational.
Yet I think this distinction must be made. It would not be satisfactory if
users interpreted the imprecision of "gregorian" to mean they could decode
model data e.g. from CMIP using the UTC calendar, and found days that appear to
start 16 seconds different from midnight. I am sure this would cause problems
e.g. wrong months selected. That's why I think we need gregorian_nls as a model
calendar, to be used instead of gregorian in future where applicable. We need
to be able to assert that the 86400-second day definitely applies.
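
The month-selection problem described above can be sketched as follows. This is only an illustration in Python: the 16-second offset is the figure quoted in this thread, the timestamp and the helper name month_of are hypothetical, and the direction of the shift (backwards) is an assumption made for the example.

```python
from datetime import datetime, timedelta

# Offset quoted in this thread; illustrative only, not looked up from a leap-second table.
ASSUMED_OFFSET = timedelta(seconds=16)


def month_of(timestamp):
    """Return the (year, month) bin a timestamp would be assigned to."""
    return (timestamp.year, timestamp.month)


# Hypothetical model timestamp: a day that starts exactly at midnight
# when decoded with 86400-second days.
model_stamp = datetime(2015, 7, 1, 0, 0, 0)

# What a decoder applying leap seconds might report instead,
# assuming the shift is backwards by the quoted offset.
shifted = model_stamp - ASSUMED_OFFSET

print(month_of(model_stamp))
print(month_of(shifted))
```

Under these assumptions the shifted timestamp falls just before midnight on the last day of the previous month, so a monthly selection would place the value in the wrong month.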

I agree with Jim that there is a distinction between gregorian_utc_nls and
gregorian too. Some people supplying observational data don't require the
precision of specifying UTC (or GPS), so they don't want to choose
gregorian_utc or gregorian_gps. Nan argued this case. Others however may wish to be
precise about UTC timestamps, but choose to encode it without leap seconds.
So I think we need the meaning of gregorian_utc_nls.

However, on reflection I convinced myself (at least! - but not Jim) that the
distinction between gregorian_nls (for models) and gregorian_utc_nls (for
the real world) is too subtle to make reliably, so I suggested we should use
gregorian_nls for both, and say that *if* it is observational data, it must
be UTC. That's not quite the same as your suggestion, because the timestamps
can be exactly recovered without knowing if it's model or observational, but
you would need to know in order to tell whether the elapsed times are accurate
(as they are for model data) or perhaps not accurate (for real world data).
Whereas I regard timestamps as more important, Jim tends to regard elapsed
times as more important, so I guess this second issue would count more for him.
If it is crucial, then we need both gregorian_nls and gregorian_utc_nls. The
distinction is whether it is model or real-world time. My concern is that
when models are used to simulate events that happened in real-world time,
data-producers may often find it hard to decide between these alternatives,
and it's unclear whether it's useful to do so anyway.

Best wishes

Jonathan
_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
