... and to remind us of the road we're considering taking:
netCDF files are in every sense "binary" files. They cannot be read
except by custom-built utilities. (Or is there a constituency that
wants to access CF using the unix "strings" command?) In all cases
except the present discussion, it is the job of those custom-built
utilities to generate formatted string representations of the
information contained in the CF binary encoded variables.
The entire current discussion would not be happening if the
custom-built utilities and standard code libraries supported the ability
to get time information into and out of our binary files using formatted
ISO 8601 strings. As the saying goes, if the only tool you have is a
hammer, then everything looks to you like a nail. This email forum is
for discussing changes to the CF standard (our hammer), so we are
hammering away at the current need (interoperability with ISO 8601).
There is no doubt in my mind that it is the wrong tool for the task.
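As a rough illustration of the utility-level support described above, the sketch below converts between a CF-style numeric time value and an ISO 8601 string using only the Python standard library. It assumes units of "seconds since 1970-01-01T00:00:00Z" and the proleptic Gregorian calendar that Python's datetime module implements. The function names are hypothetical and do not belong to any netCDF or CF library.

```python
from datetime import datetime, timedelta, timezone

# Assumed reference epoch, matching units of
# "seconds since 1970-01-01T00:00:00Z" (an assumption for this sketch).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def numeric_to_iso(seconds_since_epoch):
    """Format a numeric time offset as an ISO 8601 string (hypothetical helper)."""
    return (EPOCH + timedelta(seconds=seconds_since_epoch)).isoformat()

def iso_to_numeric(iso_string):
    """Parse an ISO 8601 string back into seconds since the epoch (hypothetical helper)."""
    return (datetime.fromisoformat(iso_string) - EPOCH).total_seconds()

stamp = numeric_to_iso(1364456940)
print(stamp)                  # string representation for people
print(iso_to_numeric(stamp))  # numeric value as it would be stored in the file
```

With helpers like these living in the utilities, the file keeps a single numeric storage format and the ISO 8601 string exists only as a representation at the point of input or output.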
- Steve
=========================================
On 3/28/2013 7:49 AM, Jim Biard wrote:
Hi.
The format of the string is not what is being described. That can be
described by the documentation (be it CF, ISO, or a combination). So
what is it that we are trying to describe, apart from questions of
format? (Expanding on Chris's previous mention, a user-defined type
that was a structure with elements for year, month, day, hour, minute,
and second could be used instead of a string.)
It seems to me that we are trying to figure out how to denote that a
variable contains a "non-arithmetic" expression of time, similar to
"degree minute second hemisphere" representations of latitude and
longitude. (Non-arithmetic may be a poor way of expressing what I
mean. I'm trying to say that you can't just take two values and add
or subtract them in an atomic operation.) You can represent such
values in strings, but you can also represent them by packing them
into long integers (to millisecond accuracy). The question of whether
or not this is a wise thing to do is something else altogether.
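To make the "non-arithmetic" point concrete, here is a minimal sketch that packs a date-time into one long integer of the form YYYYMMDDhhmmssmmm. The layout and the function names are assumptions chosen for illustration, not anything defined by CF. The packed values sort correctly, but subtracting them does not yield an elapsed time.

```python
from datetime import datetime

def pack_datetime(dt):
    """Pack a datetime into an integer laid out as YYYYMMDDhhmmssmmm (assumed layout)."""
    packed = dt.year
    for field in (dt.month, dt.day, dt.hour, dt.minute, dt.second):
        packed = packed * 100 + field
    return packed * 1000 + dt.microsecond // 1000

def unpack_datetime(packed):
    """Recover the datetime from the packed integer (millisecond precision)."""
    packed, millis = divmod(packed, 1000)
    packed, second = divmod(packed, 100)
    packed, minute = divmod(packed, 100)
    packed, hour = divmod(packed, 100)
    packed, day = divmod(packed, 100)
    year, month = divmod(packed, 100)
    return datetime(year, month, day, hour, minute, second, millis * 1000)

a = datetime(2013, 3, 31, 23, 59, 59)
b = datetime(2013, 4, 1, 0, 0, 0)

# The two instants are one second apart ...
print((b - a).total_seconds())
# ... but the difference of the packed integers is not 1000 milliseconds.
print(pack_datetime(b) - pack_datetime(a))
```

Any arithmetic on such values has to go through an unpack step (and a calendar), which is the sense in which they cannot be added or subtracted in one atomic operation.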
I see no reason to exclude the use of the units attribute to denote
that the values are expressions of time in which the time since the
epoch has been diced up into years, months, days, hours, minutes, and
seconds (with varying precision indicated by omission of finer
resolution elements). Our current use of the units attribute for time
does more than just specify the units (days vs hours, etc). What are
the units for such a non-arithmetic time value? They are complex. We
could specify something like "years months days" (in the case of a
variable that contained dates only), or we could specify something
like "datetime". When you went to the units table to find out what
datetime means, you would find a description.
As far as that goes, I can see a valid argument for declaring a new
standard name to use for such variables. If we had a standard name
"date" or "datetime", we could use this to differentiate between
arithmetic and non-arithmetic time expressions. The units attribute
could then express which elements were present in the representation,
or such variables could be considered to have no units. We could also
specify that variables with a standard name of "date" (for example)
must be of string type. (This also has a side benefit - at least to
some - of preventing such variables from being used as time axes.)
In all these cases, the calendar attribute is critical to placing the
values into a reference frame, and must be included.
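The rules floated in the last few paragraphs could be checked mechanically. The sketch below is only an illustration of that idea: the standard names "date" and "datetime" are proposals from this thread and are not part of CF, and the function name, the attribute dictionary, and the is_string flag are all hypothetical.

```python
# Standard names proposed in this thread; they are NOT part of the CF standard.
PROPOSED_NAMES = {"date", "datetime"}

def classify_time_variable(attrs, is_string):
    """Classify a variable as arithmetic or non-arithmetic time (hypothetical rules).

    attrs     -- dict of the variable's attributes
    is_string -- True if the variable is stored as a string/char type
    """
    name = attrs.get("standard_name")
    if name in PROPOSED_NAMES:
        # Proposed restriction: such variables must be of string type.
        if not is_string:
            raise ValueError("variables named %r must be of string type" % name)
        # The calendar attribute places the values in a reference frame.
        if "calendar" not in attrs:
            raise ValueError("variables named %r must carry a calendar attribute" % name)
        return "non-arithmetic"
    # Existing CF practice: numeric offsets with "<unit> since <epoch>" units.
    if name == "time" and " since " in attrs.get("units", ""):
        return "arithmetic"
    return "unknown"

print(classify_time_variable(
    {"standard_name": "time", "units": "days since 1970-01-01", "calendar": "gregorian"},
    is_string=False))
print(classify_time_variable(
    {"standard_name": "datetime", "calendar": "gregorian"},
    is_string=True))
```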
Regarding Roy's alternatives, I get serious heartburn when considering
1) and 2). The long name is not supposed to be a place where machines
would go to get information about how to interpret the contents of a
variable. Everybody seems to want to encroach on it lately.
Similarly, the calendar attribute has a specific role, which is to
identify the reference frame for the time information. Adding
type/units information to this attribute just muddies the water even
further.
As far as alternative 3 goes, I have no problem with adding one or
more attributes to such variables if it helps clarify something for
posterity, but I think we must still resolve what to do with standard
name and units for such variables.
Having thought through all of that, I am leaning towards using a
standard name of "date" or "datetime" (and use of units, etc as
described above) if we are going to add non-arithmetic expressions of
time to CF. I would prefer that we stick with the current restriction
that the storage format for times be numeric (that is, in essence,
what we currently have), and leave the question of representation
formats up to other layers, but I understand the desire to have a way
to store human-readable dates/times that would be consistent across files.
I've had many headaches maintaining a proprietary legacy software base
(not netCDF-related) that didn't separate storage and representation
formats because of the amount of code that was needed to handle all of
the cross-conversions.
Grace and peace,
Jim
Jim Biard
Research Scholar
Cooperative Institute for Climate and Satellites
Remote Sensing and Applications Division
National Climatic Data Center
151 Patton Ave, Asheville, NC 28801-5001
[email protected]
828-271-4900
On Mar 28, 2013, at 5:48 AM, "Lowry, Roy K." <[email protected]> wrote:
Dear All,
I think Chris has hit the nail on the head here. In my view neither
the Standard Name nor the units of measure are the way to describe
what is in essence the format of a string. So, what other options
are there open to us? I can see three alternatives:
1) Use the long name to describe the string format (not just the
standard used but the profile)
2) Use the existing calendar attribute
3) Specify a suitable extension to CF to do the job.
These are roughly in my order of preference.
Cheers, Roy.
________________________________________
From: CF-metadata [[email protected]] On Behalf Of Chris Barker
- NOAA Federal [[email protected]]
Sent: 27 March 2013 15:56
Cc: [email protected]
Subject: Re: [CF-metadata] New standard name: datetime_iso8601
(standard_name or units?)
On Wed, Mar 27, 2013 at 8:05 AM, Steve Hankin <[email protected]> wrote:
ISO date-time strings are a way of encoding the physical quantity
that we know as TIME. So TIME is the "right" standard_name for ISO
date-time strings per the definition quoted above.
Now, it may be that there is a compelling argument to violating the normal
definition of standard_name for the case of ISO date-time strings. Or on
the other hand is it preferable to use the units attribute to indicate the
use of an ISO date-time string?
An ISO string for a datetime is not a name (it's still time), but it
is not a unit either.
What it is is a data type -- more akin to a float or integer -- i.e. a
particular way to translate bytes to a value. The bytes are a char
array, and the value is the datetime itself.
I don't know if thinking about it this way is helpful, as we are
building on netcdf, and I don't know that netcdf allows you to define
new data types, but food for thought.
Also, of course, all the other data types in netcdf (and CF) are
direct translations to commonly used binary formats in computers, and
this one is not.
hmm -- a quick peek at the netcdf4 docs says:
"The richer enhanced model supports user-defined types and data
structures"
So maybe this could be a user defined type?
Having said that, I don't support using ISO strings to define
datetimes in CF. I understand particular use-cases, like keeping the
original time stamp from a data collection system and the like, but
then maybe it's really just arbitrary auxiliary text information, in
which case maybe we don't need a standard name or custom data types at
all.
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
[email protected]
_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
This message (and any attachments) is for the recipient only. NERC is
subject to the Freedom of Information Act 2000 and the contents of
this email and any reply you make may be disclosed by NERC unless it
is exempt from release under the Act. Any material supplied to NERC
may be stored in an electronic records management system.