I agree wholeheartedly with Steve!

Jim Biard
Research Scholar
Cooperative Institute for Climate and Satellites
Remote Sensing and Applications Division
National Climatic Data Center
151 Patton Ave, Asheville, NC 28801-5001

[email protected]
828-271-4900

On Mar 28, 2013, at 11:54 AM, Steve Hankin <[email protected]> wrote:

> ... and to remind us of the road we're considering taking
> 
> netCDF files are in every sense "binary" files.  They cannot be read except 
> by custom-built utilities.  (Or is there a constituency that wants to access 
> CF using the unix "strings" command?)  In all cases except the present 
> discussion, it is the job of those custom-built utilities to generate 
> formatted string representations of the information contained in the CF 
> binary encoded variables.
> 
> The entire current discussion would not be happening, if the custom-built 
> utilities and standard code libraries supported the ability to get time 
> information into and out of our binary files using formatted ISO 8601 
> strings.  As the saying goes, if the only tool you have is a hammer, then 
> everything looks to you like a nail.  This email forum is for discussing 
> changes to the CF standard (our hammer), so we are hammering away at the 
> current need (interoperability with ISO 8601).  There is no doubt in my mind 
> that it is the wrong tool for the task.
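
[Illustration, not part of the original message.] A minimal sketch of the kind of utility-layer conversion described above: the file keeps numeric time values, and a helper formats them as ISO 8601 strings or parses them back. The epoch, the function names, and the single string profile are assumptions chosen for the example, and Python's `datetime` only covers the proleptic Gregorian calendar, so other CF calendars are out of scope here.

```python
from datetime import datetime, timedelta, timezone

# Assumed reference epoch, as in units of "seconds since 1970-01-01T00:00:00Z".
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # one ISO 8601 profile, chosen for illustration


def to_iso8601(seconds_since_epoch):
    """Format a stored numeric time value as an ISO 8601 string."""
    return (EPOCH + timedelta(seconds=seconds_since_epoch)).strftime(ISO_FORMAT)


def from_iso8601(text):
    """Parse an ISO 8601 string back into the stored numeric value."""
    parsed = datetime.strptime(text, ISO_FORMAT).replace(tzinfo=timezone.utc)
    return (parsed - EPOCH).total_seconds()


stored_value = from_iso8601("2013-03-28T11:54:00Z")
round_trip = to_iso8601(stored_value)
```

The stored value stays numeric throughout; the string exists only at the boundary where a person or another system reads it.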
> 
>     - Steve
>   
> =========================================
> 
> On 3/28/2013 7:49 AM, Jim Biard wrote:
>> Hi.
>> 
>> The format of the string is not what is being described.  That can be 
>> described by the documentation (be it CF, ISO, or a combination).  So what 
>> is it that we are trying to describe, apart from questions of format?  
>> (Expanding on Chris's previous mention, a user-defined type that was a 
>> structure with elements for year, month, day, hour, minute, and second could 
>> be used instead of a string.)
>> 
>> It seems to me that we are trying to figure out how to denote that a 
>> variable contains a "non-arithmetic" expression of time, similar to "degree 
>> minute second hemisphere" representations of latitude and longitude.  
>> (Non-arithmetic may be a poor way of expressing what I mean.  I'm trying to 
>> say that you can't just take two values and add or subtract them in an 
>> atomic operation.)  You can represent such values in strings, but you can 
>> also represent them by packing them into long integers (to millisecond 
>> accuracy).  The question of whether or not this is a wise thing to do is 
>> something else altogether.
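
[Illustration, not part of the original message.] A minimal sketch of packing a date and time into a long integer to millisecond accuracy, and of why such values are "non-arithmetic". The decimal digit layout YYYYMMDDhhmmssSSS and the function names are assumptions made for the example; the message does not specify a packing scheme.

```python
from datetime import datetime


def pack_datetime(dt):
    """Pack a datetime into decimal digits YYYYMMDDhhmmssSSS (assumed layout)."""
    packed = dt.year
    for field in (dt.month, dt.day, dt.hour, dt.minute, dt.second):
        packed = packed * 100 + field
    return packed * 1000 + dt.microsecond // 1000


def unpack_datetime(packed):
    """Recover the datetime from the packed integer."""
    packed, millisecond = divmod(packed, 1000)
    packed, second = divmod(packed, 100)
    packed, minute = divmod(packed, 100)
    packed, hour = divmod(packed, 100)
    packed, day = divmod(packed, 100)
    year, month = divmod(packed, 100)
    return datetime(year, month, day, hour, minute, second, millisecond * 1000)


before = datetime(2013, 3, 31, 23, 59, 59)
after = datetime(2013, 4, 1, 0, 0, 0)

# Subtracting the packed integers does not yield an elapsed time ...
naive_difference = pack_datetime(after) - pack_datetime(before)
# ... whereas the real interval between the two instants is one second.
true_difference_seconds = (after - before).total_seconds()
```

The packed values sort correctly and fit in a 64-bit integer, but their difference is meaningless as a duration, which is the distinction drawn above.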
>> 
>> I see no reason to exclude the use of the units attribute to denote that the 
>> values are expressions of time in which the time since the epoch has been 
>> diced up into years, months, days, hours, minutes, and seconds (with varying 
>> precision indicated by omission of finer resolution elements).  Our current 
>> use of the units attribute for time does more than just specify the units 
>> (days vs hours, etc).  What are the units for such a non-arithmetic time 
>> value?  They are complex.  We could specify something like "years months 
>> days" (in the case of a variable that contained dates only), or we could 
>> specify something like "datetime".  When you went to the units table to find 
>> out what datetime means, you would find a description.
>> 
>> As far as that goes, I can see a valid argument for declaring a new standard 
>> name to use for such variables.  If we had a standard name "date" or 
>> "datetime", we could use this to differentiate between arithmetic and 
>> non-arithmetic time expressions.  The units attribute could then express 
>> which elements were present in the representation, or such variables could 
>> be considered to have no units.  We could also specify that variables with a 
>> standard name of "date" (for example) must be of string type.  (This also 
>> has a side benefit - at least to some - of preventing such variables from 
>> being used as time axes.)
>> 
>> In all these cases, the calendar attribute is critical to placing the values 
>> into a reference frame, and must be included.
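
[Illustration, not part of the original message.] A minimal sketch of how the attribute combinations discussed above might be told apart by software. Variables are modelled as plain dictionaries rather than through any netCDF library, and the "datetime" standard name and units value are hypothetical: they are the proposal under discussion, not existing CF vocabulary. The string-type rule is only one of the options floated above.

```python
# Conventional CF numeric time variable.
NUMERIC_TIME = {
    "dtype": "float64",
    "attrs": {
        "standard_name": "time",
        "units": "days since 1970-01-01 00:00:00",
        "calendar": "gregorian",
    },
}

# Hypothetical non-arithmetic variable using the proposed names.
STRING_DATETIME = {
    "dtype": "str",
    "attrs": {
        "standard_name": "datetime",  # hypothetical standard name
        "units": "datetime",          # hypothetical units value
        "calendar": "gregorian",
    },
}


def check_time_variable(var):
    """Return a list of problems under the rules sketched in the message."""
    problems = []
    attrs = var["attrs"]
    if "calendar" not in attrs:
        problems.append("calendar attribute is required")
    if attrs.get("standard_name") == "datetime" and var["dtype"] != "str":
        problems.append("datetime variables must be of string type")
    return problems


def usable_as_time_axis(var):
    """Only well-formed arithmetic time variables qualify as time axes."""
    return var["attrs"].get("standard_name") == "time" and not check_time_variable(var)
```

A reader can then branch on the standard name alone, without inspecting the long name or overloading the calendar attribute.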
>> 
>> Regarding Roy's alternatives, I get serious heartburn when considering 1) 
>> and 2).  The long name is not supposed to be a place where machines would go 
>> to get information about how to interpret the contents of a variable.  
>> Everybody seems to want to encroach on it lately.  Similarly, the calendar 
>> attribute has a specific role, which is to identify the reference frame for 
>> the time information.  Adding type/units information to this attribute just 
>> muddies the water even further.
>> 
>> As far as alternative 3 goes, I have no problem with adding one or more 
>> attributes to such variables if it helps clarify something for posterity, 
>> but I think we must still resolve what to do with standard name and units 
>> for such variables.
>> 
>> Having thought through all of that, I am leaning towards using a standard 
>> name of "date" or "datetime" (and use of units, etc. as described above) if 
>> we are going to add non-arithmetic expressions of time to CF.  I would 
>> prefer that we stick with the current restriction that the storage format 
>> for times be numeric (that is, in essence, what we currently have), and 
>> leave the question of representation formats up to other layers, but I 
>> understand the desire to have a way to store human-readable dates/times that 
>> would be consistent across files.
>> 
>> I've had many headaches maintaining a proprietary legacy software base (not 
>> netCDF-related) that didn't separate storage and representation formats 
>> because of the amount of code that was needed to handle all of the 
>> cross-conversions.
>> 
>> Grace and peace,
>> 
>> Jim
>> 
>> Jim Biard
>> Research Scholar
>> Cooperative Institute for Climate and Satellites
>> Remote Sensing and Applications Division
>> National Climatic Data Center
>> 151 Patton Ave, Asheville, NC 28801-5001
>> 
>> [email protected]
>> 828-271-4900
>> 
>> On Mar 28, 2013, at 5:48 AM, "Lowry, Roy K." <[email protected]> wrote:
>> 
>>> Dear All,
>>> 
>>> I think Chris has hit the nail on the head here.  In my view neither the 
>>> Standard Name nor the units of measure are the way to describe what is in 
>>> essence the format of a string.  So, what other options are there open to 
>>> us?  I can see three alternatives:
>>> 
>>> 1) Use the long name to describe the string format (not just the standard 
>>> used but the profile)
>>> 2) Use the existing calendar attribute
>>> 3) Specify a suitable extension to CF to do the job.
>>> 
>>> These are roughly in my order of preference.
>>> 
>>> Cheers, Roy.
>>> 
>>> ________________________________________
>>> From: CF-metadata [[email protected]] On Behalf Of Chris 
>>> Barker - NOAA Federal [[email protected]]
>>> Sent: 27 March 2013 15:56
>>> Cc: [email protected]
>>> Subject: Re: [CF-metadata] New standard name: datetime_iso8601 
>>> (standard_name or units?)
>>> 
>>> On Wed, Mar 27, 2013 at 8:05 AM, Steve Hankin <[email protected]> 
>>> wrote:
>>> 
>>>> ISO date-time strings are a way of encoding the physical quantity
>>>> that we know as TIME.   So TIME is the "right" standard_name for ISO
>>>> date-time strings per the definition quoted above.
>>>> 
>>>> Now, it may be that there is a compelling argument for violating the normal
>>>> definition of standard_name for the case of ISO date-time strings.  Or on
>>>> the other hand is it preferable to use the units attribute to indicate the
>>>> use of an ISO date-time string?
>>> 
>>> An ISO string for a datetime is not a name (it's still time), but it
>>> is not a unit either.
>>> 
>>> What it is is a data type -- more akin to a float or integer -- i.e. a
>>> particular way to translate bytes to a value. The bytes are a char
>>> array, and the value is the datetime itself.
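
[Illustration, not part of the original message.] A minimal sketch of the "data type" view described above: a char array is the stored bytes, and decoding it yields the datetime value. The `DateTimeFields` structure, the function name, the NUL padding, and the single accepted ISO 8601 profile are all assumptions made for the example.

```python
from datetime import datetime
from typing import NamedTuple


class DateTimeFields(NamedTuple):
    """Hypothetical structure holding the decoded calendar fields."""
    year: int
    month: int
    day: int
    hour: int
    minute: int
    second: int


def decode_iso_chars(raw):
    """Translate a (possibly NUL- or space-padded) char array into fields."""
    text = raw.rstrip(b"\x00 ").decode("ascii")
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")
    return DateTimeFields(parsed.year, parsed.month, parsed.day,
                          parsed.hour, parsed.minute, parsed.second)


stored = b"2013-03-27T15:56:00\x00"  # bytes as they might sit in a char variable
fields = decode_iso_chars(stored)
```

Seen this way, the ISO string plays the same role as a float or integer encoding: a rule for turning bytes into a value, not a name or a unit.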
>>> 
>>> I don't know if thinking about it this way is helpful, as we are
>>> building on netcdf, and I don't know that netcdf allows you to define
>>> new data types, but food for thought.
>>> 
>>> Also, of course, all the other data types in netcdf (and CF) are
>>> direct translations to commonly used binary formats in computers, and
>>> this one is not.
>>> 
>>> hmm -- a quick peek at the netcdf4 docs says:
>>> 
>>> "The richer enhanced model supports user-defined types and data structures"
>>> 
>>> So maybe this could be a user defined type?
>>> 
>>> Having said that, I don't support using ISO strings to define
>>> datetimes in CF. I understand particular use-cases, like keeping the
>>> original time stamp from a data collection system and the like, but
>>> then maybe it's really just arbitrary auxiliary text information, in
>>> which case maybe we don't need a standard name or custom data types at
>>> all.
>>> 
>>> -Chris
>>> 
>>> 
>>> 
>>> --
>>> 
>>> Christopher Barker, Ph.D.
>>> Oceanographer
>>> 
>>> Emergency Response Division
>>> NOAA/NOS/OR&R            (206) 526-6959   voice
>>> 7600 Sand Point Way NE   (206) 526-6329   fax
>>> Seattle, WA  98115       (206) 526-6317   main reception
>>> 
>>> [email protected]
>>> _______________________________________________
>>> CF-metadata mailing list
>>> [email protected]
>>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>>> 
>> 
>> 
>> 
> 

