Hi Jim,

There are a lot of nails being hit on the head at the moment.  The Standard 
Name attribute was conceived as a standardised label for the geophysical 
phenomenon - a sort of grouping term for what was being measured.  Note that 
the Standard Name isn't a mandatory attribute in CF - the rules state that 
there either needs to be a long name OR a Standard Name!
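
As a rough illustration of that rule, the check below treats a variable's attributes as a plain Python dict. The dict representation and the helper name `has_identifying_name` are assumptions made for this sketch and are not part of any CF library.

```python
def has_identifying_name(attrs):
    """Return True if the variable has a non-empty long_name or standard_name."""
    # Whitespace-only values are treated as missing.
    return any(
        str(attrs.get(key, "")).strip()
        for key in ("long_name", "standard_name")
    )
```

A variable with only `units` would fail this check, while one with either name attribute would pass.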

Over time there have been various attempts to turn the Standard Name into 
something different - the 'single location to gather important information 
about the kind of data contained in a variable' as you put it (I like that 
description).  For example, the OceanSites community made the Standard Name 
mandatory in their CF profile.  This caused requests to be made for new 
Standard Names appropriate to your definition, but not the original Standard 
Name concept.  Some of these got through: others didn't, which makes the entity 
definition of the Standard Name concept a little blurred.

Another strategy has been to provide a 'signpost' pointing out the location of 
all the various bits of information needed to fulfil your definition.  There 
was a proposal called Common Concept (Trac ticket 24) designed to do this.  
Unfortunately, it required a small but significant amount of effort to set up 
that was never resourced. In fact, it seemed like it was cursed - I even had to 
hand back funding allocated for the purpose because a critical staff member 
left at a time when there was a total ban on UK public service recruitment. A 
'resource light' version of this strategy - CF String Syntax (Trac ticket 94 
which I'm moderating) - is currently ready to implement once it has been 
written up as a Conventions document update.  If we get this finished do you 
think it would resolve the problems you see?

Cheers, Roy.


Please note that I now work part-time from 
Tuesday to Thursday.  E-mail response on other days is possible but not 
guaranteed!

From: CF-metadata [mailto:cf-metadata-boun...@cgd.ucar.edu] On Behalf Of Jim 
Biard
Sent: 02 April 2013 21:19
To: cf-metadata@cgd.ucar.edu
Cc: Jonathan Gregory
Subject: Re: [CF-metadata] interplay of standard name modifiers, cell_methods 
-- is there a problem?

Jonathan,

You haven't been unclear about how we got into the state we are currently in.  
We got to this point by adding bits here and there as needs arose without always 
thinking about the implications for the whole system down the road (which you 
did a great job of describing).  A lot of good work has been done to get where 
we are today, and I appreciate that.  I also think that there is room for 
improvement in how we represent values that result from operations applied to 
measurements.

Yes, we could add paragraphs to the documentation to tell users to go look in 
various places when they are trying to figure out what the contents of a 
variable are, but that is not user-friendly behavior.  I want to make it easy 
and intuitive to understand what sort of information a variable contains.  If 
we consider the standard name attribute as the location where the essence of 
the variable contents is described using a controlled vocabulary, it makes 
sense to provide a mechanism within that vocabulary for distinguishing between 
a direct measurement (air temperature) and information about a measurement 
(standard deviation of air temperature).  We provide this for some operations 
on measurements (number of observations, standard error, etc), but not for 
others (standard deviation, variance, anomaly, etc).

Expanding the list of standard name modifiers in the CF Metadata Conventions 
would allow us to make variables more self-describing and less confusing, and 
allow a user (or software) to look in a single location to gather important 
information about the kind of data contained in a variable (which I see as the 
purpose of the standard_name attribute).
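
To make the current mechanism concrete, here is a minimal sketch of how software might split a standard_name attribute value into the name and an optional modifier. It assumes the value is the standard name optionally followed by whitespace and one modifier, and that the modifier list is the one in Appendix C of the CF conventions as best I recall it; both should be checked against the current document. The helper name `split_standard_name` is hypothetical.

```python
# Modifier list assumed from CF Appendix C; verify before relying on it.
KNOWN_MODIFIERS = {
    "detection_minimum",
    "number_of_observations",
    "standard_error",
    "status_flag",
}


def split_standard_name(value):
    """Split a standard_name attribute into (name, modifier or None)."""
    parts = value.split()
    if not parts:
        raise ValueError("empty standard_name")
    if len(parts) == 1:
        return parts[0], None
    if len(parts) == 2 and parts[1] in KNOWN_MODIFIERS:
        return parts[0], parts[1]
    raise ValueError(f"unrecognised standard_name value: {value!r}")
```

With this list, "air_temperature standard_error" is accepted, but "air_temperature standard_deviation" is rejected, which is the asymmetry described above.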

Grace and peace,

Jim

Jim Biard
Research Scholar
Cooperative Institute for Climate and Satellites
Remote Sensing and Applications Division
National Climatic Data Center
151 Patton Ave, Asheville, NC 28801-5001

jim.bi...@noaa.gov
828-271-4900

On Apr 2, 2013, at 1:05 PM, Jonathan Gregory 
<j.m.greg...@reading.ac.uk> wrote:


Dear all

Jim asked,

"As some examples of the confusing situation we have now, why do we have a
separate word modifier number_of_observations instead of a
number_of_observations_of_X transformation modifier?  Why don't we have
variance_of_X or anomaly_of_X transformations (or separate word modifiers
variance or anomaly)?  Why isn't there a cell method for standard error?  I
can't discern any logic behind the current partitioning."

I've tried to explain how this came about, but perhaps I am not being clear,
so let me try again:

* We introduced the modifiers like number_of_observations for those situations
where it was thought likely that a large number of standard names would need
them. Factorising out this dimension thus avoids a large expansion of the
standard name table. So far, only four anomaly_of names have been requested,
so it seems the right judgement not to have a standard_name modifier for that.

* That was also one of the motivations for cell_methods: there would be vastly
more standard names if we had to include all the cell_methods information too.
The other motivation for cell_methods is that the statistical operations
relate to particular axes. For instance, just "mean" is too vague: does it
mean time-mean, zonal-mean, mean over radiation wavelength, or what? The same
is true for variance. The cell_methods attribute makes this precise.

* There is not a cell method for standard error because it does not relate to
a particular dimension. The standard error is a metadata property of the
individual data. The cell methods statistically describe the variation of the
quantity within cells. These are different purposes.
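
The point about axes in the bullets above can be sketched in code. The parser below handles only the basic cell_methods form "name: [name: ...] method"; parenthesised comments and "where", "within" or "over" clauses are deliberately not supported. The function name `parse_cell_methods` is hypothetical.

```python
def parse_cell_methods(value):
    """Parse a simple cell_methods string into (axes, method) pairs."""
    pairs = []
    axes = []
    for token in value.split():
        if token.endswith(":"):
            # A token ending in a colon names an axis the method applies to.
            axes.append(token[:-1])
        elif axes:
            # First non-axis token after one or more axes is the method.
            pairs.append((tuple(axes), token))
            axes = []
        else:
            raise ValueError(f"method {token!r} has no axis in {value!r}")
    if axes:
        raise ValueError(f"axis without a method in {value!r}")
    return pairs
```

For example, "time: mean lat: lon: variance" yields a time mean followed by a variance over lat and lon, whereas a bare "mean" raises an error because no axis is attached to it.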

While you may not agree with the logic, does this help to explain what it is?

If the situation is perceived as confusing and easily misunderstood, I am all
in favour of clarifying it by inserting more explanation and discussion in the
CF standard document. That could be done with a defect ticket. As Philip says,
it could shorten future discussions.

But we can also change the standard, of course. However, changes to existing
attributes are difficult for existing software. I do not think we need or
ought to change the existing attributes. While I appreciate the reason for the
suggestion, I feel that suffixing something to the standard_name to indicate
"something" has been done to it would not really help, because there is almost
*always* something done to it! Cell methods are recommended to be specified in
any case where the default "point" or "sum" is not correct. They should be
present if the quantity is a mean, in particular. A mean is also a
transformation, just like a standard deviation.

I am not convinced yet by the argument that we have to modify the CF standard
because the standard_name may be misunderstood or misused by software which
catalogues or serves datasets. CF introduced the standard_name attribute. If
it's being used now, software must already have been modified to support CF.
Well then, why can't it be modified again to support CF more fully or correctly?
If we explained more clearly in the standard what the intention was, that would
no doubt help with future software design.

Instead of changing what we have, I think we should add to it. It seems to me,
as I've said before, that the existing proposal for "CF strings" summarising
some essential metadata (similar to the earlier proposal for common concepts
in some ways) would solve this problem. It is *that* kind of string, not the
standard name, that the user should be offered to select an appropriate
variable. It's a combination of attributes. It's not hard to assemble that
information from the separate attributes, but if that's an obstacle, we could
help software over it by recommending that this extra attribute be included.
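
As a sketch of how little is involved in assembling such a string from the separate attributes, the function below joins a few of them into one line. The layout it produces is purely hypothetical and is not the syntax proposed in ticket 94; the attribute dict and the name `summarise_variable` are also assumptions made for illustration.

```python
def summarise_variable(attrs):
    """Join selected attributes into one descriptive string (hypothetical layout)."""
    name = attrs.get("standard_name") or attrs.get("long_name")
    if not name:
        raise ValueError("variable has neither standard_name nor long_name")
    parts = [name]
    if attrs.get("cell_methods"):
        # Square brackets mark the statistical processing applied.
        parts.append(f"[{attrs['cell_methods']}]")
    if attrs.get("units"):
        parts.append(f"({attrs['units']})")
    return " ".join(parts)
```

A time-mean air temperature in kelvin would come out as "air_temperature [time: mean] (K)", which a catalogue could offer to users in place of the bare standard name.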

Please have a look at https://cf-pcmdi.llnl.gov/trac/ticket/94 and add your
comments on it.

Best wishes

Jonathan
_______________________________________________
CF-metadata mailing list
CF-metadata@cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata

