Hi Ken,
As hoped, Jonathan has already responded. I'm off on a tangent
here ...
I want here to comment on a wee (and admittedly debatable) side
metadata issue -- the proper use of the "long_name" attribute. The
long_name is typically used as the source
of a title string for plots and listings. My view is that a long
name such as "Objectively Analyzed Mean", which names a statistic,
but does not name the underlying parameter, creates a bit of a
documentation risk. No doubt the global 'title' attribute
is expected to fill in the missing context -- stating in this
example that this is a Sea Surface Temperature data set. That is
probably sufficient for most basic plotting situations. But when
one wants to offer automated products, like computed differences
between fields (as LAS does), it can become impractical to carry
along the title string of each dataset used in a calculation. The
number of annotations needed just grows and grows. As Jonathan's
answer has implied, annotations about cell_methods are also
required.
I guess I am lobbying for the viewpoint that the long_name
attached to each variable should represent a best effort to have
each variable self-document who she is. Thus "Objectively Analyzed
Mean SST" would be preferable to "Objectively Analyzed Mean". Does
this seem reasonable?
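
To make the concern concrete, here is a minimal Python sketch of how an automated product might build a plot title for a difference field from long_name alone. It is only an illustration under stated assumptions: the helper difference_title, the dict-based variables, and the "Statistical Mean" long_name strings are hypothetical and are not taken from LAS or from the NODC file.

```python
# Hypothetical helper: title for a computed difference, built only from each
# variable's long_name (the global 'title' attribute is not carried along).
def difference_title(var_a, var_b):
    return f"{var_a['long_name']} minus {var_b['long_name']}"

# Variable attributes modeled as plain dicts (illustrative values only).
terse_an = {"standard_name": "sea_water_temperature",
            "long_name": "Objectively Analyzed Mean"}
terse_mn = {"standard_name": "sea_water_temperature",
            "long_name": "Statistical Mean"}

# Same variables, with long_name also naming the underlying parameter.
full_an = dict(terse_an, long_name="Objectively Analyzed Mean SST")
full_mn = dict(terse_mn, long_name="Statistical Mean SST")

print(difference_title(terse_an, terse_mn))
# Objectively Analyzed Mean minus Statistical Mean
print(difference_title(full_an, full_mn))
# Objectively Analyzed Mean SST minus Statistical Mean SST
```

With the terse names, the generated title never says which parameter is being differenced; with the fuller names, no extra annotation is needed.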
- Steve
=======================
On 3/22/2013 10:29 AM, Kenneth S. Casey - NOAA Federal wrote:
Hi Everyone,
At US NODC we are trying to sort out how to best document a
gridded dataset that contains a number of variables. For
example, we have a sea water temperature gridded dataset, and it
contains 6 variables:
objectively analyzed mean
statistical mean
number of observations
standard deviation
standard error of the mean
'grid points'
We are currently documenting, for example, the objectively
analyzed mean temperature variable in this netCDF file like
this:
float t_an(time, depth, lat, lon) ;
t_an:standard_name = "sea_water_temperature" ;
t_an:long_name = "Objectively Analyzed Mean" ;
t_an:comment = "Objectively analyzed climatologies are the objectively interpolated mean fields for an oceanographic variable at standard depth levels for the World Ocean." ;
t_an:cell_methods = "area:mean depth:mean time:mean" ;
t_an:grid_mapping = "crs" ;
t_an:units = "degrees_celsius" ;
t_an:_FillValue = 9.96921e+36f ;
That makes reasonable sense to an application client because
the variable contains a temperature value, so the standard_name
makes sense. Also, cell methods here represent how the data in
the cells are compiled. They do not directly describe the
"thing" in those cells but what kinds of procedures were used
(in this case, the grid cell, with time, lat, lon, and depth
dimensions, is computed by calculating a mean). We think this
is the correct way to represent this particular variable.
But what we should do for the statistical variables is less
clear. We can use standard name modifiers to provide reasonable
standard names, but only four are defined currently:
detection_minimum, number_of_observations, standard_error,
and status_flag.
How would we handle the variables like standard deviation?
Right now, we could not provide a standard name with a
modifier, so we'd have to rely on long_name and comment
attributes, which is not very satisfactory. We wouldn't want to
use
t_standard_deviation:standard_name = "sea_water_temperature" ;
because the values in the variable are not sea water
temperature, they are the standard deviation of sea water
temperature. Is the solution to propose some new standard name
modifiers, or are we missing something? This issue seems like
it should be a fairly common problem.
Thanks,
Ken
Kenneth S. Casey, Ph.D.
Technical Director
NOAA National Oceanographic Data Center
1315 East-West Highway
Silver Spring MD 20910
301-713-3272 x133
http://www.nodc.noaa.gov
_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata