There are two standard names of the form histogram_of_..... in the CF Standard
Name list (at version 36):
Both of these were used in CMIP5 and are set to be used in CMIP6, but the usage
does not appear to match the standard name descriptions.
The possible confusion is over the role of different coordinates. The CF
definitions say: '"histogram_of_X[_over_Z]" means histogram (i.e. number of
counts for each range of X) of variations (over Z) of X.' This implies to me
that you start with a function of Z and possibly other coordinates and end up
with a function of X and the other coordinates. E.g. if the source data is
X(lat,lon,Z), then the histogram data will be of the form frequency(lat,lon,X).
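
To make that reading concrete, here is a minimal sketch in Python with NumPy.
The array sizes, the random values and the helper name
histogram_over_last_axis are all made up for illustration; nothing here comes
from CF or from the CMIP tooling. It only shows the Z axis being consumed and
replaced by an axis of X bins.

```python
import numpy as np

# Hypothetical source field X(lat, lon, Z); sizes and values are invented.
rng = np.random.default_rng(0)
n_lat, n_lon, n_z = 3, 4, 50
x = rng.uniform(0.0, 10.0, size=(n_lat, n_lon, n_z))

# Bin edges for X. The bins of X become the new coordinate of the result.
x_edges = np.linspace(0.0, 10.0, 6)  # 5 bins


def histogram_over_last_axis(data, edges):
    """Count, at each (lat, lon) point, how many values along the last
    axis (here Z) fall into each bin of X."""
    return np.apply_along_axis(
        lambda v: np.histogram(v, bins=edges)[0], -1, data
    )


# frequency(lat, lon, X): Z has gone, X bins have taken its place.
freq = histogram_over_last_axis(x, x_edges)
print(freq.shape)
```

Each (lat, lon) point ends up with counts that add up to the number of Z
levels, which is what "variations over Z" suggests to me.
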
In the two CMIP5/CMIP6 draft variables (cfadLidarsr532, cfadDbze94) using these
standard names the "Z" coordinate which is included in the standard name
("height_above_reference_ellipsoid") is one of the coordinates of the histogram
data variable. Both these variables appear to be joint distributions (frequency
of X and Y values) over sub-grid variability as a function of latitude,
longitude and time.
I've been reviewing these existing definitions in some detail because there are
some new distribution variables in the request and I'd like to make sure that
we have a consistent approach.
If we need to describe a variable which carries a joint distribution of X and
Y, then the variable will have to use X and Y as coordinates, so perhaps we can
simplify the process by leaving them out of the standard name. Similarly the
"over_Z" part of the name would be better expressed as a cell_methods
construct. This line of reasoning suggests using a new standard name such as
"frequency_distribution" (units "1"). The only difficulty is that the frequency
distribution might be a function of the quantities X and Y (scattering ratio
and cloud top height for cfadLidarsr532) and also of latitude, longitude and
time. There should be some way of distinguishing the different roles of these 5
coordinates: it is the distribution of X and Y as a function of latitude,
longitude and time. I think this could be done conveniently by introducing a
single new attribute, e.g. "bin_coords: X Y".
"frequency_distribution" could be used for single or joint distributions.
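
As a rough sketch of how this might look in practice, the Python below builds
a joint distribution of two quantities over sub-grid samples and records the
proposed metadata in a plain dictionary. Note that "frequency_distribution"
and "bin_coords" are only the proposals made in this message, not existing CF
names or attributes, and the sample data, sizes and the split_dims helper are
invented for the example.

```python
import numpy as np

# Hypothetical sub-grid samples of X and Y in each (time, lat, lon) cell.
rng = np.random.default_rng(1)
n_time, n_lat, n_lon, n_sub = 2, 3, 4, 100
x = rng.uniform(0.0, 1.0, size=(n_time, n_lat, n_lon, n_sub))
y = rng.uniform(0.0, 1.0, size=(n_time, n_lat, n_lon, n_sub))

x_edges = np.linspace(0.0, 1.0, 4)  # 3 bins of X
y_edges = np.linspace(0.0, 1.0, 3)  # 2 bins of Y

# Joint relative frequency of (X, Y) as a function of time, lat and lon.
freq = np.empty((n_time, n_lat, n_lon, 3, 2))
for idx in np.ndindex(n_time, n_lat, n_lon):
    counts, _, _ = np.histogram2d(x[idx], y[idx], bins=(x_edges, y_edges))
    freq[idx] = counts / n_sub

# Proposed (not existing) metadata: one generic standard name, plus a
# bin_coords attribute naming the coordinates that are being binned.
variable = {
    "dims": ("time", "lat", "lon", "X", "Y"),
    "attrs": {
        "standard_name": "frequency_distribution",
        "units": "1",
        "bin_coords": "X Y",
    },
}


def split_dims(var):
    """Separate the binned coordinates from the ordinary ones."""
    bins = var["attrs"]["bin_coords"].split()
    others = [d for d in var["dims"] if d not in bins]
    return bins, others


bin_dims, other_dims = split_dims(variable)
print(bin_dims, other_dims)
```

The point of the single attribute is that a reader of the file can tell the
two roles apart without parsing the standard name.
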
My questions to the list are:
(1) am I missing something in my interpretation of the existing
(2) if not, is the adoption of a "frequency_distribution" standard name an
appropriate way forward?
CF-metadata mailing list