There are two standard names of the form histogram_of_... in the CF Standard
Name list (at version 36): 
histogram_of_backscattering_ratio_over_height_above_reference_ellipsoid and 
Both of these were used in CMIP5 and are set to be used in CMIP6, but the usage
does not appear to match the standard name descriptions.

The possible confusion is over the role of different coordinates. The CF 
definitions say '"histogram_of_X[_over_Z]" means histogram (i.e. number of
counts for each range of X) of variations (over Z) of X.' This implies to me 
that you start with a function of Z and possibly other coordinates and end up 
with a function of X and the other coordinates. E.g. if the source data is 
X(lat,lon,Z), then the histogram data will be of the form frequency(lat,lon,X).
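To make that reading concrete, here is a minimal sketch in plain Python. The function name histogram_over_z, the bin edges and the data values are all made up for illustration; nothing here comes from CF or CMIP. It takes X(lat,lon,Z) and returns frequency(lat,lon,X), so that Z disappears and binned X becomes a coordinate.

```python
from bisect import bisect_right

def histogram_over_z(x, bin_edges):
    """Collapse the Z axis of x[lat][lon][z] into counts per X bin.

    Returns freq[lat][lon][i], where bin i covers
    bin_edges[i] <= value < bin_edges[i + 1].
    Values outside the edges are ignored.
    """
    nbins = len(bin_edges) - 1
    freq = []
    for row in x:                  # loop over lat
        freq_row = []
        for profile in row:        # loop over lon; profile holds X along Z
            counts = [0] * nbins
            for value in profile:
                i = bisect_right(bin_edges, value) - 1
                if 0 <= i < nbins:
                    counts[i] += 1
            freq_row.append(counts)
        freq.append(freq_row)
    return freq

# Hypothetical source data X(lat, lon, Z): 1 lat x 2 lon x 4 levels.
x = [[[0.5, 1.5, 1.7, 2.5], [0.1, 0.2, 2.9, 3.5]]]
edges = [0.0, 1.0, 2.0, 3.0]
freq = histogram_over_z(x, edges)
print(freq)
```

The result is indexed by (lat, lon, X bin); the Z axis of the input no longer appears.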

In the two CMIP5/CMIP6 draft variables (cfadLidarsr532, cfadDbze94) using these
standard names, the "Z" coordinate which is included in the standard name
("height_above_reference_ellipsoid") is one of the coordinates of the histogram
data variable. Both these variables appear to be joint distributions (frequency 
of X and Y values) over sub-grid variability as a function of latitude, 
longitude and time. 
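For contrast, the following sketch shows how I understand the structure of these variables for a single grid cell: the counting is done over sub-grid columns, and the height level is kept as an axis of the result. This is only my reading, not the actual CMIP procedure; the function name joint_histogram and the sample values are hypothetical.

```python
from bisect import bisect_right

def joint_histogram(subcolumns, x_edges):
    """Count sub-columns per (height level, X bin) for one grid cell.

    subcolumns[s][k] is X in sub-column s at height level k.
    Returns counts[k][i]; the level index k (Z) is retained as an axis.
    """
    nlev = len(subcolumns[0])
    nbins = len(x_edges) - 1
    counts = [[0] * nbins for _ in range(nlev)]
    for profile in subcolumns:
        for k, value in enumerate(profile):
            i = bisect_right(x_edges, value) - 1
            if 0 <= i < nbins:
                counts[k][i] += 1
    return counts

# Hypothetical cell: 3 sub-columns, 2 height levels, 2 X bins.
subcolumns = [[0.5, 1.5], [0.7, 0.2], [1.2, 1.9]]
counts = joint_histogram(subcolumns, [0.0, 1.0, 2.0])
print(counts)
```

Here Z survives as a coordinate of the output, which is the opposite of what the "over_Z" wording in the standard name suggests.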

I've been reviewing these existing definitions in some detail because there are 
some new distribution variables in the request and I'd like to make sure that 
we have a consistent approach. 

If we need to describe a variable which carries a joint distribution of X and
Y, then the variable will have to use X and Y as coordinates, so perhaps we can 
simplify the process by leaving them out of the standard name. Similarly the 
"over_Z" part of the name would be better expressed as a cell_methods 
construct. This line of reasoning suggests using a new standard name such as 
"frequency_distribution" (units "1"). The only difficulty is that the frequency 
distribution might be a function of the quantities X and Y (scattering ratio 
and cloud top height for cfadLidarsr532) and also of latitude, longitude and 
time. There should be some way of distinguishing the different roles of these five
coordinates: it is the distribution of X and Y as a function of latitude,
longitude and time. I think this could be done conveniently by introducing a 
single new attribute, e.g. "bin_coords: X Y".
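As a rough illustration of how software might use such an attribute, here is a sketch in plain Python. Both "frequency_distribution" and "bin_coords" are only proposals at this point, and the helper split_coordinate_roles and the dimension names are hypothetical.

```python
def split_coordinate_roles(dimensions, attributes):
    """Separate the distribution's bin coordinates from the other dimensions.

    Reads the proposed (not yet standard) "bin_coords" attribute, a
    space-separated list of coordinate names, and returns
    (bin_coords, other_dimensions).
    """
    bin_coords = attributes.get("bin_coords", "").split()
    unknown = [name for name in bin_coords if name not in dimensions]
    if unknown:
        raise ValueError("bin_coords names not among dimensions: %s" % unknown)
    others = [name for name in dimensions if name not in bin_coords]
    return bin_coords, others

# Hypothetical joint distribution of X and Y on a (time, lat, lon) grid.
dimensions = ("time", "lat", "lon", "Y", "X")
attributes = {
    "standard_name": "frequency_distribution",  # proposed name
    "units": "1",
    "bin_coords": "X Y",                        # proposed attribute
}
bins, others = split_coordinate_roles(dimensions, attributes)
print(bins, others)
```

With a single attribute the two roles can be told apart without encoding X, Y or Z in the standard name itself.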

"frequency_distribution" could be used for single or joint distributions.

My questions to the list are:
(1) am I missing something in my interpretation of the existing 
histogram_of_... names?
(2) if not, is the adoption of a "frequency_distribution" standard name an 
appropriate way forward?


CF-metadata mailing list
