Dear Martin,

You are right, those definitions are not correct.

> From your reply I understand now that these are univariate distributions
> giving the frequency of different radar reflectivities in different height
> bands. Coming from radar/lidar instruments (or an emulator of these
> instruments), there are multiple observations in each GCM-scale height band.
> Presumably, there are also multiple profiles in the GCM-scale grid square,
> so that we have a frequency distribution over sub-grid scale variability in
> the vertical and the horizontal? Or is it actually evaluated at a spatial
> point?

There is a sub-grid distribution of vertical profiles from which they are constructed. The definition that you propose seems accurate to me.

Thanks again for your time spent clarifying this.

Regards,

Alejandro

> -----Original Message-----
> From: CF-metadata [mailto:cf-metadata-boun...@cgd.ucar.edu] On Behalf Of
> martin.juc...@stfc.ac.uk
> Sent: 13 October 2016 13:05
> To: cf-metadata@cgd.ucar.edu
> Subject: [CF-metadata] Usage of histogram_of_X_over_Z
>
> Dear Alejandro,
>
> The two CMIP variables which I'm talking about are cfadDbze94 currently
> defined as "CFAD (Cloud Frequency Altitude Diagrams) are joint height - radar
> reflectivity (or lidar scattering ratio) distributions." and cfadLidarsr532,
> which has the same definition. If they are not joint distributions we clearly
> have a problem with these definitions.
>
> From your reply I understand now that these are univariate distributions
> giving the frequency of different radar reflectivities in different height
> bands. Coming from radar/lidar instruments (or an emulator of these
> instruments), there are multiple observations in each GCM-scale height band.
> Presumably, there are also multiple profiles in the GCM-scale grid square,
> so that we have a frequency distribution over sub-grid scale variability in
> the vertical and the horizontal? Or is it actually evaluated at a spatial
> point?
>
> If this is the case, you are right and we just need to correct the
> definitions in the CMIP tables (though there is still a case for introducing
> a frequency_distribution for other variables, but that should be another
> thread). I would favour a slightly more verbose and explicit definition, e.g.
> "CFAD (Cloud Frequency Altitude Diagrams) are frequency distributions of radar
> reflectivity (or lidar scattering ratio) as a function of altitude.
> cfadDbze94 is defined as the simulated relative frequency of radar
> reflectivity in sampling volumes defined by altitude bins and model grid
> cells."
>
> Note that I'm using "altitude" rather than "height" to match the standard
> names: in the CF Convention, "altitude" means height above the geoid, and
> "height" means height above the surface.
>
> Is that an accurate definition?
>
> regards,
> Martin
>
>
> Dear Martin,
>
> Thanks for your detailed explanation. I'd like to add a bit more information.
> These variables are not joint distributions, they are 1D distributions for
> different ranges of Z. The question is, does "histogram_of_X[_over_Z]" mean
> that the Z coordinate has to be completely collapsed? It is not clear to me
> that the current definition implies that. If Z is not completely collapsed,
> you can then end up with a function of the form frequency(lat,lon,X,Z2),
> where the coordinate Z is only partially collapsed into bins described by Z2.
> I'm using here Z2 to explicitly show when the Z coordinate represents bins.
> This would look like a joint histogram, but it is not. I think that your
> proposal of dropping "_over_Z" from the standard name works for a joint
> distribution, but not for a collection of 1D distributions along Z, unless
> there is a way of distinguishing between both cases with the use of
> attributes.
>
> Another detail is that these histograms provide relative frequencies (values
> between 0 and 1, not counts), not absolute frequencies.
> Is that inconsistent with the current definition of histogram in CF?
>
> Regards,
>
> Alejandro
>
> > -----Original Message-----
> > From: martin.juckes at stfc.ac.uk [mailto:martin.juckes at stfc.ac.uk]
> > Sent: 12 October 2016 19:05
> > To: cf-metadata at cgd.ucar.edu
> > Cc: Bodas-Salcedo, Alejandro
> > Subject: Usage of histogram_of_X_over_Z
> >
> > Hello,
> >
> > There are two standard names of the form histogram_of_..... in the CF
> > Standard Name list (at version 36):
> > histogram_of_backscattering_ratio_over_height_above_reference_ellipsoid and
> > histogram_of_equivalent_reflectivity_factor_over_height_above_reference_ellipsoid.
> > Both of these were used in CMIP5 and are set to be used in CMIP6, but the
> > usage does not appear to match the standard name descriptions.
> >
> > The possible confusion is over the role of different coordinates. The CF
> > definitions say '"histogram_of_X[_over_Z]" means histogram (i.e. number of
> > counts for each range of X) of variations (over Z) of X.' This implies to me
> > that you start with a function of Z and possibly other coordinates and end
> > up with a function of X and the other coordinates. E.g. if the source data
> > is X(lat,lon,Z), then the histogram data will be of the form
> > frequency(lat,lon,X).
> >
> > In the two CMIP5/CMIP6 draft variables (cfadLidarsr532, cfadDbze94) using
> > these standard names the "Z" coordinate which is included in the standard
> > name ("height_above_reference_ellipsoid") is one of the coordinates of the
> > histogram data variable. Both these variables appear to be joint
> > distributions (frequency of X and Y values) over sub-grid variability as a
> > function of latitude, longitude and time.
> >
> > I've been reviewing these existing definitions in some detail because there
> > are some new distribution variables in the request and I'd like to make sure
> > that we have a consistent approach.
> >
> > If we need to describe a variable which carries a joint distribution of X
> > and Y, then the variable will have to use X and Y as coordinates, so perhaps
> > we can simplify the process by leaving them out of the standard name.
> > Similarly the "over_Z" part of the name would be better expressed as a
> > cell_methods construct. This line of reasoning suggests using a new standard
> > name such as "frequency_distribution" (units "1").
> > The only difficulty is that the frequency distribution might be a function
> > of the quantities X and Y (scattering ratio and cloud top height for
> > cfadLidarsr532) and also of latitude, longitude and time. There should be
> > some way of distinguishing the different roles of these 5 coordinates: it is
> > the distribution of X and Y as a function of latitude, longitude and time. I
> > think this could be done conveniently by introducing a single new attribute,
> > e.g. "bin_coords: X Y".
> >
> > "frequency_distribution" could be used for single or joint distributions.
> >
> > My questions to the list are:
> > (1) am I missing something in my interpretation of the existing
> > histogram_of_... names?
> > (2) if not, is the adoption of a "frequency_distribution" standard name an
> > appropriate way forward?
> >
> > regards,
> > Martin
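
The distinction discussed in this thread, between frequency(lat,lon,X) with Z fully collapsed and frequency(lat,lon,X,Z2) with Z only partially collapsed into bins, can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration: the random reflectivity data, the array shapes, the bin edges and the function names are hypothetical, and normalising each altitude bin separately is one reading of "a collection of 1D distributions along Z", not a statement of how the CMIP variables are actually computed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sub-grid sample: X (e.g. reflectivity in dBZ) on
# (lat, lon, sub-grid profile, model level). All sizes are made up.
nlat, nlon, nprof, nlev = 2, 3, 50, 40
x = rng.uniform(-30.0, 20.0, size=(nlat, nlon, nprof, nlev))
z = np.linspace(0.0, 19500.0, nlev)      # assumed altitude of each level, m

x_edges = np.linspace(-30.0, 20.0, 11)   # 10 bins of X
z_edges = np.linspace(0.0, 20000.0, 5)   # 4 altitude bins ("Z2")


def relative_frequency(values, edges):
    """1D relative frequencies (values in [0, 1] that sum to 1), not counts."""
    counts, _ = np.histogram(values, bins=edges)
    return counts / counts.sum()


def fully_collapsed(x, x_edges):
    """frequency(lat, lon, X): Z and the sub-grid profiles are collapsed."""
    out = np.empty(x.shape[:2] + (len(x_edges) - 1,))
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = relative_frequency(x[i, j].ravel(), x_edges)
    return out


def per_altitude_bin(x, z, x_edges, z_edges):
    """frequency(lat, lon, X, Z2): one 1D distribution per altitude bin."""
    nz2 = len(z_edges) - 1
    zbin = np.digitize(z, z_edges) - 1   # altitude-bin index of each level
    out = np.empty(x.shape[:2] + (len(x_edges) - 1, nz2))
    for k in range(nz2):
        sel = x[..., zbin == k]          # levels that fall in altitude bin k
        for i in range(x.shape[0]):
            for j in range(x.shape[1]):
                out[i, j, :, k] = relative_frequency(sel[i, j].ravel(), x_edges)
    return out


f_x = fully_collapsed(x, x_edges)
f_x_z2 = per_altitude_bin(x, z, x_edges, z_edges)
print(f_x.shape, f_x_z2.shape)
```

In this sketch the second array sums to 1 over X within every altitude bin, whereas a joint distribution of X and Z2 would sum to 1 over both axes together. That difference in normalisation is what the array shape alone does not reveal.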