Re: [CF-metadata] Missing data bins in histograms

2016-10-11 Thread Jim Biard

Hi.

Another approach could be to use flag_values and flag_meanings on the 
coordinate variable to indicate one or more special coordinate values 
that correspond to any number of "missing data" or "out of bounds" bins. 
These attributes aren't forbidden by CF, and everything should be fine 
as long as the coordinate variable remains monotonic.
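
As a rough illustration of this suggestion, the sketch below models the coordinate variable as a plain Python dictionary rather than a real netCDF variable. The variable name, the bin values, and the sentinel value -1.0 are all assumptions made for the example; only flag_values and flag_meanings come from the message above.

```python
# Hypothetical in-memory stand-in for a netCDF coordinate variable.
# The name, the bin values, and the -1.0 sentinel are assumptions.
cloud_top_height = {
    "values": [-1.0, 500.0, 1500.0, 2500.0],  # -1.0 marks the special bin
    "attrs": {
        "units": "m",
        "flag_values": [-1.0],
        "flag_meanings": "missing_data",
    },
}


def is_strictly_monotonic(values):
    """Return True if values are strictly increasing or strictly decreasing."""
    pairs = list(zip(values, values[1:]))
    return all(a < b for a, b in pairs) or all(a > b for a, b in pairs)


def special_bins(coord):
    """Map the index of each flagged bin to its flag meaning."""
    attrs = coord["attrs"]
    meanings = dict(zip(attrs["flag_values"], attrs["flag_meanings"].split()))
    return {i: meanings[v] for i, v in enumerate(coord["values"]) if v in meanings}


# The sentinel sits below every real bin, so the coordinate stays monotonic.
assert is_strictly_monotonic(cloud_top_height["values"])
print(special_bins(cloud_top_height))  # {0: 'missing_data'}
```

Placing the sentinel below (or above) every real bin value is what keeps the coordinate monotonic in this sketch.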


Grace and peace,

Jim

On 10/11/16 8:41 AM, martin.juc...@stfc.ac.uk wrote:

Hello,

The CF standard name list has two "histogram_ " entries, and in the CMIP6 data
request we may need to add a third, a histogram_of_cloud_top_height. Besides the
standard name, we also need, for this new variable, a method of encoding the
"missing data" bin in the histogram. That is, the histogram should record
frequency in 16 data bins and one additional bin for the frequency of missing
data.

Can we define a "missing_data_index" attribute for histogram variables, and use
this to indicate that the first bin in the array has this special purpose? It
might be more pythonic to put the _FillValue in the coordinate value for the
missing data bin, but I suspect that this would cause substantial problems for
many software packages.

regards,
Martin
___
CF-metadata mailing list
CF-metadata@cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata


--
*Jim Biard*
*Research Scholar*
Cooperative Institute for Climate and Satellites NC 
North Carolina State University 
NOAA National Centers for Environmental Information 
/formerly NOAA’s National Climatic Data Center/
151 Patton Ave, Asheville, NC 28801
e: jbi...@cicsnc.org 
o: +1 828 271 4900






Re: [CF-metadata] Missing data bins in histograms

2016-10-11 Thread Karl Taylor

Hello,

The histogram records frequencies of a single characteristic of a 
variable (in this case for cloud top height).  I think that information 
about whether or not a cloud exists should not be formally a part of the 
histogram.  We could adopt the convention for this variable that in the 
absence of clouds, the cloud is considered to be "under ground" so the 
upper bound of the height of a missing cloud would be 0.  [This is 
akin to Lorenz's definition of the potential temperature isotherms as 
coinciding with the ground in his discussion of available potential energy.]
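
A minimal sketch of this convention follows, using only the Python standard library. The bin edges and the stand-in height for "no cloud" are hypothetical values chosen for illustration; the thread does not define the actual bins.

```python
from bisect import bisect_right

# Hypothetical bin edges in metres. The first bin, [-1000, 0), lies below
# ground and has an upper bound of 0.
BIN_EDGES = [-1000.0, 0.0, 1000.0, 3000.0, 8000.0]  # 4 half-open bins
NO_CLOUD_HEIGHT = -500.0  # assumed "under ground" stand-in, not a CF value


def histogram(cloud_tops):
    """Count cloud tops per bin; None (no cloud) lands in the underground bin."""
    counts = [0] * (len(BIN_EDGES) - 1)
    for h in cloud_tops:
        h = NO_CLOUD_HEIGHT if h is None else h
        i = bisect_right(BIN_EDGES, h) - 1  # index of bin with lo <= h < hi
        if 0 <= i < len(counts):  # heights outside all bins are ignored
            counts[i] += 1
    return counts


print(histogram([None, 250.0, 1200.0, None, 7000.0]))  # [2, 1, 1, 1]
```

With this approach the "no cloud" count is an ordinary histogram bin, so no new attribute is needed.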


By the way, I couldn't find this variable in the current release of the 
CMIP6 data request.  Is it there?  If not, could you say a bit more 
about how the bins are defined?  Are they height or pressure bins?


thanks,
Karl




[CF-metadata] Missing data bins in histograms

2016-10-11 Thread Jonathan Gregory
Dear Martin

I feel there would be an advantage in flexibility, by not requiring the missing
data count to be the first bin necessarily. The new attribute could indicate
the index of the bin which contains the missing data count. I suggest that
this would be an attribute of the coordinate variable of the histogram for the
quantity which is binned (cloud top height), since the index refers to that
dimension specifically.
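
To make the idea concrete, here is a small sketch in which the index attribute lives on the coordinate variable and may point at any bin. The coordinate is modelled as a plain dictionary; the attribute name follows the original proposal and is not part of CF, and the counts, bin values, and placeholder coordinate value for the special bin are assumptions.

```python
# Hypothetical coordinate variable: 3 data bins plus one missing-data bin.
# "missing_data_index" is the name proposed in this thread, not a CF attribute.
# The value 3500.0 at the special bin is an arbitrary placeholder.
coord = {
    "values": [500.0, 1500.0, 2500.0, 3500.0],
    "attrs": {"missing_data_index": 3},
}
counts = [12, 30, 7, 5]  # made-up frequencies, one per bin


def split_histogram(counts, coord):
    """Return (missing_count, [(coordinate_value, count), ...]) for data bins."""
    idx = coord["attrs"]["missing_data_index"]
    missing = counts[idx]
    data = [
        (value, count)
        for i, (value, count) in enumerate(zip(coord["values"], counts))
        if i != idx
    ]
    return missing, data


missing, data = split_histogram(counts, coord)
print(missing)  # 5
print(data)     # [(500.0, 12), (1500.0, 30), (2500.0, 7)]
```

Because the reader looks up the index rather than assuming position 0, the special bin can sit anywhere along the dimension.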

I agree it would be neat if it were possible instead to put _FillValue in the
coordinate variable. Since CF does not currently allow _FillValue in coordinate
variables, adopting this as a new convention would not conflict with any
existing CF usage. But software might have problems with it. If we need the
new attribute, I'd suggest missing_value_index, to make it more similar to
missing_value and _FillValue. What would you put in the coordinate and bounds
for the missing data bin?

In any case, this needs a new convention to be proposed as a trac ticket.

Best wishes

Jonathan



[CF-metadata] Missing data bins in histograms

2016-10-11 Thread martin.juckes
Hello,

The CF standard name list has two "histogram_ " entries, and in the CMIP6 
data request we may need to add a third, a histogram_of_cloud_top_height. 
Besides the standard name, we also need, for this new variable, a method of 
encoding the "missing data" bin in the histogram. That is, the histogram should 
record frequency in 16 data bins and one additional bin for the frequency of 
missing data.

Can we define a "missing_data_index" attribute for histogram variables, and use 
this to indicate that the first bin in the array has this special purpose? It 
might be more pythonic to put the _FillValue in the coordinate value for the 
missing data bin, but I suspect that this would cause substantial problems for 
many software packages.

regards,
Martin