Re: [CF-metadata] Missing data bins in histograms

2019-05-14 Thread Martin Juckes - UKRI STFC
Hi Dan, if we were starting from a blank sheet, that would be a strong point. As it is, we are rather constrained by the existing practices in the community. I hope that we can find an agreement along the lines of the discussion that Jonathan and I are having which makes it possible to

Re: [CF-metadata] Missing data bins in histograms

2019-05-14 Thread Hollis, Dan
Hi Martin, I agree there is no clear line between data and metadata and I didn't really intend to suggest there was one. As you say, there are different equally-valid views of where the line could/should be drawn in any particular situation between the different types of data that we wish to

Re: [CF-metadata] Missing data bins in histograms

2019-05-14 Thread Martin Juckes - UKRI STFC
Dear Jonathan, Sorry, I think I misunderstood the scope of valid usage of "flag_values". I've only seen it used in contexts in which all values of the flagged array are translated using the "flag_values"/"flag_meanings" pairs, but you are suggesting, I think, that it should only apply to the

Re: [CF-metadata] Missing data bins in histograms

2019-05-14 Thread Martin Juckes - UKRI STFC
Hi Dan, Thanks, that makes it clearer. The conversation below follows on from one that Karl and I had with people from CFMIP (Cloud Forcing Model Intercomparison Project). The variable in question, which contains the histogram, is produced to make it possible to compare climate model output with

Re: [CF-metadata] Missing data bins in histograms

2019-05-14 Thread Hollis, Dan
Hi Martin, Sorry, I didn't mean to imply that we would do away with the histogram standard names - these would be retained, of course. I just meant that we both want to store one extra bit of information (maximum number of obs or, equivalently, missing number of obs) and that in both use cases

Re: [CF-metadata] Missing data bins in histograms

2019-05-14 Thread Jonathan Gregory
Dear Martin, I agree that if valid_range implies masked-out data in some software, we can't put special values out of the range, and that we shouldn't tamper with missing data. I still think that flag_values is a better way to indicate special values in a coordinate variable than an auxiliary

Re: [CF-metadata] Missing data bins in histograms

2019-05-14 Thread Martin Juckes - UKRI STFC
Hi Dan, it is a similar concept, but the aim here is to record it in a histogram. We have a standard name for the histogram ... I'm not sure why you think we need to change this. Perhaps it would be possible to do away with "histogram_" standard names and just use "number_of_observations",

Re: [CF-metadata] Missing data bins in histograms

2019-05-14 Thread Hollis, Dan
Hi Martin, Thanks for your suggestion - I can see how this could work for our data. However, I can also see that having to parse the 'interval' text from the 'cell_methods' comment field and combine that with the bounds from the time coordinate is not especially user-friendly! It would be much

Re: [CF-metadata] Missing data bins in histograms

2019-05-14 Thread Martin Juckes - UKRI STFC
Hello Dan, I think there is a method for recording the number of valid observations in each data point, which, if I've understood correctly, would meet the requirement you are describing: using an "ancillary_variable" with standard name "number_of_observations". I don't think there is a

Re: [CF-metadata] Missing data bins in histograms

2019-05-14 Thread Hollis, Dan
Dear Martin/Jonathan/Jim, I appreciate that this discussion is focussed on histograms; however, I wonder if there is a wider issue here, i.e. how should one record the number of missing values for any extensive quantity? For example, we use number_of_days_with_air_temperature_below_threshold to