Hi all,

I have often found problems with datasets where the valid_min and valid_max
attributes are not set correctly, either because the original data files are
wrong or because a processing chain or aggregation machinery has produced
incorrect values.  This is a particular problem for time coordinate arrays.
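
As a rough illustration of the aggregation case, here is a minimal sketch
using NumPy only.  The helper apply_valid_range and the sample values are
hypothetical, not taken from any real dataset or library; the sketch assumes
an aggregator that concatenates two time axes and carries the first file's
attributes over unchanged:

```python
import numpy as np

def apply_valid_range(values, valid_min=None, valid_max=None):
    """Mask values outside [valid_min, valid_max] (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    mask = np.zeros(values.shape, dtype=bool)
    if valid_min is not None:
        mask |= values < valid_min
    if valid_max is not None:
        mask |= values > valid_max
    return np.ma.masked_array(values, mask=mask)

# Hypothetical time axes (days since some reference date) from two files.
time_a = np.array([0.0, 1.0, 2.0])
time_b = np.array([3.0, 4.0, 5.0])
attrs_a = {"valid_min": 0.0, "valid_max": 2.0}  # correct for file A alone

# Naive aggregation: concatenate the data, keep the first file's attributes.
time_agg = np.concatenate([time_a, time_b])
masked = apply_valid_range(time_agg, **attrs_a)

# Every time value contributed by the second file is now treated as invalid.
print(masked.count())
```

A client that honours valid_max here would silently discard the second half
of the time axis, even though both source files were individually correct.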

In my experience, these occasions have outnumbered the times when these
attributes are actually useful.  In most cases the user either has only one
missing value, which should be recorded as a _FillValue as in section 2.5.1
of the CF documentation, or has no missing value at all.
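
For the single-missing-value case, a reader needs nothing beyond the fill
value itself.  A minimal sketch, again with NumPy only and with hypothetical
sample values (the -9999.0 fill value is an assumption for illustration):

```python
import numpy as np

fill_value = -9999.0  # hypothetical _FillValue attribute of the variable
raw = np.array([271.3, -9999.0, 280.1, 275.6])

# Mask exactly the points equal to the fill value; nothing else is excluded.
data = np.ma.masked_values(raw, fill_value)

# Only the fill-value point is ignored by count() and mean().
print(data.count(), data.mean())
```

No range attribute is involved, so there is nothing to go stale when the
variable is later extended or aggregated.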

I think this happens because data producers (with good intentions) feel obliged 
to populate their NetCDF files with as much metadata as possible and end up 
specifying some attributes that don't provide much value for their data.  Is it 
worth adding some text to the CF docs to say something along the lines of:

"The attributes valid_min, valid_max and valid_range should only be used when
necessary [or should be used with caution], as they can cause unexpected
behaviour in situations such as aggregation.  If only one missing value is
needed for a variable then we recommend strongly that this value be specified
using the _FillValue attribute."

The second sentence is already present in the standard.  We may need to define 
what "when necessary" means...

Cheers,
Jon

--
Dr Jon Blower
Technical Director, Reading e-Science Centre
School of Mathematical and Physical Sciences
University of Reading, UK
Tel: +44 (0)118 378 5213
Mob: +44 (0)7919 112687
http://www.resc.reading.ac.uk


_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
