Hi Jim,

I think part of the problem is that mechanisms for aggregation are neither 
standardized nor well understood by tool users.  This can lead to odd behaviour 
when some variables are aggregated.  Maybe the tools are badly coded in some cases, 
or maybe there isn't an obvious correct behaviour for the tool to follow, but 
in either case that's not the fault of the data creator.  Some guidance might 
help to flag the unintended consequences of certain actions, of which the data 
creators may not be aware.

I agree it's not a flaw in the valid_min/max concept per se, but this shouldn't 
stop us from providing "best practice" guidance.

Best wishes,
Jon

From: Jim Biard [mailto:[email protected]]
Sent: 09 July 2013 14:54
To: [email protected] List
Cc: Jon Blower
Subject: Re: [CF-metadata] valid_min and valid_max considered harmful?

Jon,

I appreciate the frustration of finding such problems, but isn't this more a 
problem of lazy processing than a flaw in the valid_min/max concept?

Grace and peace,

Jim

Jim Biard
Research Scholar
Cooperative Institute for Climate and Satellites<http://www.cicsnc.org/>
Remote Sensing and Applications Division
National Climatic Data Center<http://www.ncdc.noaa.gov/>
151 Patton Ave, Asheville, NC 28801-5001

[email protected]<mailto:[email protected]>
828-271-4900



On Jul 9, 2013, at 8:42 AM, Jon Blower 
<[email protected]<mailto:[email protected]>> wrote:


Hi all,

On numerous occasions, I have found problems with datasets where the 
valid_min and valid_max attributes are not set correctly, either because the 
original data files are wrong, or because some processing chain or aggregation 
machinery has resulted in incorrect values.  This is a particular problem in 
time coordinate arrays.
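
As a rough illustration of the time coordinate case, here is a minimal sketch 
in Python with NumPy.  Everything in it is an assumption made for the example: 
the two "files" are plain arrays with attribute dictionaries, and 
aggregate_naive and apply_valid_range are hypothetical helpers, not the 
behaviour of any particular aggregation tool.  It assumes an aggregator that 
copies attributes from the first file only, and a reader that masks anything 
outside the stated range.

```python
import numpy as np

def aggregate_naive(parts):
    """Concatenate (values, attrs) pairs, keeping only the first part's attributes.

    Hypothetical aggregator: an assumption for this sketch, not a real tool.
    """
    values = np.concatenate([v for v, _ in parts])
    attrs = dict(parts[0][1])  # attributes are copied from the first file only
    return values, attrs

def apply_valid_range(values, attrs):
    """Mask values outside [valid_min, valid_max], as a strict reader might."""
    lo = attrs.get("valid_min", -np.inf)
    hi = attrs.get("valid_max", np.inf)
    return np.ma.masked_outside(values, lo, hi)

# Two hypothetical files, each holding a time coordinate (days since an epoch)
file_a = (np.arange(0.0, 5.0), {"valid_min": 0.0, "valid_max": 4.0})
file_b = (np.arange(5.0, 10.0), {"valid_min": 5.0, "valid_max": 9.0})

time, attrs = aggregate_naive([file_a, file_b])
masked = apply_valid_range(time, attrs)

# Every time value that came from the second file is now treated as invalid
n_masked = int(np.ma.count_masked(masked))
print(n_masked, "of", time.size, "time values masked")
```

Each input is internally consistent here, so the data creator has done nothing 
wrong; the stale valid_max appears only once the parts are joined.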

In my experience, these occasions have outnumbered the times when these 
attributes are actually useful.  In most cases the user either has only one 
missing value, which should be recorded as a _FillValue as in section 2.5.1 
of the CF documentation, or has no missing value at all.
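
For the single-missing-value case, a minimal sketch in Python with NumPy of 
how a reader might honour _FillValue alone.  The fill value of -9999.0 and the 
sample data are invented for the example, and the attribute dictionary stands 
in for whatever a real NetCDF library would return.

```python
import numpy as np

FILL_VALUE = -9999.0  # hypothetical _FillValue chosen for this example

# Stand-in for a variable and its attributes read from a file
raw = np.array([271.3, FILL_VALUE, 274.8, 272.1])
attrs = {"_FillValue": FILL_VALUE}

# Mask only the entries equal to the fill value; no valid range is involved
data = np.ma.masked_values(raw, attrs["_FillValue"])

# Statistics skip the masked entry
mean_value = float(data.mean())
print(int(np.ma.count_masked(data)), "value masked")
```

Since the mask depends on a single sentinel rather than on the extent of the 
data, concatenating further values does not invalidate the attribute.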

I think this happens because data producers (with good intentions) feel obliged 
to populate their NetCDF files with as much metadata as possible and end up 
specifying some attributes that don't provide much value for their data.  Is it 
worth adding some text to the CF docs to say something along the lines of:

"The attributes valid_min, valid_max and valid_range should only be used when 
necessary [or should be used with caution], as they can cause unexpected 
behaviour in situations such as aggregation.  If only one missing value is 
needed for a variable then we recommend strongly that this value be specified 
using the _FillValue attribute."

The second sentence is already present in the standard.  We may need to define 
what "when necessary" means...

Cheers,
Jon

--
Dr Jon Blower
Technical Director, Reading e-Science Centre
School of Mathematical and Physical Sciences
University of Reading, UK
Tel: +44 (0)118 378 5213
Mob: +44 (0)7919 112687
http://www.resc.reading.ac.uk


_______________________________________________
CF-metadata mailing list
[email protected]<mailto:[email protected]>
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
