Hi Jim, I think partly it's a problem that mechanisms for aggregation are not standardized, or understood by tool users. This can lead to odd behaviour when some variables are aggregated. Maybe the tools are badly coded in some cases, or maybe there isn't an obvious correct behaviour for the tool to follow, but in either case that's not the fault of the data creator. Some guidance might help to flag the unintended consequences of certain actions, of which the data creators may not be aware.
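To make the aggregation concern concrete, here is a minimal sketch in plain Python. It is not based on any particular tool: the dict layout, `naive_aggregate` and `apply_valid_range` are hypothetical names invented for illustration, and the assumption is that the aggregator copies attributes from the first file only, while a reader masks anything outside the valid range.

```python
# Two hypothetical files, each holding a time coordinate (e.g. days since
# some epoch) with valid_min/valid_max set correctly for that file alone.
file_a = {"time": [0.0, 1.0, 2.0], "attrs": {"valid_min": 0.0, "valid_max": 2.0}}
file_b = {"time": [3.0, 4.0, 5.0], "attrs": {"valid_min": 3.0, "valid_max": 5.0}}


def naive_aggregate(files):
    """Join the time axes, keeping the attributes of the first file only.

    Assumption: this mimics an aggregator that does not recompute
    valid_min/valid_max for the combined variable.
    """
    times = []
    for f in files:
        times.extend(f["time"])
    return {"time": times, "attrs": dict(files[0]["attrs"])}


def apply_valid_range(values, attrs):
    """Replace values outside [valid_min, valid_max] with None (missing)."""
    lo = attrs.get("valid_min", float("-inf"))
    hi = attrs.get("valid_max", float("inf"))
    return [v if lo <= v <= hi else None for v in values]


agg = naive_aggregate([file_a, file_b])
masked = apply_valid_range(agg["time"], agg["attrs"])
n_masked = sum(1 for v in masked if v is None)

print(masked)
print(n_masked)
```

Under these assumptions every time value contributed by the second file is treated as missing, even though each input file was self-consistent. With only a _FillValue and no valid range, the same aggregation would leave the time axis intact.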
I agree it's not a flaw in the valid_min/max concept per se, but this shouldn't stop us from providing "best practice" guidance.

Best wishes,
Jon

From: Jim Biard [mailto:[email protected]]
Sent: 09 July 2013 14:54
To: [email protected] List
Cc: Jon Blower
Subject: Re: [CF-metadata] valid_min and valid_max considered harmful?

Jon,

I appreciate the frustration of finding such problems, but isn't this more a problem of lazy processing than a flaw in the valid min/max concept?

Grace and peace,

Jim

Jim Biard
Research Scholar
Cooperative Institute for Climate and Satellites <http://www.cicsnc.org/>
Remote Sensing and Applications Division
National Climatic Data Center <http://www.ncdc.noaa.gov/>
151 Patton Ave, Asheville, NC 28801-5001
[email protected]
828-271-4900

On Jul 9, 2013, at 8:42 AM, Jon Blower <[email protected]> wrote:

Hi all,

On very numerous occasions, I have found problems with datasets where the valid_min and valid_max attributes are not set correctly, either because the original data files are wrong, or because some processing chain or aggregation machinery has resulted in incorrect values. This is a particular problem in time coordinate arrays. In my experience, these occasions have outweighed the number of times when these attributes are actually useful - in most cases the user only has one missing value and this should be recorded as a _FillValue, as in section 2.5.1 of the CF documentation, or does not have a missing value at all. I think this happens because data producers (with good intentions) feel obliged to populate their NetCDF files with as much metadata as possible and end up specifying some attributes that don't provide much value for their data.

Is it worth adding some text to the CF docs to say something along the lines of:

"The attributes valid_min, valid_max and valid_range should only be used when necessary [or should be used with caution], as they can cause unexpected behaviour in situations such as aggregation. If only one missing value is needed for a variable then we recommend strongly that this value be specified using the _FillValue attribute."

The second sentence is already present in the standard. We may need to define what "when necessary" means...

Cheers,
Jon

--
Dr Jon Blower
Technical Director, Reading e-Science Centre
School of Mathematical and Physical Sciences
University of Reading, UK
Tel: +44 (0)118 378 5213
Mob: +44 (0)7919 112687
http://www.resc.reading.ac.uk

_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
