Seth,

On Thu, May 23, 2013 at 9:51 AM, Seth McGinnis <[email protected]> wrote:

>>> Computing the min & max on the fly is cheap, and approximating it is even
>>> cheaper, so why introduce the uncertainty?
>>
>> ... but computing min & max on the fly can also be very expensive.
>> We have aggregated model output datasets where each variable is more
>> than 1TB!
>
> Sure, I can see that that's useful metadata about the dataset, and that
> there's value in caching it somewhere. I just don't think it belongs with
> the metadata inside the netcdf file. What's the use case for storing it
> there?
>
> Because the problem remains that, unless you're storing and serving
> that dataset as a single 1 TB file that never gets modified or subset,
> as soon as anything at all happens to the file, those min and max
> values become tainted and unreliable, and ought to be recomputed.
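To make the staleness problem concrete, here is a minimal sketch (it assumes
the netCDF4-python and numpy packages; the file names, variable name, and the
actual_range attribute are illustrative, not anyone's actual workflow) in
which a subsetting step copies attributes verbatim, as many tools do, and
leaves the stored range describing data that is no longer there:

import numpy as np
from netCDF4 import Dataset

# --- create a source file with a stored range attribute ---
with Dataset("full.nc", "w") as nc:
    nc.createDimension("x", 100)
    v = nc.createVariable("tas", "f4", ("x",))
    data = np.linspace(250.0, 310.0, 100)   # e.g., temperatures in K
    v[:] = data
    v.actual_range = [float(data.min()), float(data.max())]  # [250.0, 310.0]

# --- subset the file, naively copying attributes verbatim ---
with Dataset("full.nc") as src, Dataset("subset.nc", "w") as dst:
    v_in = src.variables["tas"]
    dst.createDimension("x", 10)
    v_out = dst.createVariable("tas", "f4", ("x",))
    v_out.setncatts({k: v_in.getncattr(k) for k in v_in.ncattrs()})
    v_out[:] = v_in[:10]                    # keep only the first 10 values

# --- the stored attribute no longer describes the data ---
with Dataset("subset.nc") as nc:
    v = nc.variables["tas"]
    print("stored actual_range:", v.actual_range)          # [250.0, 310.0]
    print("true min/max:       ", v[:].min(), v[:].max())  # 250.0, ~255.45

Nothing in the format flags the mismatch; the only cure is to recompute (or
drop) the attribute on every write.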
That's a great point. Funny, because I've been making the same arguments
against storing time and space extents in the netcdf file, which was first
suggested here:

http://www.unidata.ucar.edu/software/netcdf-java/formats/DataDiscoveryAttConvention.html

and is now being revisited here:

http://wiki.esipfed.org/index.php/Attribute_Convention_for_Data_Discovery_%28ACDD%29_Working

Thanks for snapping me back to reality!

-Rich

--
Dr. Richard P. Signell     (508) 457-2229
USGS, 384 Woods Hole Rd.
Woods Hole, MA 02543-1598
