Seth,

On Thu, May 23, 2013 at 9:51 AM, Seth McGinnis <[email protected]> wrote:
>>>  Computing the min & max on the fly is cheap, and approximating it is even
>>> cheaper, so why introduce the uncertainty?
>>
>>...  but computing min & max on the fly can also be very expensive.
>>We have aggregated model output datasets where each variable is more
>>than 1TB!
>
> Sure, I can see that that's useful metadata about the dataset, and that
> there's value in caching it somewhere.  I just don't think it belongs with
> the metadata inside the netcdf file. What's the use case for storing it
> there?
>
> Because the problem remains that, unless you're storing and serving
> that dataset as a single 1 TB file that never gets modified or subset,
> as soon as anything at all happens to the file, those min and max
> values become tainted and unreliable, and ought to be recomputed.

That's a great point.  Funny, because I've been making the same
argument against storing time and space extents in the netCDF file,
an idea first suggested here:
http://www.unidata.ucar.edu/software/netcdf-java/formats/DataDiscoveryAttConvention.html
and now being revisited here:
http://wiki.esipfed.org/index.php/Attribute_Convention_for_Data_Discovery_%28ACDD%29_Working
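(As an aside, for anyone who does want to recompute extents rather than
trust cached attributes: a minimal sketch of a chunked min/max scan,
which avoids loading a huge variable into memory all at once.  The
`chunked_extents` helper is hypothetical; a real variable from
`netCDF4.Dataset` can be passed in place of the NumPy array used here.)

```python
import numpy as np

def chunked_extents(var, chunk=1024):
    """Compute (min, max) of a 1-D array-like in fixed-size chunks,
    so a multi-terabyte netCDF variable never has to fit in memory."""
    vmin, vmax = np.inf, -np.inf
    for start in range(0, var.shape[0], chunk):
        # Slicing a netCDF variable reads only this block from disk.
        block = np.asarray(var[start:start + chunk])
        vmin = min(vmin, block.min())
        vmax = max(vmax, block.max())
    return vmin, vmax

# In-memory array standing in for a netCDF variable:
data = np.arange(-5.0, 5.0, 0.01)
print(chunked_extents(data))
```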

Thanks for snapping me back to reality!

-Rich
--
Dr. Richard P. Signell   (508) 457-2229
USGS, 384 Woods Hole Rd.
Woods Hole, MA 02543-1598
_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
