Ken, Craig,
re: ACDD and CF
You are right, Craig, to phrase your question in terms of "guidance on
the future aspirations of CF". This topic deserves a more focussed
discussion within CF. Shooting from the hip, I'd be inclined to offer
these comments (recognizing that there is disagreement among individuals
who have discussed this):
1. Many of the ACDD attributes (history, date_created, creator_name,
...) are largely non-controversial.
2. CF generally favors attributes attached to variables over
attributes attached to files, as it reduces the potential for
conflicts. Conflicts from subsetting: what happens to the file-level
attributes if you extract a single variable from a file to make a
new file? Conflicts from editing: suppose only a single variable in
a CF file is altered; do the file-level attributes still hold?
3. Some of the ACDD discovery attributes are redundant with
information that is already in the CF metadata but encoded by other
means. For example, the ACDD geospatial_lon_min/max can be
inferred from the CF coordinate system information. Redundant
information becomes a problem only through its potential to lead
to corruption. For example, a conflict arises with the global
attributes time_coverage_start
<http://www.unidata.ucar.edu/software/netcdf-java/formats/DataDiscoveryAttConvention.html#time_coverage_start_Attribute>/end
when files are aggregated in time by ncML (a very common situation):
each file's values describe only that file, not the aggregation.
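To make points 2 and 3 concrete, here is a minimal sketch in Python. It
uses plain dicts as a stand-in for a CF file rather than any real netCDF
library, and the helper names (subset_lon, lon_conflicts) are
hypothetical, invented for this illustration. Only the attribute names
geospatial_lon_min/max come from the ACDD. It shows a naive subsetter
that copies global attributes verbatim, leaving them in conflict with
the coordinate values they duplicate.

```python
# Toy stand-in for a CF file: plain dicts, not a real netCDF API.
dataset = {
    "attrs": {"geospatial_lon_min": -180.0, "geospatial_lon_max": 180.0},
    "coords": {"lon": [-180.0, -90.0, 0.0, 90.0, 180.0]},
}


def subset_lon(ds, lo, hi):
    """Naive subsetter: trims the lon coordinate, copies global attributes verbatim."""
    kept = [x for x in ds["coords"]["lon"] if lo <= x <= hi]
    return {"attrs": dict(ds["attrs"]), "coords": {"lon": kept}}


def lon_conflicts(ds):
    """Return the ACDD longitude attributes that disagree with the lon coordinate."""
    lon = ds["coords"]["lon"]
    derived = {"geospatial_lon_min": min(lon), "geospatial_lon_max": max(lon)}
    return sorted(name for name, value in derived.items()
                  if ds["attrs"].get(name) != value)


subset = subset_lon(dataset, -90.0, 90.0)
print(lon_conflicts(dataset))  # [] : attributes agree with the coordinate
print(lon_conflicts(subset))   # ['geospatial_lon_max', 'geospatial_lon_min']
```

The original file is self-consistent; the subset is not, because the
copied attributes still describe the full longitude range.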
My own personal slant on this debate is that much of the ACDD content
would be better placed in THREDDS metadata than in the file itself.
THREDDS servers, such as TDS and HYRAX, could "intelligently" generate
many ACDD attributes based upon contents in the file. This approach
would eliminate many of the potential issues of redundancy and
conflicting information in a file. Full use of ACDD as global
attributes tends to lock us into maintaining the integrity of a "file".
If you believe that the future growth direction of netCDF and CF lies in
subsetting and aggregation capabilities (as I do) then the ACDD runs the
risk of painting you into some corners where you would rather not be.
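As a rough sketch of what "intelligently" generating an attribute could
look like, the Python below derives time_coverage_start/end for a time
aggregation directly from the time coordinate values. Everything here is
an assumption made for illustration: the granules list, the
derive_time_coverage helper, and the timestamp format are hypothetical
and do not reflect how TDS or HYRAX actually work.

```python
from datetime import datetime

ISO_FMT = "%Y-%m-%dT%H:%M:%SZ"  # assumed timestamp format for this sketch

# Hypothetical granules: each holds only its time coordinate (UTC strings).
granules = [
    {"time": ["2010-07-02T00:00:00Z", "2010-07-02T12:00:00Z"]},
    {"time": ["2010-07-01T00:00:00Z", "2010-07-01T12:00:00Z"]},
    {"time": ["2010-07-03T00:00:00Z"]},
]


def derive_time_coverage(files):
    """Compute ACDD time coverage for an aggregation from coordinate values alone."""
    stamps = [datetime.strptime(t, ISO_FMT) for f in files for t in f["time"]]
    return {
        "time_coverage_start": min(stamps).strftime(ISO_FMT),
        "time_coverage_end": max(stamps).strftime(ISO_FMT),
    }


coverage = derive_time_coverage(granules)
print(coverage["time_coverage_start"])  # 2010-07-01T00:00:00Z
print(coverage["time_coverage_end"])    # 2010-07-03T00:00:00Z
```

Because the values are computed at the time of the request, they cannot
go stale when granules are added to or removed from the aggregation.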
At this point, the best (though still lame!) advice would seem to be to
use the attributes thoughtfully, rather than carelessly.
- Steve
==========================================
Kenneth Casey wrote:
Craig - to be absolutely clear: the ACDD attributes in no way
conflict with CF. They just provide some recommendations on what
names to use for some attributes. Using a common set of attribute
names enables us to build tools around those attributes that work well
across different data sets. Within NOAA for example there is a
project called the Unified Access Framework that has linked together
dozens of disparate THREDDS Data Servers through a single THREDDS
catalog. The larger the number of data sets in that catalog that use
the ACDD, the easier it is to build and maintain a dynamic crawler to
update that catalog at a regular interval. Also, it becomes possible
to extract automatically ISO "discovery level" metadata and feed it
into standard search mechanisms thereby making it possible to find
what you want amidst that sea of information. Other groups have built
tools to automatically crawl these attributes to assess the data in
terms of its metadata robustness. That knowledge is useful for a
variety of purposes.
I will be interested to hear what folks on this list have to say about
CF "taking up" the ACDD recommendations. That might be fine but I am
not sure it is necessary. ACDD is focused purely on improving
discovery. CF focuses on other things like usability and
understanding, at least as far as I understand it.
Ken
--
Kenneth S. Casey, Ph.D.
Technical Director
NOAA National Oceanographic Data Center
1315 East-West Highway
Silver Spring MD 20910
301-713-3272 x133
http://www.nodc.noaa.gov
On Jul 10, 2010, at 5:38 AM, Craig Donlon <[email protected]> wrote:
Dear all:
CF is quite light on global metadata and metadata suitable for data
discovery and interoperability. Within the Group for High Resolution
Sea Surface Temperature (GHRSST, see http://www.ghrsst.org) we are
updating our product technical specifications (GDS) documentation.
We want to provide more flexibility and interoperability with our
products in a 'future proof' manner. GHRSST is handling 25Gb of data
per day in an international context with many thousands of files in
NRT.
Our latest specs. have included the NetCDF Attribute Convention for
Dataset Discovery
(ACDD http://www.unidata.ucar.edu/software/netcdf-java/formats/DataDiscoveryAttConvention.html)
and this has raised some questions about our CF compliance. I
realise that CF allows extensions, but what I am asking for is some
guidance on the future aspirations of CF for discovery metadata. I
like the ACDD recommendations and, ideally, I would like to be able to
write in our GHRSST data products that we are fully CF compliant.
Does the CF community anticipate taking up the
ACDD recommendations in the near future? What are people's thoughts on
CF and improved metadata discovery?
I look forward to your comments and advice,
Best regards
Craig Donlon (Chair of the GHRSST International Science Team)
--
Dr Craig Donlon
Principal Scientist for Oceans and Ice
ESA/ESTEC (EOP-SME)
Keplerlaan 1, 2201 AZ
Noordwijk The Netherlands
t: +31 (0)715 653687
f: +31 (0)715 655675
e: [email protected]
m:+31 (0)627 013244 (*new*)
Skype ID:crazit
altE-mail: [email protected]
------------------------------------------------------------------------
_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
--
Steve Hankin, NOAA/PMEL -- [email protected]
7600 Sand Point Way NE, Seattle, WA 98115-0070
ph. (206) 526-6080, FAX (206) 526-6744
"The only thing necessary for the triumph of evil is for good men
to do nothing." -- Edmund Burke