Re: [CF-metadata] non-standard standard_names -- CF alternative names

Steve Hankin Wed, 12 May 2010 12:23:12 -0700

Hi John,

Would it be right to think of the strategy you've outlined as an elaboration on "Alternative 1":


   Should the CF standard_name process, *itself*, include a
   "provisional fast-track", that allows names to be added very quickly
   with no guarantee that they will have a lasting status, but with an
   *iron-clad guarantee that the provisional names will be retained*
   (and so-identified) in version-stamped (older) CF vocabularies.

Presumably someone connected to CF would have to commit to maintaining the RDF store and (one assumes) provide an on-going, well-supported service that can query it. Is this the vision you are suggesting?


   - Steve

================================

John Graybeal wrote:

OK, now I have to submit my other notion after all, which I think addresses some of Steve's concerns. But let me semi-agree with his first paragraph -- I'm enthusiastic, but I think there are a lot of details to be agreed on. I'll come back to that in a separate post.
I had thought it was important to provide a way to enter proposed CF terms in a common way/place, so that they can (a) be used by the originators and the community in the meantime, (b) be seen by the CF folks, and (c) be dispositioned appropriately when CF either accepts them or rejects them. So my proposal was to create a vocabulary, or more precisely an RDF store, that lets us:
 1) declare a name that may be proposed as a CF candidate
 2) make a statement that the name has been (or even 'is being') submitted to CF for consideration
 3a) make a statement that the name has been accepted as a CF name, and therefore is deprecated as a proposed name
 3b) make a statement that the name has been rejected as a CF name, and therefore is deprecated as a proposed name
 In either 3a or 3b,
 4) make a statement that the replacement representation of the name is xyz in some other vocabulary
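As a rough illustration of the life cycle listed above (not part of the proposal itself), the states and statements could be modelled as follows. The sketch uses plain Python tuples in place of a real RDF store, and the class name, status values and predicate names (hasStatus, replacedBy) are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class Status(Enum):
    PROPOSED = "proposed"    # step 1
    SUBMITTED = "submitted"  # step 2
    ACCEPTED = "accepted"    # step 3a
    REJECTED = "rejected"    # step 3b


# Allowed transitions (an assumption that mirrors steps 1 to 3b).
TRANSITIONS = {
    Status.PROPOSED: {Status.SUBMITTED},
    Status.SUBMITTED: {Status.ACCEPTED, Status.REJECTED},
    Status.ACCEPTED: set(),
    Status.REJECTED: set(),
}


@dataclass
class CandidateName:
    name: str
    status: Status = Status.PROPOSED
    replacement: Optional[str] = None  # step 4
    # (subject, predicate, object) triples standing in for RDF statements
    statements: List[Tuple[str, str, str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        self.statements.append((self.name, "hasStatus", self.status.value))

    @property
    def deprecated(self) -> bool:
        # Accepted and rejected names are both deprecated as *proposed* names.
        return self.status in (Status.ACCEPTED, Status.REJECTED)

    def advance(self, new_status: Status, replacement: Optional[str] = None) -> None:
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(f"cannot go from {self.status.value} to {new_status.value}")
        if replacement is not None and new_status not in (Status.ACCEPTED, Status.REJECTED):
            raise ValueError("a replacement only applies once CF has decided")
        self.status = new_status
        self.statements.append((self.name, "hasStatus", new_status.value))
        if replacement is not None:
            self.replacement = replacement
            self.statements.append((self.name, "replacedBy", replacement))
```

A name would then move through `advance(Status.SUBMITTED)` and `advance(Status.ACCEPTED, replacement=...)`, leaving the full history in `statements`.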
The relationship of this proposal to the previous thread is that it provides an implementation mechanism for the life cycle of the provisional terms. It also helps assure some of the things Steve is trying to ensure -- some of which only recently became possible with CF, and even that manually, not through any automatable utility, interface, or URI convention.
Anyway, I don't want to encourage a detailed discussion of the above proposal, as it is secondary to Martin's original suggestion, and I feel sure it will have to be considered at some length in TRAC if we get that far. Just wanted to mention that the semantic technologies can enable some very useful views/approaches to some of these problems.
John

On May 12, 2010, at 11:22, Steve Hankin wrote:
Hi Martin,
You've had two enthusiastic "yes" responses, so I guess I have the privilege to be the wet blanket. So it goes. I will give only a very cautious and limited "yes". Not an outright "no" ... but a suggestion for more thought and discussion.
The proposal here is effectively the creation of 'private tables' as a means of achieving extensibility. We've had an opportunity to see the hazards embedded in this approach as a long-term evolutionary process in WMO. Over time the "custom" tables evolve to have a quasi-official status -- entire sub-communities rely upon them -- but without necessarily a corresponding methodical control over their creation and distribution. With BUFR and GRIB files the proliferation of distinct tables has led to serious interoperability problems.
To avoid repeating these problems with your proposal, CF clients must be provided with *iron-clad ways to be assured that they are referring to the same vocabulary tables that the data author was referring to at the time that the data were written*. Since we want CF files to ensure interoperability when there are *years separating the writing of data from reading it*, your strategy needs to ensure careful version control over the private tables. This imposes a significant burden on you as the creator of a "<project>_standard_name" table -- essentially a requirement to retain and serve out older table versions "in perpetuity" (we could argue over what that means). The use of semantic web technologies will not alter these considerations for the foreseeable future (though over the long term sophisticated inference engines might ...). The ontologies still need to be informed by correct information, which implies knowledge of the version-controlled private vocabularies.
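To make the version-control requirement concrete, here is a minimal sketch of an append-only table store in which every published version stays retrievable, so a reader can check a name against the table version the data author used. The class and method names are hypothetical, and how a file would record its table version is left open; the sketch simply assumes the reader knows it.

```python
class VersionedVocabulary:
    """Append-only store: published table versions are never altered or dropped."""

    def __init__(self, project):
        self.project = project
        self._versions = {}  # version number -> frozenset of names

    def publish(self, names):
        """Freeze a new table version and return its version number."""
        version = len(self._versions) + 1
        self._versions[version] = frozenset(names)
        return version

    def resolve(self, name, version):
        """Check a name against the table version recorded when the data were written."""
        if version not in self._versions:
            raise KeyError(f"{self.project}: unknown table version {version}")
        return name in self._versions[version]


# Hypothetical usage: a name dropped from a later table is still
# resolvable against the older version that the file refers to.
vocab = VersionedVocabulary("htap")
v1 = vocab.publish(["example_name_one", "example_name_two"])
v2 = vocab.publish(["example_name_one"])
still_known_in_v1 = vocab.resolve("example_name_two", v1)
known_in_v2 = vocab.resolve("example_name_two", v2)
```

The point of the design is that `publish` only ever adds a version; nothing in the interface allows an older table to be edited or removed.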
A "<project>_standard_name" may have one of three life histories: it may never become accepted into the standard_name table; it may be accepted as-is; or it may be accepted with alterations. The following suggested restriction illustrates some of the difficulties: "A variable can contain either a standard_name or <project>_standard_name attribute but not both." What's behind this restriction? Given the uncertain life history of a <project>_standard_name, if it has been in use for (say) a year and is found in thousands of files that are being shared around the community, doesn't that generate a need to continue support for it?
Two alternative approaches (both flawed, of course ... the nature of the beast):
   1. Should the CF standard_name process, itself, include a
      "provisional fast-track", that allows names to be added very
      quickly with no guarantee that they will have a lasting status,
      but with an *iron-clad guarantee that the provisional names
      will be retained* (and so-identified) in version-stamped
      (older) CF vocabularies.
      or
   2. Might you be better off using a *truly private* vocabulary of
      "<project>_standard_name" strings. I.e. one that has no
      official status in CF at all?  There is no violation to the CF
      standard through doing this.  This approach makes it your
      private responsibility on behalf of your users to deal with
      files that are created in the period between proposing a CF
      standard_name and having it become part of the official table


    - Steve

====================

Schultz, Martin wrote:
Dear all,

    we are currently cleaning all files on our TF HTAP multi-model
experiment server to make them fully CF(1.0) conformant. It has been
about 3 years since we had drafted the original format description of
these experiments and also initiated the standard name discussion for
chemical constituents (thanks again to Christiane Textor who did a lot
of this initial work). Many standard names which we needed have now been
defined (thanks to all who contributed and to Allison for maintaining
the list!). Nevertheless, there are a number of model variables left for
which no standard name has been agreed upon and where we (or the CF
mailing list group) also felt that they are too specialized to deserve a
"standard" name. From the perspective of the CF community this may not
be an issue, but in the context of interoperability (we now operate a
WCS server to share these files) the fact that some variables do have a
standard_name attribute and others don't poses considerable challenges.
The CF convention states that "either standard_name or long_name" should
be present. In our view, the long_name attribute is a poor substitute
for the standard_name, because it has no rules attached. We are now
planning to substitute "illegal" standard_name attributes by a new
"htap-_standard_name" attribute, which shall make clear that these names
are derived according to the CF guidelines, but they are not accepted
standard_names. Such a concept would enable software tools to easily
scan additional standard_name tables and make use of the well-defined
semantics that a standard_name provides without having to push
additional standard_names through the discussion - in particular if they
are not so "standard". I can see the danger that certain groups might
think it no longer necessary to go through the tedious but ultimately
worthwhile discussion process in this mailing list and the meaning of
"standard" names could get diluted. However, in my view the advantage of
having the possibility to extend the convention without breaking
standard-conformance outweighs this potential disadvantage.

    Specifically I would thus propose to add an optional attribute to
the CF documents such as:

<project>_standard_name: use this attribute to define the meaning of
variables which have no accepted standard_name defined (yet). The
project name should be a single string without blanks or underscore
characters. These project-specific standard_names must follow the
guidelines for the construction of standard_names, but they will not be
evaluated by generic tools which test a data file for CF compliance.
Groups who wish to define such project-specific standard names should
first consider submitting their proposals to the CF mailing list for
inclusion in the CF standard name table. A variable can contain either a
standard_name or <project>_standard_name attribute but not both. A
long_name attribute is not needed when a <project>_standard_name is
given.
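
For illustration only, the rules in the proposed text above could be checked by a small routine like the one below. The function name and the regular expressions are assumptions; in particular, the pattern for a well-formed name (lowercase letters, digits and underscores) is only a rough stand-in for the actual standard_name construction guidelines.

```python
import re

# Project part: a single string without blanks or underscores (from the proposal).
ATTR_RE = re.compile(r"^(?P<project>[^\s_]+)_standard_name$")
# Assumed approximation of the standard_name construction guidelines.
NAME_RE = re.compile(r"^[a-z][a-z0-9_]*$")


def check_variable_attributes(attrs):
    """Return a list of problems found in one variable's attribute dict."""
    problems = []
    project_attrs = [
        key for key in attrs
        if key != "standard_name" and key.endswith("_standard_name")
    ]
    for key in project_attrs:
        if not ATTR_RE.match(key):
            problems.append(f"{key}: project name must not contain blanks or underscores")
        elif not NAME_RE.match(str(attrs[key])):
            problems.append(f"{key}: value does not look like a standard_name")
    if "standard_name" in attrs and project_attrs:
        problems.append("standard_name and <project>_standard_name must not both be present")
    if not project_attrs and "standard_name" not in attrs and "long_name" not in attrs:
        problems.append("one of standard_name, <project>_standard_name or long_name is needed")
    return problems
```

A variable carrying only a well-formed `htap_standard_name` attribute returns an empty list, while one carrying both that attribute and a `standard_name` returns one problem.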


Best regards,

Martin


------------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich
------------------------------------------------------------------------------------------------
_______________________________________________
CF-metadata mailing list
[email protected] <mailto:[email protected]>
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
--------------
I have my new work email address: [email protected]<mailto:[email protected]>
--------------
John Graybeal <mailto:[email protected]>
phone: 858-534-2162
System Development Manager
Ocean Observatories Initiative Cyberinfrastructure Project: http://ci.oceanobservatories.org
Marine Metadata Interoperability Project: http://marinemetadata.org
