Hi John,
the problem is that the compound name is obvious to a human but very hard for
a machine to extract, because we don't have a strict set of grammar rules.
What you are suggesting sounds almost as if you wanted to replace the
standard_names with some other mechanism of controlled vocabulary: a collection
of URIs from different fields and different servers, each pointing to the
actual reference term. Perhaps I got you wrong here, but I would
feel rather uneasy about going too far in this direction at present. We were
very happy to find out in Dublin that the community (of atmospheric chemists)
is beginning (!) to recognize standard_names as a valuable resource enabling
them to speak about the same thing with the same words (even if the names are
sometimes a bit clumsy), and having one "master list" of terms seems much
simpler and more resilient to me at present. Yet it may be good to reflect
within the standard_name list what is often brought up in the list discussions
anyhow: some communities have established controlled vocabularies for their
field, and - as far as I follow the discussions - this is usually a good
argument for accepting a standard_name proposal, unless it conflicts with
other rules.
The specific situation in atmospheric chemistry (maybe not so specific but
at least very prominent) is that the "variable name space" is not
1-dimensional, but multi-dimensional, i.e. for each (new) compound we can
easily add a dozen or more new terms (= standard_names) which describe the
molar fraction or mass content in the atmosphere, emission or deposition fluxes
(due to a myriad individual processes if need be), chemical reaction rates or
turnover rates, etc. My proposal to add the compound_name and a URI/URL to the
accepted standard vocabulary list for compounds merely aims at making sure we
can link the various compound properties together, so that an application can
understand that "mole_fraction_of_trimethylbenzene_in_air" is linked to
"tendency_of_atmosphere_mass_content_of_trimethylbenzene_in_air_due_to_emissions_from_traffic",
for example. If you show me a parser that can extract all compound names from
the standard_name table and which would work for all future versions of the
standard_name table, then we might not need this (although the reference to a
controlled vocabulary list might still be useful and take a little
responsibility away from CF).
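
A minimal sketch of the linking problem described above, in Python. The
table excerpt and its compound tag values are hypothetical; only the
standard_names themselves come from this thread. The helper names
naive_compound and group_by_compound are made up for illustration. The
sketch contrasts a single-pattern guess with a lookup on an explicit
compound tag.

```python
import re
from collections import defaultdict

# Hypothetical excerpt of a standard_name table carrying the proposed
# compound_name tag. The tag values are illustrative assumptions.
TABLE = {
    "mole_fraction_of_trimethylbenzene_in_air": "trimethylbenzene",
    "tendency_of_atmosphere_mass_content_of_trimethylbenzene_in_air"
    "_due_to_emissions_from_traffic": "trimethylbenzene",
    "atmosphere_mass_content_of_carbon_monoxide": "carbon monoxide",
}


def naive_compound(standard_name):
    """Guess the compound with one pattern; shown only to illustrate the problem."""
    match = re.search(r"_of_(.+?)_in_air", standard_name)
    return match.group(1) if match else None


def group_by_compound(table):
    """Link standard_names that share the same explicit compound tag."""
    groups = defaultdict(list)
    for standard_name, compound in table.items():
        groups[compound].append(standard_name)
    return dict(groups)


linked = group_by_compound(TABLE)
guesses = {name: naive_compound(name) for name in TABLE}
```

With the explicit tag, both trimethylbenzene names end up in one group. The
single pattern works for the mole fraction name, but it returns
"atmosphere_mass_content_of_trimethylbenzene" for the tendency name and
nothing at all for the carbon monoxide name, which has no "_in_air" suffix.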
Cheers,
Martin
From: John Graybeal [mailto:[email protected]]
Sent: Monday, 10 September 2012 18:28
To: Schultz, Martin
Cc: Lowry, Roy K.; [email protected]
Subject: Re: [CF-metadata] Expanding the standard_name metadata
Congratulations on your great meeting!
Concur that when the name is derivable fairly obviously from the other matter,
it should not be required. In this case the CF name is supposed to be clear
enough that the compound name should be within it already. Suggest this be
available as an option if you value it highly (it is perhaps as much the label,
as the unique identifier?).
We are bootstrapping best semantic practices for a long lifetime of their use
(hopefully), and so having a URL (well, URI/IRI; yours works) is the principal
computational reference. (How does the computer know with some confidence what
the thing is?) Yes, definitely a web 2.0 kind of answer. Although a particular
unique identifier may no longer be maintained in 10 or 20 years, it is likely
enough of a 'standard reference' that it has been mapped to its replacement, or
even forward linked from the old URL. Absolute worst case, a web search should
find traces of it.
To generalize this (for creatures, phenomena, etc.), could we call it not
"compound_codelist", but "object_codelist" or "object_IRI", as the compound is
the direct object of the prepositional phrase? OK, that's pretty
grammar-centric and therefore obscure, but I see the names quickly described
via their mapped components (a great thing!). This is very much the first step
of that.
John
On Sep 10, 2012, at 02:35, Schultz, Martin wrote:
Hi Roy,
thanks for supporting this idea. Why include the "compound_name"? I didn't
really think about this, but only copied what is common practice in ISO
metadata files. They usually pair a name with the link to the controlled
vocabulary list. It could have to do with resilience. What do you do if the
controlled vocabulary server doesn't work at the time when you need it?
Actually, I would tend to think that the "compound_name" tag is the more
important one, and I would see the URL more in the sense of a bibliographic
reference. In a sense, this bibliographic reference lends some weight to the
name. But perhaps I am still living too much in the web 1.0 world?
Cheers,
Martin
From: Lowry, Roy K. [mailto:[email protected]]
Sent: Monday, 10 September 2012 11:03
To: Schultz, Martin; [email protected]
Subject: RE: Expanding the standard_name metadata
Hello Martin,
I really like the idea of linking the Standard Name to a resolvable URL for
the compound, but would question the need for adding the compound name to the
standard name table as well as the URL. The plaintext compound name has to be
included in the Standard Name and is available through resolution of the URL.
Why introduce a further duplicate of the information with the inherent risk of
discrepancies creeping in?
In a similar vein, should Standard Names get deeper into biological parameters
it would be good to include a link to the World Register for Marine Species
(WoRMS) for the taxon.
Cheers, Roy.
________________________________
From: CF-metadata [[email protected]] On Behalf Of Schultz, Martin
[[email protected]]
Sent: 10 September 2012 09:33
To: [email protected]
Subject: [CF-metadata] Expanding the standard_name metadata
Dear all,
last week, we had a rather successful workshop on "Metadata for air
quality and atmospheric composition" in Dublin. It was nice to see that the
community (i.e. those present) seemed to agree without much discussion that
ISO 19115 (-1) is the way to go for discovery metadata, while CF is the way
forward for descriptive metadata to be stored in (usually) netCDF data files.
The main discussions at the workshop centered around ISO issues, but there was
one interesting point that came up with respect to CF standard_names and their
relation to controlled vocabulary:
We did have discussions on this list earlier about a more grammar-oriented
approach, and this was also brought up at our workshop again, mainly in light
of the "threat" that the atmospheric composition group will soon begin to flood
this email list with hundreds of new names in order to add additional chemical
compounds. As we have seen with the problem of standard_names for emissions,
this is stretching the limits of the current ways to operate and publish new
standard_names. I don't want to argue against the concept of one "flat" master
list (we have been through this and there are good reasons for sticking to this
concept), but I would like to stimulate a discussion about adding more
"metadata" to the standard_name table in order to better link it to other
controlled vocabulary lists and avoid confusing inconsistencies, for example in
the naming of chemical compounds. Specifically, I would like to propose two
"conditional" tags compound_name and compound_codelist in the standard_name
list which shall appear for all standard_names having to do with chemical
compounds. Example:
<entry id="atmosphere_mass_content_of_carbon_monoxide">
<compound_name>Carbon monoxide</compound_name>
<compound_codelist>http://rdfdata.eionet.europa.eu/airquality/components/10</compound_codelist>
<canonical_units>kg m-2</canonical_units>
<description>"Content" indicates a quantity per unit area. The "atmosphere
content" of a quantity refers to the vertical integral from the surface to the
top of the atmosphere. For the content between specified levels in the
atmosphere, standard names including content_of_atmosphere_layer are used. The
chemical formula of carbon monoxide is CO.</description>
</entry>
In a way, this may be seen as duplication of information, but it would
really help to tie ends together, because it is practically impossible to parse
the standard_names in order to extract such information (due to the lack of a
strict grammar). There may be other tags which could be useful to add, and one
will have to decide about the pros and cons in each case. However, for compound
names I would see a clear need arising now.
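
A minimal sketch of how an application might read the two proposed tags
from an entry like the example above, using Python's standard
xml.etree.ElementTree module. The helper name read_compound is
hypothetical, and the tags are the ones proposed here, not part of the
published standard_name table. Because the tags are meant to be
conditional, an entry without them simply yields None for both values.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example entry from this message (description omitted).
ENTRY_XML = """
<entry id="atmosphere_mass_content_of_carbon_monoxide">
  <compound_name>Carbon monoxide</compound_name>
  <compound_codelist>http://rdfdata.eionet.europa.eu/airquality/components/10</compound_codelist>
  <canonical_units>kg m-2</canonical_units>
</entry>
"""


def read_compound(entry_xml):
    """Return the standard_name and the proposed compound tags of one entry."""
    entry = ET.fromstring(entry_xml.strip())
    return {
        "standard_name": entry.get("id"),
        # findtext returns None when the conditional tag is absent.
        "compound_name": entry.findtext("compound_name"),
        "compound_codelist": entry.findtext("compound_codelist"),
    }


info = read_compound(ENTRY_XML)
```

The compound name stays usable even when the code list server cannot be
reached, and the URL serves as the reference behind it.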
Best regards,
Martin
PD Dr. Martin G. Schultz
IEK-8, Forschungszentrum Jülich
D-52425 Jülich
Ph: +49 2461 61 2831
_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
----------------
John Graybeal <mailto:[email protected]> phone: 858-534-2162
Product Manager
Ocean Observatories Initiative Cyberinfrastructure Project:
http://ci.oceanobservatories.org
Marine Metadata Interoperability Project: http://marinemetadata.org