Dear Karl et al.

Standard names are certainly a difficult business and it's a good idea to discuss how we should be dealing with them. They are much more than names, as Julia Collins remarked.

In your email, Karl, I am unclear whether you are proposing to replace the single standard_name attribute with many attributes, or to construct the single attribute by a more systematic procedure. I think there are several arguments in favour of having a single attribute:

* There are several possible optional qualifying bits of information (at_SURFACE, due_to_PROCESS, assuming_CONDITION, etc.). With a single attribute, you know at once what is specified. With separate attributes, you would need a separate way to find out what attributes might possibly be specified, in order to check whether they were there.

* It makes sure that essential definitive information is included, such as the sign-convention. Separate attributes can be accidentally omitted.

* It is more convenient in a program to examine a single attribute to find what you want. If the information is in several attributes, you need to have several if-conditions ANDed together. (Of course, sometimes this is necessary anyway if there is a condition on coordinates or cell_methods too, but we don't want to make things unnecessarily difficult.)

* The order of the bits of information may be significant e.g. the two components of a tensor, or the order of transformations. If they are in separate attributes, the ordering would be more awkward to record.

In favour of many attributes, you suggest that you might want to contain the data for quantities which currently have various standard names in a single data array e.g. concentrations of different chemical species, contributions of various processes. This might be desirable, but I'm not convinced. This is being discussed in the chemistry thread, where it has been remarked that although chemical models *do* have dimensions over chemical species, it is not essential to write out the data in that way. But it could be done, by making the chemical species a coordinate dimension, as we have been discussing.
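
As a rough illustration of the first and third arguments in the list above, here is a minimal Python sketch. Plain dicts stand in for a variable's attributes, and the split-attribute names (quantity, transformation, surface, process, condition) and both helper functions are hypothetical, invented only for this comparison; they are not part of CF.

```python
# Illustrative only: plain dicts stand in for the attributes of a netCDF variable.
# The split-attribute names below (quantity, transformation, ...) are hypothetical
# and are not part of the CF conventions.

KNOWN_PARTS = ("quantity", "transformation", "surface", "process", "condition")


def matches_single(attrs, wanted_name):
    """One comparison: the whole definition lives in standard_name."""
    return attrs.get("standard_name") == wanted_name


def matches_split(attrs, wanted_parts):
    """Several comparisons ANDed together.

    Every known part is compared, so an optional qualifier that is present
    on the variable but absent from the request counts as a mismatch. This
    only works if the full list of possible parts is known in advance.
    """
    return all(attrs.get(part) == wanted_parts.get(part) for part in KNOWN_PARTS)


single = {"standard_name": "tendency_of_air_temperature_due_to_longwave_heating"}
split = {
    "quantity": "air_temperature",
    "transformation": "tendency",
    "process": "longwave_heating",
}

# Looking for the total tendency, not the part due to one process.
print(matches_single(single, "tendency_of_air_temperature"))  # False
print(matches_split(split, {"quantity": "air_temperature",
                            "transformation": "tendency"}))    # False
```

With the single attribute, one string comparison is enough to reject the variable. With split attributes, the same result depends on the program knowing every qualifier that might be present, which is what KNOWN_PARTS stands for in this sketch.
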
It could be done in general by making the standard_name a coordinate variable, instead of an attribute, if there is a strong reason for doing it.

Of course, I am not arguing entirely against having more than one attribute. We do have cell_methods and standard_name qualifiers as well as basic standard names and coordinates, for instance. I think there's a good reason to separate out qualifiers which are always or usually relevant.

Supposing we continue with a single attribute, the issue is whether we can construct it with less effort. You say yourself that this is perhaps the more important objective. I agree that if we had a system for assembling new standard names from existing components, it would be useful. It would help people find out whether there was in fact already a name for something, and it would make sure we put things in the same order and used the same phrases wherever relevant. I believe there is actually more system to the existing names than the Guidelines indicate, and an automatic system would make this apparent.

However, the cases which are extensions of existing patterns are already the easy ones. They are not the cases which take most of the time and effort to deal with, I think. I've said this before - do you think I'm mistaken in this perception? Consider recent examples:

* The long debate about extreme statistics was mostly about how the metadata should be organised among standard name, coordinate variable and cell methods, not about the choice of standard name per se.

* The thread about "date and time" is so far more about what we want to distinguish than how to do it.

* I listed in another thread some questions that Stephen Griffies and I have been discussing for ocean quantities for CMIP5. These are the kind of decisions that took most time, not actually stringing together a name:

- Basin masks for tracer and velocity are the same geophysical quantity, but distinguished by coordinates.
- What does "ideal age" of sea water mean?
- Is the mixed-layer depth determined by a buoyancy criterion the same concept as mixed-layer depth determined by sigma-theta?
- Transports across various straits are all the same geophysical quantity, and the strait should be identified by some string-valued coordinate.
- How do we most usefully categorise the various kinds of ocean mixing in a way which will be helpful for comparing models?
- What is the clearest way to describe the energetics of vertical mixing: is it the rate of work against stratification, or the rate of change of potential energy?

While a better description of what we are doing would clarify the existence of difficult cases and help us think about them, I don't think it would reduce the hard work of deciding on the new distinctions, elements and constructions. However, I think it would still be valuable to follow this up. Because of the issue of ordering, and the large number of qualifiers, I think Robert Muetzelfeldt's description of the problem by using a grammar is more appropriate than using a number of independent attributes.

Finally, I wonder whether you could say more about what you mean by "this standard_name business seems a bit out of control"?

Best wishes

Jonathan

_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
