John Graybeal wrote:
> This prompts me to observe that somehow, in this brave new age of computer
> programming, people are developing netCDF software that supports Unicode
> characters -- Unicode!! -- in variable (attribute etc) names. There will be
> netCDF files in the wild, used by scientists and normal people (especially
> normal people from non-English-speaking countries) that use all sorts of
> wild and crazy characters in their variable names. (Perhaps CF thinks these
> are "alphanumeric", in which case I've found a solution! The standard
> certainly is not explicitly ASCII-only.) By the way, I was amazed to learn
> that using Unicode in programming languages is starting to take hold.

Yes, since June 2008 we have supported use of Unicode characters in names in both netCDF-3 and netCDF-4 software. The intent was to make netCDF more suitable for international use, rather than to encode mathematical operations in variable names. But we were also responding to needs of some user communities, for example atmospheric chemists who wanted to be able to use standard notations for chemical species in variable names.

Here's a small nonsensical example of ncdump output for a file containing Unicode names:

  http://www.unidata.ucar.edu/netcdf/workshops/most-recent/utilities/Unicode.html

The precise rules for netCDF names are in the format documentation, but the short version is:

  ... The first character of a name must be alphanumeric, a multi-byte
  UTF-8 character, or '_' (reserved for special names with meaning to
  implementations, such as the “_FillValue” attribute). Subsequent
  characters may also include printing special characters, except for
  '/' which is not allowed in names. Names that have trailing space
  characters are also not permitted.

That document also warns:

  Note that by using special characters in names, you may make your
  data not compliant with conventions that have more stringent
  requirements on valid names for netCDF components, for example the
  CF Conventions.

> At some point, we in the CF-supporting community are going to have to
> support the standard practices in this aspect that are going on everywhere
> else in the software world, or decide we want a permanent back-water for
> the 'scientists who are not interested in or capable of supporting these
> practices' (not my claim).
>
> Perhaps there are some reasons to want less-restrictive variable names --
> I'm not always that imaginative, but if so, then present them.
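
As a rough illustration of the short version of the naming rules quoted above, here is a minimal Python sketch. The function name is_valid_netcdf_name is hypothetical and is not part of any netCDF library. Rejecting an empty name and allowing embedded ASCII spaces are assumptions of this sketch, and the format documentation remains the authority on the precise rules.

```python
def is_valid_netcdf_name(name: str) -> bool:
    """Check a name against the short version of the netCDF naming rules.

    Hypothetical helper for illustration only; not a netCDF library call.
    """
    # Assumption: an empty name is treated as invalid.
    if not name:
        return False

    first = name[0]
    # First character: ASCII alphanumeric, '_', or any character that
    # needs more than one byte in UTF-8 (code point above 0x7F).
    first_ok = (
        (first.isascii() and first.isalnum())
        or first == "_"
        or ord(first) > 0x7F
    )
    if not first_ok:
        return False

    for ch in name[1:]:
        # '/' is never allowed in names.
        if ch == "/":
            return False
        # Assumption: for ASCII, "printing" means 0x20 through 0x7E,
        # so control characters and DEL are rejected.
        if ord(ch) < 0x80 and not (0x20 <= ord(ch) <= 0x7E):
            return False

    # Names with trailing space characters are not permitted.
    if name.endswith(" "):
        return False

    return True


if __name__ == "__main__":
    for candidate in ["temperature", "_FillValue", "a/b", "name ", "-x"]:
        print(repr(candidate), is_valid_netcdf_name(candidate))
```

A name that passes this check may still fail stricter conventions such as CF, which is the point of the warning quoted above.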
>
> Let's just make the list so far, to get everyone up to speed with the
> discussion:
> * easier visual parsing (taste, yes, but practical also if you work with
>   lots of data sets from different communities)
> * embedding semantic meaning (taste)
> * clearly isolating the context (namespace, hierarchy)
> * matching attribute names that come from the source data
> * consistency with netCDF usage/files -> easier onboarding of those files
> * Unicode/internationalization support

--Russ
_______________________________________________
CF-metadata mailing list
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
