I agree that some use cases would be helpful. I'm not sure about the specific 
proposal that initiated the discussion, but I do agree with the thought behind 
it that we should have a considered and reasoned policy on this, rather than 
just having a frozen-in rule based on past library constraints. 

One reason that we might want to depart from the full freedom allowed in NetCDF 
is that we have, in CF, a range of different attributes to describe a variable. 
The `long_name` is designed to hold human-readable text, whereas the 
`standard_name` and `units` both have strongly constrained values. 

Some application libraries need, in places, identifiers with a restricted 
character set. For example, I can construct a `collections.namedtuple` with 
name `tas`, but not with name `tas.Amon` because, in Python, "Type names and 
field names can only contain alphanumeric characters and underscores" (cited 
from an error message generated by `collections.namedtuple`). Could this be 
considered as a use case for having a place in the convention to specify, for CF 
objects, an identifier which is composed of "alphanumeric characters and 
underscores"? The variable name is the de facto place which many people use for 
this kind of identifier (perhaps because of legacy packages). 
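
The `namedtuple` behaviour described above can be reproduced with a short 
sketch. The helper `try_namedtuple` and the field names `value` and `units` are 
my own, purely for illustration; the exact wording of the error message varies 
between Python versions, so the sketch only checks that a `ValueError` is raised.

```python
from collections import namedtuple


def try_namedtuple(type_name):
    """Return True if `type_name` is accepted as a namedtuple type name.

    The field names here are hypothetical and chosen only for illustration.
    """
    try:
        namedtuple(type_name, ["value", "units"])
    except ValueError:
        # Raised when the type name is not a valid Python identifier.
        return False
    return True


accepted = try_namedtuple("tas")       # plain identifier
rejected = try_namedtuple("tas.Amon")  # contains a dot
```

Here `accepted` is `True` and `rejected` is `False`. Note that Python's own 
notion of an identifier also admits non-ASCII letters, so it is not exactly the 
same set as "alphanumeric characters and underscores".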

Note that the `standard_name` fits the character restriction, but does not fit 
the use case because different variables may have the same `standard_name`.

Another potential use case is for identifiers of concepts described in [RDF 
Turtle](https://www.w3.org/TR/turtle/), which has a character restriction on 
object names that is broader, I think, than "alphanumeric characters and 
underscores", but definitely narrower than the 137 thousand characters 
available in UTF-8.

The desire to have a simple identifier is linked, in my mind at least, to the 
concept of a namespace, which is being discussed in the context of NetCDF (see 
[NetCDF-ld](https://github.com/opengeospatial/netcdf-ld) and discussion on 
[namespace delimiters](https://github.com/opengeospatial/netcdf-ld/issues/50)). 
I don't think this is simply a matter of upgrading software to make it accept 
generic strings: there is a wide range of applications that exploit identifiers 
constructed from a limited character set in order to enable the use of 
identifiers within a text string.
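
As a rough illustration of that last point, here is a minimal sketch of an 
application that pulls variable names out of an expression string. The pattern 
`IDENTIFIER`, the helper `names_in` and the expression strings are all 
hypothetical and not taken from any existing CF or NetCDF tool; the sketch 
assumes identifiers are limited to ASCII letters, digits and underscores.

```python
import re

# Assumed identifier syntax: ASCII letters, digits and underscores,
# not starting with a digit.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def names_in(expression):
    """Return the identifiers found in a text expression, in order."""
    return IDENTIFIER.findall(expression)


# With restricted names, the expression splits cleanly into two variables.
clean = names_in("tasmax - tasmin")

# If "-" were allowed inside names, the same text becomes ambiguous:
# the pattern cannot tell a hyphenated name from a subtraction.
ambiguous = names_in("tas-max - tas-min")
```

With the restricted character set, `clean` contains exactly the two intended 
names, whereas `ambiguous` is split into four fragments because the application 
has no way of knowing where one name ends and an operator begins.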

Reply to this email directly or view it on GitHub:
https://github.com/cf-convention/cf-conventions/issues/237#issuecomment-728814702
