On 1/15/2014 10:28 AM, John Graybeal wrote:
Perhaps a reference to the IETF RFC 2119 [1] (which defines these and a few 
other related terms) should be added to the CF spec
Yes please, since discussion on this thread has already varied in its 
understanding/application of those terms.

The ambiguity in the sentence "No variable or dimension names are standardized by this convention." 
is also relevant. It could mean "This convention defines no requirements about variable or dimension 
names." or "This convention does not specify any particular variable or dimension names." The 
former meaning obviously reinforces the interpretation that 'should' is not a requirement.

It feels like we are veering towards hair-splitting, no? CF contains a clear (if overly polite) statement about the proper way to create a variable name: "/Variable, dimension and attribute names should begin with a letter and be composed of letters, digits, and underscores/". Clarification of the word "should" would be useful, yes, but the discussion would be highly unlikely to end up changing foundational compliance guidelines that have been in CF since the COARDS days.

Since it follows the preceding sentence, "/This convention does not standardize any variable or dimension names./" also seems quite clear. The loophole that is implied here -- that CF does not standardize variable and dimension names, but other groups may do so -- has been usefully exploited by groups like OceanSites, who have chosen to standardize their own names and naming patterns sitting atop CF as a normative standard.

While the arguments pushing for the restrictive naming convention (_ as the 
only special character) are perhaps not strong, for my own use I don't have a 
compelling use case on the need for more characters either. Mostly this is a 
matter of personal taste -- I like being able to use . and - to help with 
visual parsing and + and @ for semantic reasons, and they help reduce the 
number of likely prefix collisions (which a single separator doesn't help with 
at all).
Agree. There are factors sitting in the balance pans on both the pro and con side. Special syntax names allow one to create very concise names with (we hope) self-evident meanings. When you are the person engaged in the act of defining a new file, this is especially attractive. But over the lifecycle of the data -- considering data discovery and data usage in a wide range of contexts -- the special syntax characters come back to bite you time and again.

Mike's example of an embedded "dot" is an interesting one because it cuts both ways. Yes, there are times when creating CF files where it seems convenient to embed "." into a name in order to preserve a hierarchy from the software of origin. But we then make a muddle of downstream situations in which applications want to use the same approach to designate a different hierarchy. For example, downstream applications that want to refer to varname.attributename are forced into ugly hacks like "var.name.with.dots".attributename. (Admittedly, this Pandora's box has already been opened. We are already forced to contend with this today.)
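
To make the varname.attributename problem concrete, here is a minimal Python sketch. The helper names (candidate_splits, resolve) and the variable names in the usage note are hypothetical, and the sketch assumes that attribute names may themselves contain dots. It lists every way a dotted reference could be read and keeps the readings whose variable part actually exists in the file.

```python
def candidate_splits(reference):
    """List every (variable, attribute) pair a dotted reference could mean."""
    parts = reference.split(".")
    return [
        (".".join(parts[:i]), ".".join(parts[i:]))
        for i in range(1, len(parts))
    ]


def resolve(reference, known_variables):
    """Keep only the readings whose variable part is a known variable name."""
    return [
        (var, attr)
        for var, attr in candidate_splits(reference)
        if var in known_variables
    ]


# Hypothetical file with dots embedded in one variable name.
dotted = resolve("temp.surface.units", {"temp", "temp.surface"})

# The same content with underscore-only variable names.
plain = resolve("temp_surface.units", {"temp", "temp_surface"})

print(len(dotted), len(plain))
```

With dots in the variable names the reference has two plausible readings, so the application has to fall back on quoting. With underscore-only names there is exactly one.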

A point I feel we ought to remind ourselves of is that, in an issue like the naming of variables, we should try to put ourselves into the head space of the users of the data -- scientists. Funky-looking camel-case strings are bread and butter to software developers, but not so much to the sensibilities of scientists (particularly older ones).

There is also a social benefit from relaxing the CF almost-standard: 
on-boarding. We want to encourage netCDF users to transition to CF. Minimizing 
the number of inconsistencies seems practical and forward-thinking. Forcing a 
netCDF user (which may include lots of HDF users too, these days) to abandon 
established attribute names is a significant cost for the affected users, now 
and going forward.
I agree that this is a valid consideration. There is gray surrounding this issue.

    - Steve

John


On Jan 15, 2014, at 10:00, Ethan Davis <[email protected]> wrote:

Hi all,

The use of "should" may, by many, be interpreted as a recommendation
rather than as a requirement.

Though the terms "must", "should", and "may" are used throughout the CF
spec, I am not finding any text that defines those terms.

Perhaps a reference to the IETF RFC 2119 [1] (which defines these and a
few other related terms) should be added to the CF spec. Though it seems
that might require a fairly full review of the uses in CF of the terms
defined in RFC 2119.

Ethan

[1] http://www.ietf.org/rfc/rfc2119.txt

On 1/15/2014 10:46 AM, Karl Taylor wrote:
All,

Yes, that statement seems quite definitive and unambiguous, and for the
reasons stated in other emails, I support retaining it.

regards,
Karl

On 1/15/14 9:37 AM, Steve Hankin wrote:
On 1/15/2014 9:24 AM, Jim Biard wrote:
Chris,

The point is, the Conventions themselves state that there is *no
standard*.  People are all the time trying to add meaning to variable
names, but the standard actually states that the meaning is to reside
in the attributes.  The variable names are just keys for
differentiating the variables.  (I could name all my variables
“vNNNNNNNNNN”, where N is a digit, and I would be completely valid
according to the standard.)  The long_name and standard_name
attributes are the places where descriptors of the variable content
are to be found.

So I’m raising a question. _Is there actually anything other than
sentiment (i.e., an actual rule) that anyone can point to that
prevents someone from using “new” characters in their variable names?_
How about the lines from the CF document that you cut-pasted (thank you):

    /Variable, dimension and attribute names should begin with a
    letter and be composed of letters, digits, and underscores. Note
    that this is in conformance with the COARDS conventions, but is
    more restrictive than the netCDF interface which allows use of the
    hyphen character. The netCDF interface also allows leading
    underscores in names, but the NUG states that this is reserved for
    system use./
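
One way to read the quoted sentence as a mechanical check is the Python sketch below. It assumes "letter" means an ASCII letter, and the names CF_NAME and follows_cf_naming are made up for illustration. Whether a failing name is actually invalid or merely discouraged depends on how "should" is interpreted, which is the open question in this thread.

```python
import re

# Assumed reading of the quoted CF sentence: the first character is an ASCII
# letter, and every later character is an ASCII letter, digit, or underscore.
CF_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def follows_cf_naming(name):
    """Return True if the whole name matches the recommended pattern."""
    return CF_NAME.fullmatch(name) is not None


# Names taken from or modelled on the examples in this thread.
for candidate in ["fluffy_bunny", "fluffy-bunny", "v0000000001", "_reserved"]:
    print(candidate, follows_cf_naming(candidate))
```

The hyphenated name and the name with a leading underscore fail the check. Both are accepted by the netCDF interface, per the quoted passage.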

    - Steve
Grace and peace,

Jim

CICS-NC <http://www.cicsnc.org/>
Visit us on Facebook <http://www.facebook.com/cicsnc>
*Jim Biard*
*Research Scholar*
Cooperative Institute for Climate and Satellites NC <http://cicsnc.org/>
North Carolina State University <http://ncsu.edu/>
NOAA's National Climatic Data Center <http://ncdc.noaa.gov/>
151 Patton Ave, Asheville, NC 28801
e: [email protected] <mailto:[email protected]>
o: +1 828 271 4900





On Jan 15, 2014, at 12:00 PM, Chris Barker <[email protected]
<mailto:[email protected]>> wrote:

On Wed, Jan 15, 2014 at 7:39 AM, jbiard <[email protected]
<mailto:[email protected]>> wrote:

    I don't think we should use ease of mapping variable names to a
    programming language as a reason for allowing (or not allowing)
    any particular character in variable names.

Why not? maybe not a compelling reason, but I can't imagine a
compelling reason to have more flexible naming conventions, either.

    CF has, as I understood it, considered variable names as
    completely up to the producer, relying on attributes to provide
    meaning.  So, I can name a temperature variable "fluffy_bunny"
    if I want to, and it is completely valid.

valid yes, a good idea? probably not.

    Section 1.3 of the Conventions states, "No variable or dimension
    names are standardized by this convention."

so there are no standard variable names -- that's not the same as
standards for variable names....

Personally, I wish there were standards for variable names, it would
make it easier to code against -- but that cat's out of the bag. But
this cat isn't: the restrictions have been there for a long time,
so the question now is:

what are the reasons for easing those restrictions?

and

what are the reasons for keeping those restrictions?

we've given a few reasons for keeping them (maybe not all that
compelling to you, but reasons nonetheless) -- what are the reasons
for relaxing them, other than "I like this naming convention that is
currently not allowed" ?

I'm not convinced that "fluffy-bunny" is any more readable or
anything else than "fluffy_bunny"

-Chris


--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[email protected] <mailto:[email protected]>
_______________________________________________
CF-metadata mailing list
[email protected] <mailto:[email protected]>
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata



--
Ethan Davis                                       UCAR Unidata Program
[email protected]                    http://www.unidata.ucar.edu
------------------------------------
John Graybeal
Marine Data Manager

M +1 408 675-5445
skype: graybealski
Marinexplore
920 Stewart Drive
Sunnyvale 94085
California, USA
www.marinexplore.com
