On 1/15/2014 10:28 AM, John Graybeal wrote:
Perhaps a reference to the IETF RFC 2119 [1] (which defines these and a few
other related terms) should be added to the CF spec
Yes please, since discussion on this thread has already varied in its
understanding/application of those terms.
The ambiguity in the sentence "No variable or dimension names are standardized by this convention."
is also relevant. It could mean "This convention defines no requirements about variable or dimension
names." or "This convention does not specify any particular variable or dimension names." The
former meaning obviously reinforces the interpretation that 'should' is not a requirement.
It feels like we are veering towards hair-splitting, no? CF contains a
clear (if overly polite) statement about the proper way to create a
variable name: "/Variable, dimension and attribute names should begin
with a letter and be composed of letters, digits, and underscores/".
Clarification of the word "should" would be useful, yes, but the
discussion would be highly unlikely to end up changing foundation
compliance guidelines that have been in CF since the COARDS days.
Since it follows the preceding sentence, "/This convention does not
standardize any variable or dimension names./" also seems quite clear.
The loophole that is implied here -- that CF does not standardize
variable and dimension names, but other groups may do so -- has been
usefully exploited by groups like OceanSites, who have chosen to
standardize their own names and naming patterns sitting atop CF as a
normative standard.
While the arguments pushing for the restrictive naming convention (_ as the
only special character) are perhaps not strong, for my own use I don't have a
compelling use case on the need for more characters either. Mostly this is a
matter of personal taste -- I like being able to use . and - to help with
visual parsing and + and @ for semantic reasons, and they help reduce the
number of likely prefix collisions (which a single separator doesn't help with
at all).
Agree. There are factors sitting in the balance pans on both the pro
and con side. Special syntax names allow one to create very concise
names with (we hope) self-evident meanings. When you are the person
engaged in the act of defining a new file, this is especially
attractive. But over the lifecycle of the data -- considering data
discovery and data usage in a wide range of contexts -- the special
syntax characters come back to bite you time and again.
Mike's example of an embedded "dot" is an interesting one because it
cuts both ways. Yes, there are times when creating CF files where it
seems convenient to embed "." into a name in order to preserve a
hierarchy from the software of origin. But doing so makes a muddle of
downstream situations in which applications want to use the same
approach to designate a different hierarchy. For
example, downstream applications that want to refer to
varname.attributename are forced into ugly hacks like
"var.name.with.dots".attributename. (Admittedly, this Pandora's box has
already been opened. We are already forced to contend with this today.)
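To make the ambiguity concrete, here is a minimal sketch of how a downstream tool might try to resolve a dotted varname.attributename reference. The function name split_reference and the variable names in the usage note are hypothetical, chosen only for illustration; this is not how any particular netCDF tool works.

```python
def split_reference(ref: str, known_variables: set[str]) -> list[tuple[str, str]]:
    """Return every (variable, attribute) reading of a dotted reference.

    Each dot is tried as the boundary between variable name and attribute
    name; a reading is kept only if the variable part is a known variable.
    """
    readings = []
    parts = ref.split(".")
    for i in range(1, len(parts)):
        var = ".".join(parts[:i])
        attr = ".".join(parts[i:])
        if var in known_variables:
            readings.append((var, attr))
    return readings
```

With only a variable named temp, the reference temp.units has a single reading. If a file holds both sea and sea.surface, the reference sea.surface.units has two readings, and the tool has to fall back on quoting or guessing.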
A point I feel we ought to remind ourselves of is that, in an issue like
the naming of variables, we should try to put ourselves into the head
space of the users of the data -- scientists. Funky looking camel-case
strings are bread and butter to software developers, but not so much to
the sensibilities of scientists (particularly older ones).
There is also a social benefit from relaxing the CF almost-standard:
on-boarding. We want to encourage netCDF users to transition to CF. Minimizing
the number of inconsistencies seems practical and forward-thinking. Forcing a
netCDF user (which may include lots of HDF users too, these days) to abandon
established attribute names is a significant cost for the affected users, now
and going forward.
I agree that this is a valid consideration. There is a gray area
surrounding this issue.
- Steve
John
On Jan 15, 2014, at 10:00, Ethan Davis <[email protected]> wrote:
Hi all,
The use of "should" may, by many, be interpreted as a recommendation
rather than as a requirement.
Though the terms "must", "should", and "may" are used throughout the CF
spec, I am not finding any text that defines those terms.
Perhaps a reference to the IETF RFC 2119 [1] (which defines these and a
few other related terms) should be added to the CF spec. Though it seems
that might require a fairly full review of the uses in CF of the terms
defined in RFC 2119.
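As a rough starting point for such a review, the sketch below counts how often the RFC 2119 terms occur in a piece of text. It matches case-insensitively, on the assumption that the CF document writes these words in lowercase, so it will also count ordinary uses of words like "may" that carry no normative intent. The names RFC2119_TERMS and count_terms are hypothetical.

```python
import re
from collections import Counter

# Key words defined by RFC 2119. Two-word forms come first so that
# "must not" is not counted as a bare "must".
RFC2119_TERMS = [
    "must not", "shall not", "should not",
    "must", "required", "shall", "should", "recommended", "may", "optional",
]

PATTERN = re.compile(
    r"\b(" + "|".join(t.replace(" ", r"\s+") for t in RFC2119_TERMS) + r")\b",
    re.IGNORECASE,
)

def count_terms(text: str) -> Counter:
    """Count occurrences of each RFC 2119 term in text, ignoring case."""
    counts = Counter()
    for match in PATTERN.finditer(text):
        # Normalize case and internal whitespace before counting.
        counts[" ".join(match.group(1).lower().split())] += 1
    return counts
```

Running count_terms over the text of the conventions document would give a first estimate of how many sentences a full review would need to look at.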
Ethan
[1] http://www.ietf.org/rfc/rfc2119.txt
On 1/15/2014 10:46 AM, Karl Taylor wrote:
All,
Yes, that statement seems quite definitive and unambiguous, and for the
reasons stated in other emails, I support retaining it.
regards,
Karl
On 1/15/14 9:37 AM, Steve Hankin wrote:
On 1/15/2014 9:24 AM, Jim Biard wrote:
Chris,
The point is, the Conventions themselves state that there is *no
standard*. People are all the time trying to add meaning to variable
names, but the standard actually states that the meaning is to reside
in the attributes. The variable names are just keys for
differentiating the variables. (I could name all my variables
“vNNNNNNNNNN”, where N is a digit, and I would be completely valid
according to the standard.) The long_name and standard_name
attributes are the places where descriptors of the variable content
are to be found.
So I’m raising a question. _Is there actually anything other than
sentiment (i.e., an actual rule) that anyone can point to that
prevents someone from using “new” characters in their variable names?_
How about the lines from the CF document that you cut-pasted (thank you):
/Variable, dimension and attribute names should begin with a
letter and be composed of letters, digits, and underscores. Note
that this is in conformance with the COARDS conventions, but is
more restrictive than the netCDF interface which allows use of the
hyphen character. The netCDF interface also allows leading
underscores in names, but the NUG states that this is reserved for
system use./
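The rule quoted above is simple enough to check mechanically. A minimal sketch, assuming names are restricted to ASCII letters (the CF text does not say so explicitly); the names CF_NAME and follows_cf_naming are hypothetical:

```python
import re

# Leading letter, then any mix of letters, digits, and underscores.
# ASCII-only letters are an assumption, not something the CF text states.
CF_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

def follows_cf_naming(name: str) -> bool:
    """True if name follows the CF recommendation for variable,
    dimension, and attribute names."""
    return CF_NAME.fullmatch(name) is not None
```

Under this check "fluffy_bunny" passes, while "fluffy-bunny" (hyphen), "_reserved" (leading underscore), and "var.name" (dot) do not. Whether failing the check makes a file non-compliant or merely discouraged is exactly the "should" question under discussion.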
- Steve
Grace and peace,
Jim
*Jim Biard*
CICS-NC <http://www.cicsnc.org/>
Visit us on Facebook <http://www.facebook.com/cicsnc>
*Research Scholar*
Cooperative Institute for Climate and Satellites NC <http://cicsnc.org/>
North Carolina State University <http://ncsu.edu/>
NOAA's National Climatic Data Center <http://ncdc.noaa.gov/>
151 Patton Ave, Asheville, NC 28801
e: [email protected]
o: +1 828 271 4900
On Jan 15, 2014, at 12:00 PM, Chris Barker <[email protected]> wrote:
On Wed, Jan 15, 2014 at 7:39 AM, jbiard <[email protected]> wrote:
I don't think we should use ease of mapping variable names to a
programming language as a reason for allowing (or not allowing)
any particular character in variable names.
Why not? maybe not a compelling reason, but I can't imagine a
compelling reason to have more flexible naming conventions, either.
CF has, as I understood it, considered variable names as
completely up to the producer, relying on attributes to provide
meaning. So, I can name a temperature variable "fluffy_bunny"
if I want to, and it is completely valid.
valid yes, a good idea? probably not.
Section 1.3 of the Conventions states, "No variable or dimension
names are standardized by this convention."
so there are no standard variable names -- that's not the same as
standards for variable names....
Personally, I wish there were standards for variable names, it would
make it easier to code against -- but that cat's out of the bag. But
this cat isn't: the restrictions have been there for a long time,
so the question now is:
what are the reasons for easing those restrictions?
and
what are the reasons for keeping those restrictions?
we've given a few reasons for keeping them (maybe not all that
compelling to you, but reasons nonetheless) -- what are the reasons
for relaxing them, other than "I like this naming convention that is
currently not allowed" ?
I'm not convinced that "fluffy-bunny" is any more readable or
anything else than "fluffy_bunny"
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
[email protected]
_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
--
Ethan Davis UCAR Unidata Program
[email protected] http://www.unidata.ucar.edu
------------------------------------
John Graybeal
Marine Data Manager
M +1 408 675-5445
skype: graybealski
Marinexplore
920 Stewart Drive
Sunnyvale 94085
California, USA
www.marinexplore.com