Let the bike shedding continue!

On Wed, Jan 15, 2014 at 1:14 PM, John Graybeal <jbgrayb...@mindspring.com> wrote:

> I don't think multiple use cases from different individuals and
> communities should be categorized as "no reason other than maybe taste".
> Just sayin'...

Multiple use cases are examples, not reasons -- "I'd like to do that" or "I've been doing that" doesn't give a why... though you do below, thanks.

(And it certainly shouldn't be removed completely -- variable names with arbitrary bytes in them would really be a mess.) Is it ASCII-only now? It probably should stay that way.

> This prompts me to observe that somehow, in this brave new age of computer
> programming, people are developing netCDF software that supports Unicode
> characters -- Unicode!! -- in variable (attribute etc) names.

I'm a fan of Unicode, actually, but despite it being around a long time now, it's still a pain in the *&%&^ in C, C++, and, I'm guessing, Fortran. Not so bad in more modern languages, though apparently some use UTF-16 and don't always handle the larger code points correctly. So still a pain.

And as you can tell, I'm a fan of restricting names to particular classes of characters, and Unicode includes a lot of concepts that are pretty hard to define, e.g. "alphanumeric".

I can see how it would be really nice for non-English speakers or math and science geeks to use all sorts of great variable names, but I'm afraid opening up fully might be more of a nightmare than it is worth.

My pet programming language, Python, currently allows Unicode variable names, with restrictions, but this is a heck of a list to keep track of!

http://www.dcl.hpi.uni-potsdam.de/home/loewis/table-3131.html

> There will be netCDF files in the wild, used by scientists and normal
> people (especially normal people from non-English-speaking countries) that
> use all sorts of wild and crazy characters in their variable names.
> (Perhaps CF thinks these are "alphanumeric", in which case I've found a
> solution! The standard certainly is not explicitly ASCII-only.) By the
> way, I was amazed to learn that using Unicode in programming languages is
> starting to take hold.

but still only starting....

> At some point, we in the CF-supporting community are going to have to
> support the standard practices in this aspect that are going on everywhere
> else in the software world, or decide we want a permanent back-water for
> the 'scientists who are not interested in or capable of supporting these
> practices' (not my claim).

I think Unicode is a red herring for this issue -- not that it isn't interesting. For sure, full Unicode options would allow nice expressive variable names, but I'd still rather not have variable names that look like math expressions and aren't legal names in programming languages.

The current CF document says

"Variable, dimension and attribute names should begin with a letter and be composed of letters, digits, and underscores."

but "letters" is not very well defined when you get outside of ASCII -- it seems we have work to do.

> > Perhaps there are some reasons to want less-restrictive variable names --
> > I'm not always that imaginative, but if so, then present them.
>
> Let's just make the list so far, to get everyone up to speed with the
> discussion:
> * easier visual parsing (taste, yes, but practical also if you work with
>   lots of data sets from different communities)
> * embedding semantic meaning (taste)
> * clearly isolating the context (namespace, hierarchy)

I'm having trouble seeing how adding math symbols, etc. will help these -- they can be done pretty well with underscores...

> * matching attribute names that come from the source data
> * consistency with netCDF usage/files -> easier onboarding of those files

Mixed bag here -- CF is intended to be more restricted than netCDF....

> * Unicode/internationalization support

Orthogonal question, I think, unless there's a language that uses "+" as a letter....

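
To make the point about "letters" concrete, here is a minimal Python sketch of two readings of the CF sentence quoted above. It is only an illustration under assumptions: the function names is_cf_name_ascii and is_cf_name_unicode are made up here, neither is an official CF check, and the Unicode reading simply leans on Python's str.isalpha() and str.isalnum(), which is one possible definition of "letter" among several.

```python
import re

# ASCII-only reading: a letter first, then letters, digits, or underscores.
_ASCII_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_cf_name_ascii(name: str) -> bool:
    """Hypothetical check: the CF wording, read as ASCII-only."""
    return _ASCII_NAME.fullmatch(name) is not None


def is_cf_name_unicode(name: str) -> bool:
    """Hypothetical check: same wording, with Unicode-aware 'letter'/'digit'.

    Assumption: str.isalpha()/str.isalnum() stand in for 'letter' and
    'letter or digit'. Other definitions would give other answers.
    """
    if not name or not name[0].isalpha():
        return False
    return all(ch.isalnum() or ch == "_" for ch in name)


if __name__ == "__main__":
    for candidate in ["sea_surface_temp", "température", "temp+offset", "_hidden"]:
        print(candidate, is_cf_name_ascii(candidate), is_cf_name_unicode(candidate))
```

Under these assumptions, "sea_surface_temp" passes both readings, "température" passes only the Unicode one, and "temp+offset" fails both, which is why the math-symbol question looks separate from the Unicode one.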
I think we've only heard from me and Steve saying we didn't like this proposal -- don't take our word on it!

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov
_______________________________________________
CF-metadata mailing list
CF-metadata@cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata