[rust-dev] Unicode identifiers

Graydon Hoare Fri, 25 Feb 2011 11:38:26 -0800

Hi,

I came across some 3rd party discussion of my choice of ASCII-range identifiers (and limitation of non-ASCII-range Unicode to strings, chars and comments) that cited this as a major problem in the language. This prompted a little more research and reading on my part, and talking with people who had differing experiences with non-English identifier use in programming languages. I now believe that my earlier impression of "almost universal" adoption of ASCII-range identifiers in non-English programming shops was mistaken, and that there is actually substantial value to such programmers in having non-ASCII range available.

Moreover, looking at the approach taken by PEP 3131 (delegating to the NFKC-normalization-closed sets defined in UAX 31, XID_Start/XID_Continue), I see the "proper solution" has a better-established consensus than I had previously understood to exist. So I've updated the Rust manual to delegate to these specifications as well, and filed a bug (issue 242, if anyone wants to jump on it) to get the lexer patched up to handle this change.
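
For anyone who wants to poke at what the XID_Start/XID_Continue rule accepts, here is a minimal sketch in Python, whose own identifier syntax follows PEP 3131. It is not the Rust lexer and says nothing about how issue 242 will be implemented; the helper names are made up for illustration, and Python's rule also admits a leading underscore and does not reject keywords.

```python
import unicodedata


def normalize_identifier(name: str) -> str:
    """Return the NFKC-normalized form of a candidate identifier."""
    return unicodedata.normalize("NFKC", name)


def looks_like_identifier(name: str) -> bool:
    """Approximate the UAX 31 check using Python's own identifier rule.

    str.isidentifier() accepts an XID_Start character (or underscore)
    followed by XID_Continue characters, as described in PEP 3131.
    Illustrative helper only; keywords are not filtered out.
    """
    return normalize_identifier(name).isidentifier()


if __name__ == "__main__":
    # A few candidates: accented Latin, CJK, a leading digit, a hyphen.
    for candidate in ["na\u00efve", "\u53d8\u91cf", "2fast", "a-b"]:
        print(repr(candidate), looks_like_identifier(candidate))
```

The normalization step matters because two spellings that look the same can differ in code points: the ligature U+FB01 followed by "le" normalizes to plain "file" under NFKC, so both spellings name the same identifier.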

Practical implications of this change are few for people (a) already comfortable with ASCII-range identifiers or (b) working outside the lexer. Hopefully it'll make things more welcoming for people who don't fit into case (a) though.

Apologies for the thrashing about on this issue. I misunderstood the current state of play (possibly due to a little too much time spent in despair while trying to upgrade ECMAScript 4 to "any Unicode spec after 1995", but that's a whole other story...)


-Graydon
