Re: [rust-dev] Unicode vs hex escapes in Rust

Graydon Hoare Fri, 06 Jul 2012 10:09:34 -0700

On 12-07-06 9:07 AM, Behdad Esfahbod wrote:

So, my inclination is to follow your suggestion and actually go with the C and
C++ style. But it's not exactly universal!


Thanks for the survey!  Indeed.  Programming languages is not my strong suit.


There was some further followup here, as well as in the bug:

https://github.com/mozilla/rust/issues/2800

and I wonder if you could take a moment to comment further (indeed, if anyone wants to simply state their pre-existing bias here, it's not a bad topic to just "take a survey" on):


Given the following:

  - our strings _are_ forced to be well-formed utf8 -- that is,
    we won't in any case allow you to write a non-utf8 byte in the
    middle of a string.

  - Sub-0x7f and super-0x1000 codepoints are unaffected by this
    choice, no matter what we do, as their escapes are unambiguous.

It seems to me we're really only talking about the likelihood of a user wanting to denote, directly, the 0x80 .. 0xff utf8 code units. And comparing that with the likelihood of a user wanting to denote a codepoint in that codepoint range, _as a codepoint_, and have it expand to the appropriate utf8 code units.
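
To make the two readings concrete, here is a minimal sketch using 0xE9 as the example value. It is written in present-day Rust syntax, which postdates this message and is used only for illustration; the function names as_codepoint and as_code_unit are hypothetical. The raw-byte reading uses a byte-string literal, since that form is not required to be utf8.

```rust
/// Reading 1: treat 0xE9 as a codepoint (U+00E9) and let it expand
/// to the utf8 code units that encode it.
fn as_codepoint() -> Vec<u8> {
    let s = "\u{e9}";
    s.as_bytes().to_vec()
}

/// Reading 2: treat 0xE9 as a single raw code unit (one byte).
fn as_code_unit() -> Vec<u8> {
    b"\xe9".to_vec()
}

fn main() {
    let cp = as_codepoint();
    let cu = as_code_unit();

    // The codepoint reading yields a two-byte utf8 sequence.
    println!("codepoint reading: {:02x?}", cp);

    // The code-unit reading yields one byte, which on its own is
    // not well-formed utf8.
    println!("code-unit reading: {:02x?}", cu);
    println!(
        "code-unit reading is valid utf8: {}",
        std::str::from_utf8(&cu).is_ok()
    );
}
```

The lone byte from the second reading fails utf8 validation, which is why that reading sits awkwardly with strings that are forced to be well-formed.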

I remain torn on it. Most responses I've got so far err towards "do what C does", but there's reason to suspect that is a bad move also, since C strings don't enforce utf8-ness.

Further thoughts? Which cases do you find yourself writing utf8 code units in?


-Graydon
