On 12-07-06 9:07 AM, Behdad Esfahbod wrote:
>> So, my inclination is to follow your suggestion and actually go with the C and
>> C++ style. But it's not exactly universal!
>
> Thanks for the survey!  Indeed.  Programming languages is not my strong suit.

There was some further followup here, as well as in the bug:

https://github.com/mozilla/rust/issues/2800

and I wonder if you could take a moment to comment further (indeed, if anyone wants to simply state their pre-existing bias here, it's not a bad topic to just "take a survey" on):

Given the following:

  - our strings _are_ forced to be well-formed utf8 -- that is,
    we won't in any case allow you to write a non-utf8 byte in the
    middle of a string.

  - Sub-0x80 and super-0xff codepoints are unaffected by this
    choice, no matter what we do, as their escapes are unambiguous.

It seems to me we're really only talking about the likelihood of a user wanting to denote, directly, the 0x80 .. 0xff utf8 code units, compared with the likelihood of a user wanting to denote a codepoint in that same range, _as a codepoint_, and have it expand to the appropriate utf8 code units.
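To make the two readings concrete, here is a minimal sketch of what an escape such as "\xe9" could produce under each one. The helper names as_codepoint and as_code_unit are hypothetical and exist only for illustration; they are not part of the language or of any proposal in the bug.

```rust
// Hypothetical helpers, for illustration only: they model the two candidate
// meanings of a two-digit hex escape whose value lies in 0x80 ..= 0xff.

/// Reading 1: the escape names a codepoint, which is then encoded as utf8.
fn as_codepoint(value: u8) -> Vec<u8> {
    let mut buf = [0u8; 4];
    // char::from(u8) maps the value to the codepoint U+0000 ..= U+00FF.
    char::from(value).encode_utf8(&mut buf).as_bytes().to_vec()
}

/// Reading 2: the escape names one raw utf8 code unit (the C-style reading).
fn as_code_unit(value: u8) -> Vec<u8> {
    vec![value]
}

fn main() {
    // As a codepoint, 0xe9 (U+00E9) expands to two utf8 code units.
    assert_eq!(as_codepoint(0xe9), vec![0xc3, 0xa9]);

    // As a code unit, 0xe9 stays a single byte, and on its own that byte
    // is not well-formed utf8.
    let raw = as_code_unit(0xe9);
    assert_eq!(raw, vec![0xe9]);
    assert!(std::str::from_utf8(&raw).is_err());

    // For values up to 0x7f the two readings agree.
    assert_eq!(as_codepoint(0x41), as_code_unit(0x41));
}
```

Under the code-unit reading, a lone escape in this range can never stand by itself in a string that must stay well-formed utf8; it only makes sense as part of a complete multi-unit sequence.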

I remain torn on it. Most responses I've had so far lean towards "do what C does", but there's reason to suspect that is also a bad move, since C strings don't enforce utf8-ness.

Further thoughts? Which cases do you find yourself writing utf8 code units in?

-Graydon
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev
