On 12-07-06 9:07 AM, Behdad Esfahbod wrote:
So, my inclination is to follow your suggestion and actually go with the C and
C++ style. But it's not exactly universal!
Thanks for the survey! Indeed. Programming languages are not my strong suit.
There was some further followup here, as well as in the bug:
https://github.com/mozilla/rust/issues/2800
and I wonder if you could take a moment to comment further (indeed, if
anyone wants to simply state their pre-existing bias here, it's not a
bad topic to just "take a survey" on):
Given the following:
- our strings _are_ forced to be well-formed utf8 -- that is,
we won't in any case allow you to write a non-utf8 byte in the
middle of a string.
- Sub-0x7f and super-0x1000 codepoints are unaffected by this
choice, no matter what we do, as their escapes are unambiguous.
It seems to me we're really only talking about the likelihood of a user
wanting to denote, directly, the 0x80 .. 0xff utf8 code units, compared
with the likelihood of a user wanting to denote a codepoint in that
same range, _as a codepoint_, and have it expand to the appropriate
utf8 code units.
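
To make the two readings concrete, here is a rough sketch in present-day
Rust (not the 2012 syntax under discussion). The helper names
as_codepoint and lone_byte_is_utf8 are made up for illustration and are
not part of any proposal; they only show what each interpretation of a
two-digit hex escape would produce for a given value.

```rust
/// Reading 1 (hypothetical helper): the escape names a codepoint in
/// U+0000..U+00FF, and we return the utf8 code units it expands to.
fn as_codepoint(n: u8) -> Vec<u8> {
    let c = char::from(n); // u8 -> char maps n to the codepoint U+00nn
    let mut buf = [0u8; 4];
    c.encode_utf8(&mut buf).as_bytes().to_vec()
}

/// Reading 2 (hypothetical helper): the escape names a raw code unit.
/// Returns whether that single byte, on its own, is well-formed utf8.
fn lone_byte_is_utf8(n: u8) -> bool {
    std::str::from_utf8(&[n]).is_ok()
}

fn main() {
    for n in [0x41u8, 0x7f, 0x80, 0xe9, 0xff] {
        println!(
            "{:#04x}: as codepoint -> {:02x?}, lone byte well-formed: {}",
            n,
            as_codepoint(n),
            lone_byte_is_utf8(n)
        );
    }
}
```

For values up to 0x7f the two readings agree, since the codepoint
encodes as that same single byte. From 0x80 to 0xff they diverge: the
codepoint reading yields two code units, while the lone byte is not
well-formed utf8 by itself.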
I remain torn on it. Most responses I've got so far lean towards "do what
C does", but there's reason to suspect that is a bad move too, since C
strings don't enforce utf8-ness.
Further thoughts? Which cases do you find yourself writing utf8 code
units in?
-Graydon