On 07/04/2012 02:53 PM, Graydon Hoare wrote:
> On 12-07-04 6:55 AM, Behdad Esfahbod wrote:
>
>> * Here: "\xHH, \uHHHH, \UHHHHHHHH Unicode escapes", I strongly suggest that
>> \xHH be modified to allow inputting direct UTF-8 bytes.  For ASCII it doesn't
>> make any difference.  For Latin1, it gives the impression that strings are
>> stored in Latin1, which is not the case.  It would also make C / Python
>> escaped strings directly usable in Rust.  I.e. '\xE2\x98\xBA' would be a single
>> character equivalent to '\u263a', not three Latin1 characters.
>
> Heh. This is interesting! I hadn't noticed yet but you're not _entirely_
> giving the whole story.
>
>  - \xNN means a utf8 byte: python2, python3 'bytes' literals,
>    perl, go, C, C++, C++-0x u8 literals, and ruby
>
>  - \xNN means a unicode codepoint: python3 'string' literals,
>    javascript, scheme (at least racket follows spec; others
>    get it randomly wrong by implementation), and current rust.
>
>  - \xNN illegal, but the octal version means a unicode codepoint:
>    java.
>
> So, my inclination is to follow your suggestion and actually go with the C and
> C++ style. But it's not exactly universal!
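
For anyone following along, the two readings of \xNN from the survey can be seen side by side in Python 3, which appears on both lists (bytes literals vs. string literals). This is only an illustrative sketch, not a proposal for how Rust should implement it; the variable names are made up for the example.

```python
# Illustration only: Python 3 supports both readings of \xNN.

# Reading 1: \xNN is a raw byte (bytes literal).
# These three bytes are the UTF-8 encoding of U+263A.
as_bytes = b'\xE2\x98\xBA'
decoded = as_bytes.decode('utf-8')

# Reading 2: \xNN is a Unicode code point (str literal).
# Three escapes produce three separate characters: U+00E2, U+0098, U+00BA.
as_codepoints = '\xE2\x98\xBA'

print(len(as_bytes))                          # 3 bytes
print(decoded == '\u263a', len(decoded))      # True 1
print(len(as_codepoints))                     # 3 characters
print([hex(ord(c)) for c in as_codepoints])   # ['0xe2', '0x98', '0xba']
```

Under the second reading the literal looks like Latin1 text, which is the misleading impression described in the original suggestion.
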
Thanks for the survey!  Indeed.  Programming languages are not my strong suit.

Cheers,
behdad
_______________________________________________
Rust-dev mailing list
https://mail.mozilla.org/listinfo/rust-dev
