On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds <a...@ajf.me> wrote:
> Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
>
I'm okay with producing UTF-8 even though our strings are technically
binary. As you state, UTF-8 is the de facto encoding, and recognizing
this is pretty reasonable.

You may want to require that strings containing \u escapes be written
with a u prefix, e.g. u"blah blah". We set aside this format back in
the PHP6 days (note that b"blah" is equivalent to "blah" for binary
strings).

On the BMP versus SMP issue of \uXXXX styles, we addressed this in
PHP6 by making \u denote four-hexit BMP codepoints, while \U denoted
six-hexit codepoints, e.g. "\u1234" === "\U001234". I'd rather follow
this style than make \u special and different from hex and octal
notations by using braces.
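
To make the fixed-width scheme concrete, here is a minimal sketch of how
such escapes could be expanded to UTF-8 bytes. It is written in Python
purely for illustration; it is not the PHP6 implementation, and the
function name expand_unicode_escapes is hypothetical. It assumes \u is
followed by exactly four hex digits and \U by exactly six, and that
surrogates and values above U+10FFFF are rejected.

```python
import re

# Matches \uXXXX (four hex digits) or \UXXXXXX (six hex digits) in a byte string.
_ESCAPE = re.compile(rb"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{6})")


def expand_unicode_escapes(raw: bytes) -> bytes:
    """Replace fixed-width \\u and \\U escapes with their UTF-8 encoding.

    The input is treated as a binary string; bytes outside an escape
    are passed through untouched.
    """

    def _replace(match: "re.Match[bytes]") -> bytes:
        hex_digits = match.group(1) or match.group(2)
        codepoint = int(hex_digits.decode("ascii"), 16)
        # Assumption: surrogates and out-of-range values are errors.
        if codepoint > 0x10FFFF or 0xD800 <= codepoint <= 0xDFFF:
            raise ValueError(f"invalid codepoint U+{codepoint:04X}")
        return chr(codepoint).encode("utf-8")

    return _ESCAPE.sub(_replace, raw)


if __name__ == "__main__":
    # Both spellings name the same BMP codepoint, so the bytes are identical.
    print(expand_unicode_escapes(rb"\u1234") == expand_unicode_escapes(rb"\U001234"))
    # A codepoint outside the BMP needs the six-digit \U form.
    print(expand_unicode_escapes(rb"\U01F602"))
```

Because the widths are fixed, any hex digit after the fourth (for \u) or
sixth (for \U) is ordinary text, so no braces are needed to mark where
the escape ends.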

-Sara

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
