Happy Friday, internals!

Prior to PHP 7, any "invalid" escape sequences within strings (as far as I
can see) were ignored and the characters treated literally. For example:
"\xGG" ("broken" hex sequence) gives "\xGG", "\99" ("broken" octal
sequence) gives "\99", "\m" (not a recognised sequence at all) gives "\m"
and so on.
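
To illustrate, here is a minimal sketch of that legacy behaviour, assuming
the three literals from the examples above. Each double-quoted string is
compared with its single-quoted twin, where no such escapes are interpreted.
The variable names are made up for the example.

```php
<?php
// Each double-quoted literal holds an escape PHP does not recognise,
// so the backslash and the characters after it are kept as written.
$hex     = "\xGG"; // "broken" hex sequence
$octal   = "\99";  // "broken" octal sequence
$unknown = "\m";   // not a recognised sequence at all

// Single-quoted strings only interpret \\ and \', so they give us the
// literal characters to compare against.
var_dump($hex === '\xGG');   // bool(true)
var_dump($octal === '\99');  // bool(true)
var_dump($unknown === '\m'); // bool(true)
```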

PHP 7 introduced a new escape sequence for Unicode codepoints: "\u{...}".
This deliberately breaks away from the pack and raises a Parse Error when
an escape sequence starting with "\u{" is not followed by the required
characters to make it a "valid" escape sequence (i.e. 1 to 6 hex characters
followed by a closing curly brace).
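
A rough sketch of the contrast, assuming a PHP 7 build where parse failures
surface as a catchable ParseError. The malformed literal goes through eval()
only so that the demonstration file itself still parses. The variable names
and the "\u{zzz}" sample are made up for the example.

```php
<?php
// A well-formed sequence is replaced by the UTF-8 encoding of the codepoint.
$valid = "\u{41}";
var_dump($valid === "A");

// "\u" without an opening brace is not a codepoint escape and stays literal.
$noBrace = "\u41";
var_dump($noBrace === '\u41');

// A malformed "\u{" sequence cannot appear in this file without breaking it,
// so the source is handed to eval() from a single-quoted string instead.
$errorClass = null;
try {
    eval('return "\u{zzz}";');
} catch (ParseError $e) {
    $errorClass = get_class($e);
    echo $errorClass, ": ", $e->getMessage(), "\n";
}
```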

Why does \u{} behave differently from any other escape sequence? Because
the author prefers it that way, and indeed thinks all "invalid" escape
sequences should result in the same error. [pers. comm.]

The question I'd like to bring forward is: can we either:

a) change all other "invalid" escape sequences to be a parse error [that
would mean "\m" would raise a parse error!]

b) change \u{} to behave like any other escape sequence, by not raising a
parse error and instead keeping the literal characters

or c) tell me to keep quiet and accept the oddball behaviour, having quirks
is The PHP Way after all.

Either way, I'd like to see some resolution to this sooner rather than
later as we're very late in the PHP 7.0.0 game.

Cheers, and enjoy your weekends,

Peter
