On 17.09.2017 at 12:53, Rowan Collins wrote:

> I checked the PHP lang-spec repo expecting to find a set of Unicode classes, 
> but it currently mentions "U+0080-U+00FF": 
> https://github.com/php/php-langspec/blob/master/spec/09-lexical-structure.md#names
>  That seems wrong to me, unless I'm looking at the wrong definition - the 
> first part of that range is control characters, and you can have variables 
> called things like $๐Ÿ˜ (with an emoji as the entire name).

The specification in the PHP manual[1] appears to describe our current
implementation more accurately:

| As a regular expression, it would be expressed thus:
| '[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*'

With regard to control characters: that depends on the chosen character
encoding; for instance in Windows-1252 the ¢ character is mapped to \xA2.
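
To illustrate that the pattern works on bytes rather than on code points,
here is a rough sketch. It is written in Python only to make the byte
handling explicit; it is not how the engine implements it. The helper name
is_valid_name and the sample code point U+1F600 are my own assumptions,
chosen purely for illustration.

```python
import re

# Byte-level pattern as quoted from the PHP manual above.
NAME = re.compile(rb'[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*')


def is_valid_name(raw: bytes) -> bool:
    """Return True if the whole byte string matches the manual's pattern."""
    return NAME.fullmatch(raw) is not None


# U+1F600 is an arbitrary emoji; its UTF-8 bytes are F0 9F 98 80,
# all of which fall inside \x7f-\xff.
emoji_utf8 = "\U0001F600".encode("utf-8")

# The cent sign is the single byte A2 in Windows-1252 and C2 A2 in UTF-8.
cent_cp1252 = "\u00a2".encode("cp1252")
cent_utf8 = "\u00a2".encode("utf-8")

print(is_valid_name(emoji_utf8))   # True
print(is_valid_name(cent_cp1252))  # True
print(is_valid_name(cent_utf8))    # True
print(is_valid_name(b"1abc"))      # False: a leading digit is not allowed

# The same byte is a printable character in one encoding and a C1 control
# in another: 0x85 is an ellipsis in Windows-1252 but U+0085 in ISO-8859-1.
print(repr(b"\x85".decode("cp1252")))
print(repr(b"\x85".decode("latin-1")))
```

Since the pattern never looks at the encoding, whether a given byte in the
\x80-\x9f range counts as a control character is a property of how the
source file is interpreted, not of the pattern itself.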

[1] <http://php.net/manual/en/language.variables.basics.php>

-- 
Christoph M. Becker

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
