On Tue, Apr 1, 2014 at 7:44 AM, Chris Angelico <ros...@gmail.com> wrote:
> On Wed, Apr 2, 2014 at 12:33 AM, Ned Batchelder <n...@nedbatchelder.com> wrote:
>> Maybe I'm misunderstanding the discussion... It seems like we're talking
>> about a hypothetical definition of identifiers based on Unicode character
>> categories, but there's no need: Python 3 has defined precisely that.  From
>> the docs
>> (https://docs.python.org/3/reference/lexical_analysis.html#identifiers):
>>
>
> "Python 3.0 introduces **additional characters** from outside the
> ASCII range" - emphasis mine.
>
> Python currently has - at least, per that documentation - a hybrid
> system with ASCII characters defined in the classic way, and non-ASCII
> characters defined by their Unicode character classes. I'm talking
> about a system that's _purely_ defined by Unicode character classes.
> It may turn out that the class list exactly encompasses the ASCII
> characters listed, though, in which case you'd be right: it's not
> hypothetical.

The only ASCII exception is _: it is explicitly permitted to start an
identifier (for obvious reasons), whereas characters in category Pc
are otherwise permitted only to continue identifiers.
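
A quick way to check this is to compare the classic ASCII rule with
the category-based rule over the 128 ASCII code points. This is only a
sketch: the category sets are the ones the lexical analysis docs list
for id_start and id_continue, the names below are mine, and it ignores
the extra PropList.txt characters and the NFKC normalization that the
real definition involves.

```python
import unicodedata

# Categories the docs list for starting / continuing an identifier.
START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
CONTINUE_CATEGORIES = START_CATEGORIES | {"Mn", "Mc", "Nd", "Pc"}

# The classic ASCII definition: [A-Za-z_][A-Za-z0-9_]*
ASCII_LETTERS = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
CLASSIC_START = ASCII_LETTERS | {"_"}
CLASSIC_CONTINUE = CLASSIC_START | set("0123456789")


def ascii_by_category(categories):
    """ASCII characters whose Unicode general category is in `categories`."""
    return {chr(cp) for cp in range(128)
            if unicodedata.category(chr(cp)) in categories}


# Characters on which the two definitions disagree.
start_diff = CLASSIC_START ^ ascii_by_category(START_CATEGORIES)
continue_diff = CLASSIC_CONTINUE ^ ascii_by_category(CONTINUE_CATEGORIES)

print("start mismatch:   ", sorted(start_diff))
print("continue mismatch:", sorted(continue_diff))

# U+203F UNDERTIE is another Pc character, used here for comparison with _.
undertie = "\u203f"
print(unicodedata.category("_"), unicodedata.category(undertie))
print("_".isidentifier(), undertie.isidentifier(), ("a" + undertie).isidentifier())
```

If the claim above holds, the start mismatch is just _ and the
continue mismatch is empty.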

There are also explicit lists of extra permitted characters in
PropList.txt (the Other_ID_Start and Other_ID_Continue properties),
kept for backward compatibility: once a character is permitted, it
should remain permitted even if its Unicode category changes.  There
are currently 4 extra starting characters and 12 extra continuing
characters, but none of these are ASCII.
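
For reference, here is a small sketch that lists those extra
characters. The code points are hard-coded from my reading of the
Other_ID_Start and Other_ID_Continue entries in PropList.txt as of this
writing, so treat them as an assumption and check them against the
file for your Unicode version. The variable names are mine.

```python
import unicodedata

# Assumption: entries copied by hand from PropList.txt; later Unicode
# versions may list more.
OTHER_ID_START = [0x2118, 0x212E, 0x309B, 0x309C]
OTHER_ID_CONTINUE = [0x00B7, 0x0387, *range(0x1369, 0x1372), 0x19DA]

for label, code_points in (("start", OTHER_ID_START),
                           ("continue", OTHER_ID_CONTINUE)):
    print(f"extra {label} characters: {len(code_points)}")
    for cp in code_points:
        ch = chr(cp)
        # The general category shows why each one needs an explicit entry.
        print(f"  U+{cp:04X} {unicodedata.category(ch)} "
              f"{unicodedata.name(ch, '<unnamed>')}")

# None of the extra characters fall in the ASCII range.
all_non_ascii = all(cp > 0x7F for cp in OTHER_ID_START + OTHER_ID_CONTINUE)
print("all non-ASCII:", all_non_ascii)

# Spot checks against the interpreter's own notion of an identifier.
print(chr(0x2118).isidentifier())          # SCRIPT CAPITAL P as a start
print(("a" + chr(0x00B7)).isidentifier())  # MIDDLE DOT as a continuation
```

Note that the interpreter works from the XID_Start and XID_Continue
properties, which take NFKC normalization into account, so not every
code point in these lists necessarily passes str.isidentifier().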
-- 
https://mail.python.org/mailman/listinfo/python-list