Compatibility variants can look different, but they can also look identical.
Allowing any non-ASCII characters was worrisome because of the security
implications of confusables. Squashing compatibility characters seemed the
more conservative choice at the time. Stestagg's example:
е = lambda е, e: е if е > e else e
shows that the choice wasn't perfect, but adding more invisible differences
does carry risks. Those go beyond the backwards incompatibility, and beyond the
problem of (hopefully rare, but are we sure?) editors that don't distinguish
between such characters in the way a programming language would prefer.
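
A minimal sketch of the two behaviours, assuming CPython 3 and its standard
unicodedata module. The helper nfkc and the variable names are mine, purely
for illustration, and the ligature is just one compatibility character I
picked as a sample:

```python
import unicodedata


def nfkc(text):
    """Return text in Unicode normalization form NFKC."""
    return unicodedata.normalize("NFKC", text)


CYRILLIC_IE = "\u0435"  # CYRILLIC SMALL LETTER IE, renders like Latin "e"
LATIN_E = "e"           # LATIN SMALL LETTER E
FI_LIGATURE = "\ufb01"  # LATIN SMALL LIGATURE FI, a compatibility character

# Compatibility characters are squashed by NFKC ...
ligature_folds = nfkc(FI_LIGATURE) == "fi"
# ... but cross-script confusables are left untouched.
confusable_folds = nfkc(CYRILLIC_IE) == LATIN_E

# The same holds for identifiers, which the parser normalizes to NFKC.
namespace = {}
exec(FI_LIGATURE + "le = 1", namespace)  # assigns to the name "file"
ligature_name_folded = "file" in namespace

# The lambda from the example above, rebuilt with explicit escapes so the
# two look-alike parameters are visibly different code points.
source = "lambda {0}, e: {0} if {0} > e else e".format(CYRILLIC_IE)
pick_larger = eval(source)
param_names = pick_larger.__code__.co_varnames

print("ligature folds to 'fi':", ligature_folds)
print("Cyrillic ie folds to Latin e:", confusable_folds)
print("identifier with ligature became 'file':", ligature_name_folded)
print("lambda parameters:", [ascii(name) for name in param_names])
```

The lambda still compiles because its two parameters remain distinct names
after normalization, which is exactly the gap the example points at.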
I think (but won't swear) that there were also several problematic characters
that really should have been treated as (at most) glyph variants, but ...
weren't. If I recall correctly, the largest number were Arabic presentation
forms, but there were also a few characters that were in Unicode only to
support round-trip conversion with a legacy charset, even if that charset had
been declared buggy. In at least a few of these cases, it seemed likely that a
beginning user would expect them to be equivalent.
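
A rough illustration, again assuming Python's unicodedata module. The sample
characters are my own picks and not necessarily the ones discussed back then:
one Arabic presentation form, and the Kelvin sign as a character that
normalizes to an ordinary letter:

```python
import unicodedata

# Hypothetical sample set, chosen only for illustration.
SAMPLES = [
    "\ufe8d",  # ARABIC LETTER ALEF ISOLATED FORM (a presentation form)
    "\u212a",  # KELVIN SIGN
]


def describe(ch):
    """Return (original name, NFKC result) for a single character."""
    folded = unicodedata.normalize("NFKC", ch)
    return unicodedata.name(ch), folded


results = [describe(ch) for ch in SAMPLES]
for name, folded in results:
    # ascii() keeps the output readable in terminals without Arabic fonts.
    print(name, "->", ascii(folded))
```

Under NFKC both samples collapse to the plain letter a beginner would
probably expect, so identifiers written with either form end up as the same
name.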
-jJ