On Sun, Jun 1, 2014 at 7:06 PM, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:
> On Sun, 01 Jun 2014 18:31:09 +1000, Chris Angelico wrote:
>
>> the better solution is to permit the full Unicode alphabet in
>> identifiers...
>
> I'm not entirely sure about that. Full Unicode support in identifiers
> such as URLs doesn't create a brand new vulnerability, but it does
> increase it from a fairly minor problem to something *much* harder to
> deal with. It's bad enough when somebody manages to fool you into going
> to (say) app1e.com instead of apple.com, without also being at risk from
> аррlе, аpрlе, арplе and аррle (to mention just a few). At least nobody
> can fake .com with .соm.
>
> To put it another way:
>
> py> аррlе = 23
> py> apple = 42
> py> assert аррlе == apple
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AssertionError

Yeah, that is a concern. But as you say, it's already possible to
confuse rn with m (in many fonts) and i/l/1, and (on a different
level) Foo, foo, _foo, _Foo, and FOO, or movement_Direction and
movement_direction. If you saw one of those in one part of a program
and another in another, you'd have to consume an annoying amount of
mindspace to keep them separate.

Note, incidentally, that I said "alphabet" rather than the entire
Unicode character set. I do *not* support the use of, for instance,
U+200B 'ZERO WIDTH SPACE' in identifiers, that's just stupid :)

ChrisA
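
For anyone who wants to poke at this, here is a rough sketch of both points under Python 3. The helpers scripts_used() and looks_mixed() are hypothetical names made up for illustration, and taking the first word of a character's Unicode name is only a crude stand-in for a real script lookup, so treat it as a heuristic rather than a proper confusables check.

```python
import unicodedata


def scripts_used(identifier):
    """Return the first word of each letter's Unicode name, as a set.

    Hypothetical helper: this approximates "which script is this letter
    from", it is not a lookup of the real Unicode Script property.
    """
    found = set()
    for ch in identifier:
        if ch.isalpha():
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


def looks_mixed(identifier):
    """True if the identifier draws letters from more than one script."""
    return len(scripts_used(identifier)) > 1


# Cyrillic a, er, er, then Latin l, then Cyrillic ie: renders much like "apple".
fake = "\u0430\u0440\u0440l\u0435"
real = "apple"

print(fake == real)          # False: different code points
print(looks_mixed(fake))     # True: CYRILLIC and LATIN letters together
print(looks_mixed(real))     # False: LATIN only

# Both names are legal identifiers, but ZERO WIDTH SPACE is not a letter,
# so a name containing it is rejected.
print(fake.isidentifier())            # True
print("foo\u200bbar".isidentifier())  # False
```

A check along these lines could sit in a linter: it would flag the mixed-script lookalikes while leaving identifiers written entirely in one script alone. It would not catch a name spelled wholly in Cyrillic that happens to resemble a Latin word.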