[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)

Christopher Barker Sun, 14 Nov 2021 11:12:13 -0800

On Sun, Nov 14, 2021 at 10:27 AM MRAB <pyt...@mrabarnett.plus.com> wrote:


> > So why does Python apply NFKC normalization to variable names??



> It's probably to deal with "é" vs "é", i.e. "\N{LATIN SMALL LETTER
> E}\N{COMBINING ACUTE ACCENT}" vs "\N{LATIN SMALL LETTER E WITH ACUTE}",
> which are different ways of writing the same thing.
>

Sure, but this is code, written by humans (or by metaprogramming). Maybe I'm
showing my English bias, but would it be that limiting to have identifiers
be based on code points, period?

Why does someone who wants to use, e.g., "é" in an identifier have to be
able to represent it two different ways in a code file?
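
For reference, the two spellings of "é" mentioned above can be compared with
the standard unicodedata module. This is only a minimal sketch, assuming
CPython 3 and the identifier handling described in PEP 3131; the names
composed, decomposed and namespace are made up for illustration.

```python
import unicodedata

# The two spellings of "é" from the quoted message.
composed = "\N{LATIN SMALL LETTER E WITH ACUTE}"
decomposed = "\N{LATIN SMALL LETTER E}\N{COMBINING ACUTE ACCENT}"

# As plain strings they are different code point sequences.
print(composed == decomposed)          # False
print(len(composed), len(decomposed))  # 1 2

# NFKC folds the decomposed spelling into the composed one.
print(unicodedata.normalize("NFKC", decomposed) == composed)  # True

# Identifiers are normalized when source is parsed, so an assignment
# written with the decomposed spelling is found under the composed one.
namespace = {}
exec(f"caf{decomposed} = 1", namespace)
print(f"caf{composed}" in namespace)   # True
```

With identifiers compared purely by code point, the last lookup would
presumably fail, since the two spellings would be two distinct names.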

But if so ...


> Unfortunately, it goes too far, because it's unlikely that we want "ᵖ"
> ("\N{MODIFIER LETTER SMALL P}") to be equivalent to "p" ("\N{LATIN
> SMALL LETTER P}").
>

Is it possible to only capture things like the combining characters and not
the "equivalent" ones like the above?
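
To make the distinction concrete: Unicode defines NFC (canonical composition
only) alongside NFKC (which also applies compatibility mappings). The sketch
below compares the two on the examples from this thread. It only illustrates
what the standard unicodedata module does with these strings; it is not a
claim about what the language should do, and the variable names are
hypothetical.

```python
import unicodedata

decomposed_e = "\N{LATIN SMALL LETTER E}\N{COMBINING ACUTE ACCENT}"
modifier_p = "\N{MODIFIER LETTER SMALL P}"

def names(text):
    """Return the Unicode character names of each code point in text."""
    return [unicodedata.name(ch) for ch in text]

for form in ("NFC", "NFKC"):
    e_out = unicodedata.normalize(form, decomposed_e)
    p_out = unicodedata.normalize(form, modifier_p)
    print(form, names(e_out), names(p_out))

# NFC composes the combining accent but leaves the modifier letter alone;
# NFKC additionally folds the modifier letter to a plain Latin letter.

# Current behaviour for identifiers (NFKC): the modifier letter, used as
# a variable name, ends up stored under the plain Latin letter.
namespace = {}
exec(f"{modifier_p} = 1", namespace)
print("p" in namespace)  # True
```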

-CHB

-- 
Christopher Barker, PhD (Chris)

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython

