[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs

Inada Naoki Tue, 02 Feb 2021 03:47:49 -0800

On Tue, Feb 2, 2021 at 7:37 PM M.-A. Lemburg <m...@egenix.com> wrote:
>
> >> That would keep extensions working after a recompile, since
> >> Py_UNICODE is already a typedef to wchar_t.
> >>
> >
> > That idea is written in the PEP already.
> > https://www.python.org/dev/peps/pep-0624/#replace-py-unicode-with-wchar-t
>
> Right and I think this is a more workable approach than removing
> APIs.
>
> BTW: I don't understand this comment:
> "They are inefficient on platforms wchar_t* is UTF-16. It is because
> built-in codecs supports only UCS-1, UCS-2, and UCS-4 input."
>
> Windows is one such platform. Java (indirectly) is another. They both
> store UTF-16LE in those arrays and Python's codecs handle this just
> fine.
>


I'm sorry that the section is not clear.

For example, if wchar_t* is UCS-4, ucs4_utf8_encoder() can encode
wchar_t* into UTF-8 directly.

But when wchar_t* is UTF-16, ucs2_utf8_encoder() cannot handle
surrogate pairs, so we need to create a temporary Unicode object first.
That is what "inefficient" means.
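
To illustrate the difference, here is a rough sketch in pure Python. It is
not CPython's actual implementation; all function names below are
hypothetical and exist only for this example. It shows that encoding each
16-bit unit on its own (as a UCS-2-only encoder would) gives the wrong bytes
for a non-BMP character, while joining the surrogate pairs into a temporary
str first gives valid UTF-8.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def encode_units_independently(units):
    """Encode every 16-bit unit by itself, ignoring surrogate pairs.

    This mimics an encoder that only understands UCS-2 input.
    """
    out = bytearray()
    for u in units:
        if u < 0x80:
            out.append(u)
        elif u < 0x800:
            out += bytes([0xC0 | (u >> 6), 0x80 | (u & 0x3F)])
        else:
            out += bytes([0xE0 | (u >> 12),
                          0x80 | ((u >> 6) & 0x3F),
                          0x80 | (u & 0x3F)])
    return bytes(out)


def combine_surrogates(units):
    """Join high/low surrogate pairs into full code points."""
    points = []
    i = 0
    while i < len(units):
        u = units[i]
        if (0xD800 <= u <= 0xDBFF and i + 1 < len(units)
                and 0xDC00 <= units[i + 1] <= 0xDFFF):
            low = units[i + 1]
            points.append(0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            points.append(u)
            i += 1
    return points


def encode_via_temporary(units):
    """Build a temporary str from the units, then encode it as UTF-8."""
    temporary = "".join(chr(p) for p in combine_surrogates(units))
    return temporary.encode("utf-8")


units = utf16_units("a\U0001F600")
print(encode_units_independently(units))  # three bytes per surrogate: not valid UTF-8
print(encode_via_temporary(units))        # same as "a\U0001F600".encode("utf-8")
```

The extra pass and the temporary object are the cost I meant. Unpaired
surrogates are not handled in this sketch; str.encode("utf-8") would raise
UnicodeEncodeError for them.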

I will update the section to explain this in more detail.

Regards,
-- 
Inada Naoki  <songofaca...@gmail.com>
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QUGBVLQNBFVNX25AEIL77WSFOHQES6LJ/
Code of Conduct: http://python.org/psf/codeofconduct/