On Tue, Feb 2, 2021 at 12:43 AM M.-A. Lemburg <m...@egenix.com> wrote:
>
> Hi Inada-san,
>
> thank you for adding some comments, but they are not really capturing
> what I think is missing:
>
> """
> Removing these APIs removes the ability to use a codec without a temporary
> Unicode object.
>
>     Codecs cannot encode a Unicode buffer directly without a temporary Unicode
> object since Python 3.3. All these APIs create a temporary Unicode object for
> now, so removing them doesn't reduce any abilities.
> """
>
> The point is that while the decoders allow going from a C object
> to a Python object directly, we are missing a way to do the same
> for the encoders, since the Python 3.3 change in the Unicode internals.
>
> At the very least, we should have such APIs for going from wchar_t*
> to a Python object.

We already have PyUnicode_FromWideChar(). So I assume you mean
"wchar_t* to Python bytes object".

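For reference, here is a minimal sketch of the two-step path that exists
today: wchar_t* to a temporary str via PyUnicode_FromWideChar(), then str to
bytes via PyUnicode_AsUTF8String(). It calls the real C functions through
ctypes only so that it can be run from the interpreter; a C extension would
call them directly. The helper name wchar_to_utf8_bytes is hypothetical.

```python
import ctypes

api = ctypes.pythonapi

# PyObject *PyUnicode_FromWideChar(const wchar_t *w, Py_ssize_t size)
api.PyUnicode_FromWideChar.argtypes = [ctypes.c_wchar_p, ctypes.c_ssize_t]
api.PyUnicode_FromWideChar.restype = ctypes.py_object

# PyObject *PyUnicode_AsUTF8String(PyObject *unicode)
api.PyUnicode_AsUTF8String.argtypes = [ctypes.py_object]
api.PyUnicode_AsUTF8String.restype = ctypes.py_object


def wchar_to_utf8_bytes(text):
    """Hypothetical helper: wchar_t* -> temporary str -> UTF-8 bytes."""
    # ctypes converts `text` to a NUL-terminated wchar_t* for the call;
    # size -1 asks PyUnicode_FromWideChar to find the length with wcslen().
    temporary = api.PyUnicode_FromWideChar(text, -1)
    # The encoder only ever sees the temporary Unicode object.
    return api.PyUnicode_AsUTF8String(temporary)
```

The temporary object in the middle is the intermediate step discussed in
this thread.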
>
> The alternatives you provide all require creating an intermediate
> Python object for this purpose. The APIs you want to remove do that
> as well, but that's not the point. The point is to expose the codecs'
> encode mechanism, which is available in the C code but currently
> not exposed via C APIs, e.g. ucs4lib_utf8_encode().
>
> It would be a breaking change, but those APIs in your list could
> simply be changed from using Py_UNICODE to using wchar_t instead
> and then interface directly to the internal functions we have for
> the encoders.
>

OK, I see codecs.h has three encoders.

* utf8_encode
* utf16_encode
* utf32_encode

But there are 13 encoders in my PEP:

PyUnicode_Encode()
PyUnicode_EncodeASCII()
PyUnicode_EncodeLatin1()
PyUnicode_EncodeUTF7()
PyUnicode_EncodeUTF8()
PyUnicode_EncodeUTF16()
PyUnicode_EncodeUTF32()
PyUnicode_EncodeUnicodeEscape()
PyUnicode_EncodeRawUnicodeEscape()
PyUnicode_EncodeCharmap()
PyUnicode_TranslateCharmap()
PyUnicode_EncodeDecimal()
PyUnicode_TransformDecimalToASCII()

Do you want to keep all of these encoders, or only the three above?
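
To make the scope of the list concrete, here is a rough sketch of the
Python-level codec names that correspond to the encoders with a direct
str.encode() counterpart. The mapping is my own summary, not text from the
PEP, and the names CODEC_NAMES and encode_all are hypothetical. The generic
PyUnicode_Encode() takes the codec name as an argument, and the charmap and
decimal functions are left out because they have no fixed codec name.

```python
# Assumed correspondence between the C encoders and stdlib codec names.
CODEC_NAMES = {
    "PyUnicode_EncodeASCII": "ascii",
    "PyUnicode_EncodeLatin1": "latin-1",
    "PyUnicode_EncodeUTF7": "utf-7",
    "PyUnicode_EncodeUTF8": "utf-8",
    "PyUnicode_EncodeUTF16": "utf-16",
    "PyUnicode_EncodeUTF32": "utf-32",
    "PyUnicode_EncodeUnicodeEscape": "unicode_escape",
    "PyUnicode_EncodeRawUnicodeEscape": "raw_unicode_escape",
}


def encode_all(text):
    """Encode `text` with every codec in CODEC_NAMES, keyed by C API name."""
    return {name: text.encode(codec) for name, codec in CODEC_NAMES.items()}
```

Each of these starts from a str object, which is the same starting point the
alternatives in the PEP require.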


> That would keep extensions working after a recompile, since
> Py_UNICODE is already a typedef to wchar_t.
>

That idea is already described in the PEP:
https://www.python.org/dev/peps/pep-0624/#replace-py-unicode-with-wchar-t

Regards,
-- 
Inada Naoki  <songofaca...@gmail.com>