On Thu, Jul 2, 2020 at 5:20 AM M.-A. Lemburg <m...@egenix.com> wrote:
>
>
> The reasoning here is the same as for decoding: you have the original
> data you want to process available in some array and want to turn
> this into the Python object.
>
> The path Victor suggested requires always going via a Python Unicode
> object, but that is very expensive and not really an appropriate
> way to address the use case.
>

But the current PyUnicode_Encode* APIs call `PyUnicode_FromWideChar`
internally, so they are already not a direct API.

Additionally, pyodbc, the only user of the encoder API, called
PyUnicode_EncodeUTF16(PyUnicode_AsUnicode(unicode), ...).
That is very inefficient: Unicode object -> Py_UNICODE* -> Unicode
object -> bytes object.
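
For comparison, the direct path encodes the existing Unicode object and
skips the Py_UNICODE* round trip. Below is a minimal sketch that calls
the C API through ctypes.pythonapi, assuming CPython. The codec name
"utf-16-le" and the sample string are illustrative choices only, not
necessarily what pyodbc needs.

```python
import ctypes

api = ctypes.pythonapi

# PyObject *PyUnicode_AsEncodedString(PyObject *unicode,
#                                     const char *encoding,
#                                     const char *errors)
api.PyUnicode_AsEncodedString.argtypes = [ctypes.py_object,
                                          ctypes.c_char_p,
                                          ctypes.c_char_p]
api.PyUnicode_AsEncodedString.restype = ctypes.py_object

text = "h\u00e9llo"  # illustrative sample data

# Unicode object -> bytes object, with no intermediate Py_UNICODE* buffer.
encoded = api.PyUnicode_AsEncodedString(text, b"utf-16-le", b"strict")
print(encoded)
```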

And as many others have already said, most C code uses UTF-8 to
represent Unicode, not wchar_t.

So I don't want to undeprecate the current API.


> As an example application, think of a database module which provides
> the Unicode data as Py_UNICODE buffer.

Py_UNICODE is deprecated.  So I assume you are talking about wchar_t.


> You want to write this as UTF-8
> data to a file or a socket, so you have the PyUnicode_EncodeUTF8() API
> encode this for you into a bytes object which you can then write out
> using the Python C APIs for this.

PyUnicode_FromWideChar + PyUnicode_AsUTF8AndSize is better than
PyUnicode_EncodeUTF8.

PyUnicode_EncodeUTF8 allocates a temporary Unicode object anyway, so it
needs to allocate both a Unicode object *and* a char* buffer for the UTF-8.
On the other hand, PyUnicode_AsUTF8AndSize can just expose the internal
data when the string is plain ASCII. Since ASCII strings are very common,
this is an effective optimization.
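
The two-step path can be tried from Python through ctypes.pythonapi,
which calls the same C functions. This is a minimal sketch, assuming
CPython 3.3 or later. The helper name wide_to_utf8 is hypothetical, and
the address-range check relies on the current compact string layout,
which is an implementation detail rather than a documented guarantee.

```python
import ctypes
import sys

api = ctypes.pythonapi

# PyObject *PyUnicode_FromWideChar(const wchar_t *wstr, Py_ssize_t size)
api.PyUnicode_FromWideChar.argtypes = [ctypes.c_wchar_p, ctypes.c_ssize_t]
api.PyUnicode_FromWideChar.restype = ctypes.py_object

# const char *PyUnicode_AsUTF8AndSize(PyObject *unicode, Py_ssize_t *size)
api.PyUnicode_AsUTF8AndSize.argtypes = [ctypes.py_object,
                                        ctypes.POINTER(ctypes.c_ssize_t)]
api.PyUnicode_AsUTF8AndSize.restype = ctypes.c_void_p


def wide_to_utf8(text):
    """Hypothetical helper: wchar_t buffer -> str object -> UTF-8 bytes.

    Returns the encoded bytes and whether the UTF-8 pointer lies inside
    the str object itself (assumption: true only when no separate
    UTF-8 buffer had to be allocated).
    """
    wbuf = ctypes.create_unicode_buffer(text)  # stands in for a C wchar_t array
    obj = api.PyUnicode_FromWideChar(wbuf, len(wbuf) - 1)

    # Record the object's address range before any UTF-8 cache is attached.
    start = id(obj)
    end = start + sys.getsizeof(obj)

    size = ctypes.c_ssize_t()
    ptr = api.PyUnicode_AsUTF8AndSize(obj, ctypes.byref(size))

    # The pointer is owned by obj, so copy the bytes while obj is alive.
    data = ctypes.string_at(ptr, size.value)
    return data, start <= ptr < end


ascii_data, ascii_inside = wide_to_utf8("hello")
other_data, other_inside = wide_to_utf8("h\u00e9llo")
print(ascii_data, ascii_inside)
print(other_data, other_inside)
```

In C the same two calls apply, and the pointer returned by
PyUnicode_AsUTF8AndSize stays valid only while the Unicode object is alive.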

Regards,
-- 
Inada Naoki  <songofaca...@gmail.com>
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UYOPQDKLSNOVPFGPCR5BIW3GHYB3V3KZ/
Code of Conduct: http://python.org/psf/codeofconduct/
