On Thu, Jul 2, 2020 at 5:20 AM M.-A. Lemburg <m...@egenix.com> wrote:
>
> The reasoning here is the same as for decoding: you have the original
> data you want to process available in some array and want to turn
> this into the Python object.
>
> The path Victor suggested requires always going via a Python Unicode
> object, but that is very expensive and not really an appropriate
> way to address the use case.
>

But the current PyUnicode_Encode* APIs call `PyUnicode_FromWideChar`
internally, so they are already not a direct API.
Additionally, pyodbc, the only user of the encoder API, did
PyUnicode_EncodeUTF16(PyUnicode_AsUnicode(unicode), ...).
That is very inefficient:
Unicode object -> Py_UNICODE* -> Unicode object -> bytes object.

And as many others have already said, most of the C world uses UTF-8
for Unicode representation in C, not wchar_t.

So I don't want to undeprecate the current API.

> As an example application, think of a database module which provides
> the Unicode data as Py_UNICODE buffer.

Py_UNICODE is deprecated. So I assume you are talking about wchar_t.

> You want to write this as UTF-8
> data to a file or a socket, so you have the PyUnicode_EncodeUTF8() API
> decode this for you into a bytes object which you can then write out
> using the Python C APIs for this.

PyUnicode_FromWideChar + PyUnicode_AsUTF8AndSize is better than
PyUnicode_EncodeUTF8. PyUnicode_EncodeUTF8 allocates a temporary
Unicode object anyway, so it needs to allocate a Unicode object *and*
a char* buffer for the UTF-8 data.

On the other hand, PyUnicode_AsUTF8AndSize can just expose the internal
data when the string is plain ASCII. Since ASCII strings are very
common, this is an effective optimization.

Regards,
-- 
Inada Naoki <songofaca...@gmail.com>
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/UYOPQDKLSNOVPFGPCR5BIW3GHYB3V3KZ/
Code of Conduct: http://python.org/psf/codeofconduct/
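
For illustration, here is a minimal sketch of the PyUnicode_FromWideChar + PyUnicode_AsUTF8AndSize path described in the reply above. It is not code from pyodbc or CPython. It calls the two C API functions through `ctypes.pythonapi` so that it can be run without compiling an extension, which assumes CPython. The names `wide_to_utf8`, `wide_buf` and `length` are hypothetical, and the final copy via `ctypes.string_at` exists only to hand the result back to Python; C code could presumably use the returned pointer directly for as long as the Unicode object is alive.

```python
import ctypes

api = ctypes.pythonapi  # CPython only: handle to the running interpreter's C API

# PyObject *PyUnicode_FromWideChar(const wchar_t *w, Py_ssize_t size)
api.PyUnicode_FromWideChar.argtypes = [ctypes.c_wchar_p, ctypes.c_ssize_t]
api.PyUnicode_FromWideChar.restype = ctypes.py_object

# const char *PyUnicode_AsUTF8AndSize(PyObject *unicode, Py_ssize_t *size)
api.PyUnicode_AsUTF8AndSize.argtypes = [
    ctypes.py_object,
    ctypes.POINTER(ctypes.c_ssize_t),
]
# c_void_p (not c_char_p) so the result is not cut off at an embedded NUL.
api.PyUnicode_AsUTF8AndSize.restype = ctypes.c_void_p


def wide_to_utf8(wide_buf, length):
    """Encode `length` wchar_t units from `wide_buf` as UTF-8 bytes.

    Hypothetical helper, for illustration only.
    """
    # Step 1: wchar_t buffer -> Unicode object.
    text = api.PyUnicode_FromWideChar(wide_buf, length)

    # Step 2: Unicode object -> pointer to UTF-8 data owned by `text`.
    size = ctypes.c_ssize_t()
    ptr = api.PyUnicode_AsUTF8AndSize(text, ctypes.byref(size))

    # Copy the bytes out while `text` (the owner of the buffer) is alive.
    return ctypes.string_at(ptr, size.value)


# Stand-in for a wchar_t buffer handed over by a database driver.
ascii_buf = ctypes.create_unicode_buffer("abc")
mixed_buf = ctypes.create_unicode_buffer("h\u00e9llo")

# len(buf) counts the terminating NUL, so subtract one.
ascii_utf8 = wide_to_utf8(ascii_buf, len(ascii_buf) - 1)
mixed_utf8 = wide_to_utf8(mixed_buf, len(mixed_buf) - 1)
```

In this sketch `ascii_buf` corresponds to the plain-ASCII case mentioned above, where PyUnicode_AsUTF8AndSize can expose the object's internal data; `mixed_buf` exercises the non-ASCII case.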