[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs

Inada Naoki Mon, 03 Aug 2020 20:18:05 -0700

Hi, Lemburg.

Thank you for organizing EuroPython 2020.
I enjoyed watching some sessions from home.


I think the current PEP 624 covers all your points and is ready for Steering
Council discussion.
Would you like to review the PEP before then?

Regards,


On Thu, Jul 9, 2020 at 8:19 AM Inada Naoki <[email protected]> wrote:
>
> On Thu, Jul 9, 2020 at 5:46 AM M.-A. Lemburg <[email protected]> wrote:
> > - the fact that the encode APIs encode from a Unicode buffer
> >   to a bytes object; this is an important fact, since the removal
> >   removes access to this codec functionality for extensions
> >
> > - PyUnicode_AsEncodedString() is not a proper alternative, since
> >   it requires creating a temporary PyUnicode object, which is
> >   inefficient and wastes memory
>
> I covered your points in the "Alternative Idea > Replace Py_UNICODE*
> with Py_UCS4*" section, where I wrote: "User can encode UCS-4 string
> in C without creating Unicode object."
>
> https://www.python.org/dev/peps/pep-0624/#replace-py-unicode-with-py-ucs4
>
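
For illustration, here is a rough Python sketch (my own, not from the PEP) of
what encoding a UCS-4 buffer directly to UTF-8, without an intermediate
Unicode object, amounts to. The helper name encode_ucs4_to_utf8 is
hypothetical; a real Py_UCS4* API would of course be written in C.

```python
def encode_ucs4_to_utf8(code_points):
    """Encode an iterable of UCS-4 code points to UTF-8 bytes.

    Hypothetical helper for illustration only; it works on plain integers
    and never builds a str object.
    """
    out = bytearray()
    for cp in code_points:
        if cp < 0:
            raise ValueError("negative code point")
        if cp < 0x80:
            # 1-byte sequence (ASCII)
            out.append(cp)
        elif cp < 0x800:
            # 2-byte sequence
            out.append(0xC0 | (cp >> 6))
            out.append(0x80 | (cp & 0x3F))
        elif cp < 0x10000:
            if 0xD800 <= cp <= 0xDFFF:
                raise ValueError("surrogate U+%04X cannot be encoded" % cp)
            # 3-byte sequence
            out.append(0xE0 | (cp >> 12))
            out.append(0x80 | ((cp >> 6) & 0x3F))
            out.append(0x80 | (cp & 0x3F))
        elif cp <= 0x10FFFF:
            # 4-byte sequence
            out.append(0xF0 | (cp >> 18))
            out.append(0x80 | ((cp >> 12) & 0x3F))
            out.append(0x80 | ((cp >> 6) & 0x3F))
            out.append(0x80 | (cp & 0x3F))
        else:
            raise ValueError("code point out of range")
    return bytes(out)


# The result should match what the builtin codec produces from a str.
sample = [0x61, 0xE9, 0x20AC, 0x1F600]
print(encode_ucs4_to_utf8(sample) == "a\u00e9\u20ac\U0001F600".encode("utf-8"))
```

Keeping a path like this alive for every codec is the maintenance cost
discussed further down.
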
> Note that the current Py_UNICODE* encoder APIs create temporary
> PyUnicode objects.
> They are already inefficient and waste memory. Py_UNICODE* may be UTF-16 on
> some platforms (e.g. Windows) and the builtin codecs don't support UTF-16 input.
>
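
To illustrate the UTF-16 point, here is a small Python sketch (my own; the
helper names utf16_units and combine_surrogates are hypothetical). A character
outside the BMP occupies two 16-bit units in a UTF-16 buffer, so the buffer
has to be converted to code points before a codec that expects one unit per
character can consume it.

```python
import struct


def utf16_units(text):
    """Return the 16-bit code units a UTF-16 buffer would hold for text."""
    raw = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


def combine_surrogates(units):
    """Convert UTF-16 code units to code points (hypothetical helper)."""
    code_points = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # A surrogate pair encodes one code point above U+FFFF.
            low = units[i + 1]
            code_points.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            # BMP character (a lone surrogate is passed through unchanged).
            code_points.append(unit)
            i += 1
    return code_points


units = utf16_units("a\U0001F600")
print([hex(u) for u in units])                      # ['0x61', '0xd83d', '0xde00']
print([hex(c) for c in combine_surrogates(units)])  # ['0x61', '0x1f600']
```
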
>
> >
> > - the maintenance effect mentioned in the PEP does not really
> >   materialize, since the underlying functionality still exists
> >   in the codecs - only access to the functionality is removed
> >
>
> In the same section, I described the maintenance cost as below.
>
> * Other Python implementations may not have a builtin codec for UCS-4.
> * If we change the Unicode internal representation to UTF-8, we would need
>   to keep UCS-4 support only for these APIs.
>
> > - keeping just the generic PyUnicode_Encode() API would be a
> >   compromise
> >
> > - if we remove the codec specific PyUnicode_Encode*() APIs, why
> >   are we still keeping the special PyUnicode_Decode*() APIs ?
> >
>
> OK, I will add a "Discussions" section. (I don't like "FAQ" because some
> questions are important even if they are not "frequently" asked.)
>
> The quick answer is:
>
> * They are part of the stable ABI. (Py_UNICODE is excluded from the stable ABI.)
> * Decoding from char* is a more common and generic use case than encoding from
>   Py_UNICODE*.
> * Other Python implementations using UTF-8 as their internal representation
>   can implement them easily.
>
> But I'm not opposed to removing them (especially for the minor UTF-7 codec).
> It is just out of scope for this PEP.
>
>
> > - the deprecations were just done because the Py_UNICODE data
> >   type was replaced by a hybrid type. Using this as an argument
> >   for removing functionality is not really good practice, when
> >   there are ways to continue exposing the functionality using other
> >   data types.
>
> I hope the "Replace Py_UNICODE* with Py_UCS4*" section describes this.
>
> Regards,
>
> --
> Inada Naoki  <[email protected]>



-- 
Inada Naoki  <[email protected]>
