Hi, Lemburg.

Thank you for organizing EuroPython 2020. I enjoyed watching some sessions from home.
I think the current PEP 624 covers all your points and is ready for
Steering Council discussion. Would you like to review the PEP before that?

Regards,

On Thu, Jul 9, 2020 at 8:19 AM Inada Naoki <songofaca...@gmail.com> wrote:
>
> On Thu, Jul 9, 2020 at 5:46 AM M.-A. Lemburg <m...@egenix.com> wrote:
> > - the fact that the encode APIs encode from a Unicode buffer
> >   to a bytes object; this is an important fact, since the removal
> >   removes access to this codec functionality for extensions
> >
> > - PyUnicode_AsEncodedString() is not a proper alternative, since
> >   it requires creating a temporary PyUnicode object, which is
> >   inefficient and wastes memory
>
> I covered your points in the "Alternative Idea > Replace Py_UNICODE*
> with Py_UCS4*" section. In it I wrote "User can encode UCS-4 string in C
> without creating Unicode object."
>
> https://www.python.org/dev/peps/pep-0624/#replace-py-unicode-with-py-ucs4
>
> Note that the current Py_UNICODE* encoder APIs already create temporary
> PyUnicode objects, so they are inefficient and waste memory today.
> Py_UNICODE* may be UTF-16 on some platforms (e.g. Windows), and the
> builtin codecs don't support UTF-16 input.
>
> > - the maintenance effect mentioned in the PEP does not really
> >   materialize, since the underlying functionality still exists
> >   in the codecs - only access to the functionality is removed
>
> In the same section, I described the maintenance cost as follows.
>
> * Other Python implementations may not have a builtin codec for UCS-4.
> * If we change the Unicode internal representation to UTF-8, we need
>   to keep UCS-4 support only for these APIs.
>
> > - keeping just the generic PyUnicode_Encode() API would be a
> >   compromise
> >
> > - if we remove the codec specific PyUnicode_Encode*() APIs, why
> >   are we still keeping the special PyUnicode_Decode*() APIs ?
>
> OK, I will add a "Discussions" section. (I don't like "FAQ" because some
> questions are important even if they are not "frequently" asked.)
>
> The quick answer is:
>
> * They are part of the stable ABI. (Py_UNICODE is excluded from the
>   stable ABI.)
> * Decoding from char* is a more common and generic use case than
>   encoding from Py_UNICODE*.
> * Other Python implementations using UTF-8 as the internal
>   representation can implement them easily.
>
> But I'm not opposed to removing them (especially the minor UTF-7 codec).
> It is just out of scope for this PEP.
>
> > - the deprecations were just done because the Py_UNICODE data
> >   type was replaced by a hybrid type. Using this as an argument
> >   for removing functionality is not really good practice, when
> >   there are ways to continue exposing the functionality using other
> >   data types.
>
> I hope the "Replace Py_UNICODE* with Py_UCS4*" section describes this.
>
> Regards,
>
> --
> Inada Naoki <songofaca...@gmail.com>

--
Inada Naoki <songofaca...@gmail.com>
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/LXS6SXGX3HADR2GHWWC3C4Q3UGN4M2CR/
Code of Conduct: http://python.org/psf/codeofconduct/