UCS-2 means units of 16 bits, so it's limited to the Unicode BMP: U+0000-U+FFFF.
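
To make the BMP limit concrete, here is a minimal Python sketch. The helper names (`fits_in_ucs2`, `utf16_units`, `utf32_units`) are made up for this illustration and are not part of any API discussed in this thread; the sketch only counts how many fixed-size units a single code point needs.

```python
def fits_in_ucs2(ch: str) -> bool:
    """True if the code point fits in one 16-bit unit (i.e. lies in the BMP)."""
    return ord(ch) <= 0xFFFF


def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed to encode one code point."""
    # "utf-16-le" is used so that no byte order mark is prepended.
    return len(ch.encode("utf-16-le")) // 2


def utf32_units(ch: str) -> int:
    """Number of 32-bit code units needed to encode one code point."""
    # "utf-32-le" is used so that no byte order mark is prepended.
    return len(ch.encode("utf-32-le")) // 4


# Sample code points: U+0041, U+20AC (both in the BMP) and U+1F600 (outside it).
for ch in ("A", "\u20ac", "\U0001F600"):
    print(
        f"U+{ord(ch):04X}: fits in UCS-2: {fits_in_ucs2(ch)}, "
        f"16-bit units: {utf16_units(ch)}, 32-bit units: {utf32_units(ch)}"
    )
```

A code point above U+FFFF, such as the emoji U+1F600, does not fit in a single 16-bit unit, whereas one 32-bit unit is always enough.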
UCS-4 means units of 32 bits and so gives access to the whole (current) Unicode character set.

Do you mean UTF-16 and UTF-32? UTF-16 supports the whole Unicode character set but uses the annoying surrogate pairs for characters outside the BMP. UTF-32 is UCS-4 in practice.

Victor

On Thu, 2 Jul 2020 at 15:08, Barry Scott <[email protected]> wrote:
>
> On 30 Jun 2020, at 13:43, Emily Bowman <[email protected]> wrote:
>
> I completely agree with this, that UTF-8 has become the One True
> Encoding(tm), and UCS-2 and UTF-16 are hardly found anywhere outside of the
> Win32 API. Nearly all basic emoji can't be represented in UCS-2 wchar_t, let
> alone composite emoji.
>
>
> I use UCS-32 in my extensions, but never persist UCS-32 for which I use UTF-8.
>
> If you are calling WIN32 "unicode" APIs then you need UCS-16.
>
> My plan with PyCXX is to replace Py_UNICODE with UCS-32.
> I think all the UCS-32 APIs will still be present.
>
> Once I add that support to PyCXX all my users should easily port to a
> non-Py_UNICODE world.
>
> Barry
>
> _______________________________________________
> Python-Dev mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/[email protected]/message/YIKT5XGPZIMEIAPBJS3OQAZTWW4JM3Z2/
> Code of Conduct: http://python.org/psf/codeofconduct/

--
Night gathers, and now my watch begins. It shall not end until my death.
_______________________________________________
Message archived at https://mail.python.org/archives/list/[email protected]/message/K5MKE6EDM7HKAGFXQ4EYWKACDX6OCFFH/
