On 7/15/2018 5:28 PM, Marko Rauhamaa wrote:

> if your new system used Python3's UTF-32 strings as a foundation,

Since 3.3, Python's strings are not (always) UTF-32 strings. Nor are they always UCS-2 (or partly UTF-16) strings. Nor are they always Latin-1 or ASCII strings. Python's Flexible String Representation (PEP 393) uses the narrowest possible internal code for any particular string: 1, 2, or 4 bytes per code point, chosen by the widest character in that string. This is all transparent to the user except for memory size.
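
To make the memory-size point concrete, here is a minimal sketch that estimates the per-character storage of a few strings. It assumes CPython 3.3 or later, where sys.getsizeof reflects the internal representation; the helper name bytes_per_char is made up for this illustration, and other implementations may report different numbers.

```python
import sys

def bytes_per_char(ch, n=100):
    """Estimate internal storage per code point (hypothetical helper).

    Comparing two strings of different lengths cancels out the fixed
    object header, leaving only the per-character cost.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

samples = {
    "ASCII": "a",
    "Latin-1": "\u00e9",        # e with acute accent
    "BMP": "\u20ac",            # euro sign
    "astral": "\U0001F600",     # emoji outside the BMP
}

for label, ch in samples.items():
    # len() is 1 in every case; only the storage width differs.
    print(label, len(ch), bytes_per_char(ch))
```

On a CPython build with the flexible representation, the last column should come out as 1, 1, 2 and 4, while len() stays 1 throughout.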

In 3.2 and before, Python's Unicode strings were either wide (UTF-32) or narrow (UCS-2 plus surrogates, or UTF-16 minus full compliance), depending on how the interpreter was built. The difference was sometimes not transparent, and code that worked on one build could fail on the other. Since 3.3, string code should work the same on any machine running the same Python version.

> UTF-32, after all, is a variable-width encoding.

Nope.  It is a fixed-width (32 bits, 4 bytes per code point) encoding.
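
A quick way to check this is to encode individual code points and count the bytes. This is only a sketch; the function name utf32_widths is made up here, and "utf-32-le" is used instead of plain "utf-32" so that no byte-order mark is added to each result.

```python
def utf32_widths(text):
    """Return the number of bytes each code point occupies in UTF-32."""
    return [len(ch.encode("utf-32-le")) for ch in text]

def utf8_widths(text):
    """Same measurement for UTF-8, which really is variable-width."""
    return [len(ch.encode("utf-8")) for ch in text]

# ASCII letter, Latin-1 letter, euro sign, emoji outside the BMP
sample = "a\u00e9\u20ac\U0001F600"

print(utf32_widths(sample))  # every entry is 4
print(utf8_widths(sample))   # entries grow from 1 to 4
```

Every code point takes 4 bytes in UTF-32, whereas the same four code points take 1, 2, 3 and 4 bytes in UTF-8.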

Perhaps you should ask more questions before pontificating.

--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list
