On Friday, 15 September 2006 at 10:48 -0700, Josiah Carlson wrote:
> This is one of the reasons why I was talking Latin-1, UCS-2, and UCS-4:

You could replace "latin-1" with "one-byte system encoding chosen at interpreter startup depending on locale". There are lots of 8-bit encodings other than iso-8859-1 (for example, my current locale uses iso-8859-15).

The algorithm for choosing the one-byte encoding could be:
- if the current locale uses a one-byte encoding, use that encoding
- otherwise, if the current locale's language has a popular one-byte encoding (for many languages this would mean iso-8859-<X>), use that encoding
- otherwise, no one-byte encoding

This would ensure that, for example, Russian text on a system configured with a Russian locale does not always end up using two bytes per character internally.

Regards

Antoine.

_______________________________________________
Python-3000 mailing list
http://mail.python.org/mailman/listinfo/python-3000
Unsubscribe: http://mail.python.org/mailman/options/python-3000/archive%40mail-archive.com
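
The three-step selection proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not interpreter code: the function name `choose_one_byte_encoding`, the set `KNOWN_ONE_BYTE`, and the table `POPULAR_BY_LANGUAGE` are hypothetical, and both tables are deliberately incomplete. The only library call relied on is `codecs.lookup`, used to normalize encoding names.

```python
import codecs
from typing import Optional


def _normalize(encoding: Optional[str]) -> Optional[str]:
    """Return the canonical codec name, or None if the codec is unknown."""
    if not encoding:
        return None
    try:
        return codecs.lookup(encoding).name
    except LookupError:
        return None


# Hypothetical, incomplete set of encodings treated as one-byte.
KNOWN_ONE_BYTE = {
    _normalize(name)
    for name in ("iso-8859-1", "iso-8859-5", "iso-8859-7",
                 "iso-8859-15", "koi8-r", "cp1251", "cp1252")
}

# Hypothetical, incomplete table: language code -> a popular one-byte
# encoding. Languages with no suitable one-byte encoding (for example
# "ja" or "zh") are simply absent.
POPULAR_BY_LANGUAGE = {
    "fr": "iso-8859-15",
    "de": "iso-8859-15",
    "el": "iso-8859-7",
    "ru": "koi8-r",
}


def choose_one_byte_encoding(locale_encoding: Optional[str],
                             language: Optional[str]) -> Optional[str]:
    """Pick a one-byte encoding following the three rules in the message."""
    # Rule 1: the locale itself already uses a one-byte encoding.
    name = _normalize(locale_encoding)
    if name in KNOWN_ONE_BYTE:
        return name
    # Rule 2: the locale's language has a popular one-byte encoding.
    fallback = POPULAR_BY_LANGUAGE.get(language or "")
    if fallback is not None:
        return _normalize(fallback)
    # Rule 3: no one-byte encoding is available.
    return None
```

On a real system the two arguments might come from `locale.getpreferredencoding(False)` and the language part of `locale.getlocale()[0]`; they are passed in explicitly here so that the sketch does not depend on the machine's configuration. With a Russian language code and a UTF-8 locale, rule 2 applies and a Cyrillic one-byte encoding is chosen, which is the case the message is concerned with.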
