On Friday 15 September 2006 at 10:48 -0700, Josiah Carlson wrote:
> This is one of the reasons why I was talking Latin-1, UCS-2, and UCS-4:

You could replace "latin-1" with "one-byte system encoding chosen at
interpreter startup depending on locale".
There are lots of 8-bit encodings other than iso-8859-1.
(for example, my current locale uses iso-8859-15)

The algorithm for choosing the one-byte encoding could be:
- if the current locale uses a one-byte encoding, use that encoding
- otherwise, if the current locale's language has a popular one-byte
encoding (for many languages this would mean iso-8859-<X>), use that
encoding
- otherwise, use no one-byte encoding
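
The three steps above could be sketched roughly as follows. This is only
an illustration: the POPULAR_ONE_BYTE table, the ONE_BYTE_PREFIXES
heuristic and both function names are hypothetical, and the
language-to-encoding entries are assumptions rather than an
authoritative list.

```python
import codecs
from typing import Optional

# Hypothetical table: language code -> a commonly used one-byte encoding.
# Illustrative assumptions only, not an authoritative mapping.
POPULAR_ONE_BYTE = {
    "fr": "iso-8859-15",
    "de": "iso-8859-15",
    "el": "iso-8859-7",
    "ru": "koi8-r",
}

# Normalized codec-name prefixes treated as one-byte families here.
# This is a heuristic assumption, not a property reported by the codecs.
ONE_BYTE_PREFIXES = ("iso8859", "koi8", "cp125")


def is_one_byte(encoding: str) -> bool:
    """Heuristic check: does the codec belong to a known one-byte family?"""
    try:
        name = codecs.lookup(encoding).name
    except LookupError:
        return False
    return name.startswith(ONE_BYTE_PREFIXES)


def choose_one_byte_encoding(locale_encoding: Optional[str],
                             language: Optional[str]) -> Optional[str]:
    """Return a normalized one-byte codec name, or None if there is none."""
    # Step 1: the locale itself already uses a one-byte encoding.
    if locale_encoding and is_one_byte(locale_encoding):
        return codecs.lookup(locale_encoding).name
    # Step 2: the locale's language has a popular one-byte encoding.
    popular = POPULAR_ONE_BYTE.get((language or "").lower())
    if popular:
        return codecs.lookup(popular).name
    # Step 3: no one-byte encoding.
    return None
```

At interpreter startup the two arguments could plausibly come from
locale.getpreferredencoding(False) and the language part of
locale.getlocale()[0], but that wiring is left out of the sketch.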

This would ensure that, for example, Russian text on a system configured
with a Russian locale does not always end up using two bytes per
character internally.
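
To make the size difference concrete, here is a small check using
Python's standard codecs. It assumes koi8-r as the one-byte Russian
encoding and utf-16-le as a stand-in for a two-byte (UCS-2 style)
internal representation; fits_latin1 is a hypothetical helper.

```python
text = "Привет"  # six Cyrillic characters

one_byte = text.encode("koi8-r")     # one byte per character
two_byte = text.encode("utf-16-le")  # two bytes per character, no BOM


def fits_latin1(s: str) -> bool:
    """Return True if every character of s is representable in Latin-1."""
    try:
        s.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


print(len(one_byte), len(two_byte), fits_latin1(text))
```

With Latin-1 as the only one-byte option, this text cannot use the
compact form at all, which is the case the locale-dependent choice is
meant to cover.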

Regards

Antoine.


_______________________________________________
Python-3000 mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-3000