On 19/08/2012 19:11, wxjmfa...@gmail.com wrote:
On Sunday, 19 August 2012 19:48:06 UTC+2, Paul Rubin wrote:
But they are not ascii pages, they are (as stated) MOSTLY ascii.
E.g. the characters are 99% ascii but 1% non-ascii, so PEP 393 chooses
a much more memory-expensive encoding than UTF-8.
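For anyone who wants to check the memory claim rather than argue about it, here is a rough sketch. It assumes CPython 3.3 or later (where PEP 393 is implemented); the helper name storage_report and the 10,000-character test strings are made up for illustration, and sys.getsizeof includes the object header, so treat the numbers as approximate.

```python
import sys

def storage_report(text):
    """Return (code points, size of the str object in bytes, size of its UTF-8 encoding).

    Hypothetical helper for illustration; sys.getsizeof is CPython-specific
    and counts the object header as well as the character data.
    """
    return len(text), sys.getsizeof(text), len(text.encode("utf-8"))

# A "page" that is pure ASCII, and the same page with one euro sign (U+20AC).
ascii_page = "a" * 10_000
mostly_ascii = "a" * 9_999 + "\u20ac"

for label, text in (("pure ascii", ascii_page), ("one euro sign", mostly_ascii)):
    chars, obj_bytes, utf8_bytes = storage_report(text)
    print(f"{label}: {chars} chars, str object {obj_bytes} bytes, "
          f"UTF-8 {utf8_bytes} bytes")
```

On CPython the second string should come out at roughly twice the size of the first, because a single character outside Latin-1 moves the whole string to two bytes per character, while its UTF-8 encoding grows by only two bytes.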
Imagine a US banking application, everything in ascii,
except ... the € currency symbol, code point 0x20ac.
Well, it seems some software producers know what they
are doing.
>>> '€'.encode('cp1252')
b'\x80'
>>> '€'.encode('mac-roman')
b'\xdb'
>>> '€'.encode('iso-8859-1')
Traceback (most recent call last):
  File "<eta last command>", line 1, in <module>
UnicodeEncodeError: 'latin-1' codec can't encode character '\u20ac' in position 0: ordinal not in range(256)
jmf
Well that's it then, the world stock markets will all collapse tonight
when the news leaks out that those stupid Americans haven't yet realised
that much of Europe (with at least one very noticeable and sensible
exception :) uses Euros. I'd better sell all my stock holdings fast.
--
Cheers.
Mark Lawrence.