On 19/08/2012 19:11, wxjmfa...@gmail.com wrote:
On Sunday, 19 August 2012 19:48:06 UTC+2, Paul Rubin wrote:


But they are not ascii pages, they are (as stated) MOSTLY ascii.
E.g. the characters are 99% ascii but 1% non-ascii, so PEP 393 chooses
a much more memory-expensive encoding than UTF-8.
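
A rough way to check the memory claim above, as a sketch only: it assumes CPython 3.3 or later (where PEP 393 is implemented) and relies on sys.getsizeof, which is implementation-specific. The helper name bytes_per_char and the 99-to-1 sample string are made up for illustration and are not from the thread.

```python
import sys

def bytes_per_char(sample):
    """Estimate storage per code point by measuring how much the object
    grows when the string is made longer (this cancels the fixed header)."""
    short = sample * 100
    long_ = sample * 200
    return (sys.getsizeof(long_) - sys.getsizeof(short)) / (len(long_) - len(short))

# Hypothetical "mostly ascii" text: 99 ASCII characters and one euro sign.
mostly_ascii = "a" * 99 + "\u20ac"

print("pure ascii  :", bytes_per_char("a"), "bytes per char")
print("mostly ascii:", bytes_per_char(mostly_ascii), "bytes per char")
print("same text as UTF-8:", len(mostly_ascii.encode("utf-8")), "bytes for",
      len(mostly_ascii), "chars")
```

A single character above U+00FF is enough to move the whole string to the two-byte representation, whereas UTF-8 pays extra only for that one character.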



Imagine a US banking application, everything in ascii,
except ... the € currency symbol, code point 0x20ac.

Well, it seems some software producers know what they
are doing.

>>> '€'.encode('cp1252')
b'\x80'
>>> '€'.encode('mac-roman')
b'\xdb'
>>> '€'.encode('iso-8859-1')
Traceback (most recent call last):
   File "<eta last command>", line 1, in <module>
UnicodeEncodeError: 'latin-1' codec can't encode character '\u20ac' in position 0: ordinal not in range(256)
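
The same comparison can be run as a small script rather than typed at the prompt. This is only a sketch: the helper encode_euro is a made-up name, and the iso-8859-15 and utf-8 entries are additions for contrast that do not appear in the session above.

```python
EURO = "\u20ac"  # the euro sign, code point 0x20ac

def encode_euro(codec):
    """Return the euro sign encoded with `codec`, or None if the codec
    has no representation for it."""
    try:
        return EURO.encode(codec)
    except UnicodeEncodeError:
        return None

for codec in ("cp1252", "mac-roman", "iso-8859-1", "iso-8859-15", "utf-8"):
    result = encode_euro(codec)
    print(codec.ljust(12), "cannot encode" if result is None else result)
```

The single-byte code pages that carry the euro sign store it in one byte, iso-8859-1 cannot represent it at all, and UTF-8 needs three bytes for it.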

jmf


Well that's it then, the world stock markets will all collapse tonight when the news leaks out that those stupid Americans haven't yet realised that much of Europe (with at least one very noticeable and sensible exception :) uses Euros. I'd better sell all my stock holdings fast.

--
Cheers.

Mark Lawrence.

--
http://mail.python.org/mailman/listinfo/python-list
