On 11 Jan, 01:56, Terry Reedy <tjre...@udel.edu> wrote:
> On 1/10/2012 8:43 AM, jmfauth wrote:
>
> ...
>
> mbcs encodes according to the current codepage. Only the Chinese
> codepage(s) can encode the Chinese char. So the unicode error is correct
> and 2.7 has a bug in that it is doing "errors='replace'" when it
> supposedly is doing "errors='strict'". The Py3 fix was done in
> http://bugs.python.org/issue850997
> 2.7 was intentionally left alone because of back-compatibility
> considerations. (None of this addresses the OP's question.)

win7, cp1252

Ok. I was not aware of this.

>>> '\N{CYRILLIC SMALL LETTER A}'.encode('mbcs')
Traceback (most recent call last):
  File "<eta last command>", line 1, in <module>
UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character
>>> '\N{GREEK SMALL LETTER ALPHA}'.encode('mbcs')
Traceback (most recent call last):
  File "<eta last command>", line 1, in <module>
UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character

jmf
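
For anyone not on Windows: 'mbcs' only exists there, so here is a rough sketch of the strict vs. replace difference using 'cp1252' as a stand-in for the active code page. That substitution is my assumption (the real mbcs codec goes through the Windows conversion API and may differ in details), and try_encode is just a helper name made up for this example.

```python
# 'cp1252' is a portable stand-in for 'mbcs' (assumption: the active
# Windows ANSI code page is cp1252, as in the session above).
CODEC = "cp1252"


def try_encode(text, errors):
    """Return the encoded bytes, or the exception class name on failure."""
    try:
        return text.encode(CODEC, errors=errors)
    except UnicodeEncodeError as exc:
        return type(exc).__name__


cyrillic_a = "\N{CYRILLIC SMALL LETTER A}"
e_acute = "\N{LATIN SMALL LETTER E WITH ACUTE}"

# Strict mode refuses a character the code page cannot represent.
strict_result = try_encode(cyrillic_a, "strict")

# Replace mode silently substitutes '?', the lossy behaviour
# described above for 2.7.
replace_result = try_encode(cyrillic_a, "replace")

# A character that cp1252 does contain encodes fine in strict mode.
latin_result = try_encode(e_acute, "strict")

print(strict_result)
print(replace_result)
print(latin_result)
```

If you want the lossy result on purpose, passing errors='replace' explicitly makes that choice visible in the code.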