On 11 Jan, 01:56, Terry Reedy <tjre...@udel.edu> wrote:
> On 1/10/2012 8:43 AM, jmfauth wrote:
>
> > D:\>c:\python32\python.exe
> > Python 3.2.2 (default, Sep 4 2011, 09:51:08) [MSC v.1500 32 bit (Intel)] on win32
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> '\u5de5'.encode('utf-8')
> > b'\xe5\xb7\xa5'
> > >>> '\u5de5'.encode('mbcs')
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> > UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character
>
> > D:\>c:\python27\python.exe
> > Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> u'\u5de5'.encode('utf-8')
> > '\xe5\xb7\xa5'
> > >>> u'\u5de5'.encode('mbcs')
> > '?'
>
> mbcs encodes according to the current codepage. Only the Chinese
> codepage(s) can encode the Chinese char. So the unicode error is correct
> and 2.7 has a bug in that it is doing "errors='replace'" when it
> supposedly is doing "errors='strict'". The Py3 fix was done in
> http://bugs.python.org/issue850997
> 2.7 was intentionally left alone because of back-compatibility
> considerations. (None of this addresses the OP's question.)

Ok. I was not aware of this.

PS: My previous post got lost.
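
For anyone who wants to see the strict/replace difference described in the quoted text without a Windows box: the sketch below is only an approximation. 'mbcs' exists only on Windows and follows the current code page, so here 'cp1252' stands in for a non-Chinese code page and 'gbk' for a Chinese one. Both choices are my assumptions, not what the sessions above ran.

```python
# Sketch only: 'cp1252' and 'gbk' are stand-ins for whatever 'mbcs' would
# map to on a Western and a Chinese Windows install (an assumption).
ch = '\u5de5'

# Default error handling is errors='strict': an unencodable character raises.
try:
    ch.encode('cp1252')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# errors='replace' substitutes '?' instead of raising, which matches the
# '?' that the 2.7 session printed.
replaced = ch.encode('cp1252', errors='replace')

# A Chinese code page can represent the character, so it round-trips.
round_trip = ch.encode('gbk').decode('gbk')

print(strict_failed, replaced, round_trip == ch)
```

So a caller who really wants the 2.7-style result on Python 3 can ask for it explicitly with errors='replace', at the cost of losing the character.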