On Apr 3, 5:56 pm, Peter Otten <[EMAIL PROTECTED]> wrote:
> WaterWalk wrote:
> > Hello. I just found that on Windows, when an exception is raised and
> > traceback info is printed on STDERR, all the characters printed are
> > just plain ASCII. Take the unicode character u'\u4e00' for example. If
> > I write:
> >
> > print u'\u4e00'
> >
> > If the system locale is "PRC China", then this statement will print
> > this character as a single Chinese character.
> >
> > But if I write: assert u'\u4e00' == 1
> >
> > An AssertionError will be raised and traceback info will be put to
> > STDERR, while this time, u'\u4e00' will simply be printed just as
> > u'\u4e00', several ASCII characters instead of one single Chinese
> > character. I use the coding directive comment (# -*- coding: utf-8 -*-)
> > on the first line of the Python source file and also save it in utf-8
> > format, but the problem remains.
> >
> > What's worse, if I directly write Chinese characters in a unicode
> > string, when the traceback info is printed, they'll appear in a
> > non-readable way, that is, they show themselves as something else. It's
> > like printing some DBCS characters when the locale is incorrect.
> >
> > I think this problem isn't unique. When using some other East-Asia
> > characters, the same problem may recur.
> >
> > Is there any workaround for it?
>
> Pass a byte string but make some effort to use the right encoding:
>
> >>> import sys
> >>> assert False, u"\u4e00".encode(sys.stdout.encoding or "ascii",
> ...     "xmlcharrefreplace")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AssertionError: 一
>
> You might be able to do this in the except hook:
>
> $ cat unicode_exception_message.py
> import sys
>
> def eh(etype, exc, tb, original_excepthook=sys.excepthook):
>     message = exc.args[0]
>     if isinstance(message, unicode):
>         exc.args = (message.encode(sys.stderr.encoding or "ascii",
>             "xmlcharrefreplace"),) + exc.args[1:]
>     return original_excepthook(etype, exc, tb)
>
> sys.excepthook = eh
>
> assert False, u"\u4e00"
>
> $ python unicode_exception_message.py
> Traceback (most recent call last):
>   File "unicode_exception_message.py", line 11, in <module>
>     assert False, u"\u4e00"
> AssertionError: 一
>
> If Python cannot figure out the encoding this falls back to ascii with
> xml charrefs:
>
> $ python unicode_exception_message.py 2>tmp.txt
> $ cat tmp.txt
> Traceback (most recent call last):
>   File "unicode_exception_message.py", line 11, in <module>
>     assert False, u"\u4e00"
> AssertionError: &#19968;
>
> Note that I've not done any tests; e.g. if there are exceptions with
> immutable .args the except hook itself will fail.
>
> Peter

Thanks. My brief test indicates that it works. I'll try it in more situations.
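
Regarding the caveat about exceptions with missing or immutable .args:
below is a minimal sketch of a more defensive variant of Peter's hook.
It is only an assumption on my part about how the failure cases could
be guarded, not something Peter posted. The helper name printable() is
made up for illustration, and the Python 3 branch (decoding back to
text) is my own addition so the same file runs on both versions.

```python
import sys


def printable(message, encoding=None):
    """Return message in a form the given encoding can represent.

    Characters the encoding lacks become XML character references.
    printable() is a hypothetical helper, not part of Peter's script.
    """
    encoding = encoding or "ascii"
    data = message.encode(encoding, "xmlcharrefreplace")
    # Python 2: keep the byte string, as Peter's hook does.
    # Python 3 (assumption): decode again so the message stays text.
    return data if str is bytes else data.decode(encoding)


def eh(etype, exc, tb, original_excepthook=sys.excepthook):
    try:
        args = exc.args
        # type(u"") is unicode on Python 2 and str on Python 3.
        if args and isinstance(args[0], type(u"")):
            encoding = getattr(sys.stderr, "encoding", None)
            exc.args = (printable(args[0], encoding),) + tuple(args[1:])
    except Exception:
        # Leave the exception untouched if .args cannot be read,
        # encoded, or replaced; the traceback is still printed.
        pass
    return original_excepthook(etype, exc, tb)
```

It is installed the same way as in Peter's script, with
sys.excepthook = eh. The try/except means a problem inside the hook
falls through to the original traceback output instead of raising a
second error.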
-- 
http://mail.python.org/mailman/listinfo/python-list