New submission from Ezio Melotti <ezio.melo...@gmail.com>:

In Py2.x
    >>> u'\u2620'
outputs u'\u2620', whereas
    >>> print u'\u2620'
raises an error.
Instead, in Py3.x, both
    >>> '\u2620'
and
    >>> print('\u2620')
raise an error if the terminal doesn't use an encoding able to display the character (e.g. the Windows terminal used for these examples).

This is caused by the new string representation defined in PEP 3138[1].

Consider also the following example:

Py2:
    >>> [u'\u2620']
    [u'\u2620']

Py3:
    >>> ['\u2620']
    UnicodeEncodeError: 'charmap' codec can't encode character '\u2620' in position 9: character maps to <undefined>

This means that there is no way to print lists (or other objects) that contain characters that can't be encoded. Two workarounds may be:
1) encode all the elements of the list, but it's not practical;
2) use ascii(), but it adds extra "" around the output and escapes backslashes and apostrophes (and it won't be possible to use _[0] in the next line).

Also note that in Py3
    >>> ['\ud800']
    ['\ud800']
    >>> _[0]
    '\ud800'
works, because U+D800 belongs to the category "Cs (Other, Surrogate)" and it is escaped[2].

The best solution is probably to change the default error handler of the Python 3 interactive interpreter to 'backslashreplace' in order to avoid this behavior, but I don't know if it's possible to do so only for ">>> foo" and not for ">>> print(foo)" (print() should still raise an error as it does in Py2).

This proposal has already been rejected in PEP 3138[3], but there are no links to the discussion that led to this decision.

I think this should be rediscussed and possibly changed because, even if I can't see the "listOfJapaneseStrings"[4], I would still rather see a sequence of escaped chars than a UnicodeEncodeError.
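As a rough illustration of the proposal above, the following sketch shows one way the ">>> foo" echo could be escaped while print() keeps raising. It relies on sys.displayhook, which the interactive interpreter calls for expression results. The name escaping_displayhook and its optional stream parameter are hypothetical and are not taken from the PEP or from any patch; the approach is an assumption about how the idea might be tried out, not a description of how the interpreter is implemented.

```python
import builtins
import sys

SKULL = '\u2620'  # the character used in the examples above


def escaping_displayhook(value, stream=None):
    """Echo repr(value) like the default hook, but escape unencodable chars.

    Hypothetical helper: `stream` exists only so the hook can be tried
    against a stream with a known encoding; the interpreter itself calls
    sys.displayhook with a single argument.
    """
    if value is None:
        return
    builtins._ = value  # keep `_` usable on the next line, as the default hook does
    if stream is None:
        stream = sys.stdout
    encoding = getattr(stream, 'encoding', None) or 'ascii'
    text = repr(value)
    # Round-trip through the stream's encoding so that characters it cannot
    # represent become \uXXXX escapes instead of raising UnicodeEncodeError.
    safe = text.encode(encoding, 'backslashreplace').decode(encoding)
    stream.write(safe + '\n')


# To try it in an interactive session:
#     >>> sys.displayhook = escaping_displayhook
# Only the echo of expression results is affected; print() does not go
# through sys.displayhook, so it still uses the stream's own error handler.
```

With this hook installed on a terminal whose encoding lacks U+2620, entering ['\u2620'] should echo the escaped form and leave _[0] available, while print('\u2620') would still fail as in Py2. Unlike workaround 2), no extra quotes are added, because the escaping is applied to the repr itself rather than to the repr of an ascii() string.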
[1]: http://www.python.org/dev/peps/pep-3138/
[2]: http://www.python.org/dev/peps/pep-3138/#specification
[3]: http://www.python.org/dev/peps/pep-3138/#rejected-proposals
[4]: http://www.python.org/dev/peps/pep-3138/#motivation

----------
components: Unicode
messages: 80820
nosy: ezio.melotti
severity: normal
status: open
title: Printing Unicode chars from the interpreter in a non-UTF8 terminal (Py3)
type: behavior
versions: Python 3.0

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue5110>
_______________________________________