[issue5876] __repr__ returning unicode doesn't work when called implicitly
Changes by Berker Peksag berker.pek...@gmail.com:

--
resolution: fixed -> wont fix
stage: patch review -> resolved

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5876
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5876] __repr__ returning unicode doesn't work when called implicitly
Armin Rigo added the comment:

@Serhiy: it's a behavior change and as such not an option for a micro release. For example, the following legal code would behave differently: it would compute s = '\\u1234' instead of s = 'UTF8:\xe1\x88\xb4'.

    try:
        s = repr(x)
    except UnicodeEncodeError:
        s = 'UTF8:' + x.value.encode('utf-8')

I think I agree that a working repr() is generally better, but in this case it should point out to the programmer that they should rather have __repr__() return something sensible and avoid the trick above...
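The behavior change described in this comment can be simulated on Python 3 with plain strings and bytes. This is only a sketch: repr_strict, repr_patched and describe are hypothetical helpers standing in for Python 2.7's repr() before and after the proposed patch. They are not interpreter APIs.

```python
def repr_strict(value):
    """Model unpatched 2.7: encode the unicode result as strict ASCII."""
    return value.encode('ascii')


def repr_patched(value):
    """Model the proposed patch: escape what ASCII cannot represent."""
    return value.encode('ascii', 'backslashreplace')


def describe(value, repr_func):
    """Caller that relies on UnicodeEncodeError to pick a fallback."""
    try:
        return repr_func(value)
    except UnicodeEncodeError:
        return b'UTF8:' + value.encode('utf-8')


before = describe(u'\u1234', repr_strict)   # fallback branch runs
after = describe(u'\u1234', repr_patched)   # fallback branch is skipped
print(before)
print(after)
```

The same caller produces two different values depending on which model of repr() it is given, which is the compatibility concern raised here.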
[issue5876] __repr__ returning unicode doesn't work when called implicitly
Serhiy Storchaka added the comment:

In Python 3 ascii() uses the backslashreplace error handler.

    >>> class T:
    ...     def __repr__(self):
    ...         return '\u20ac\udcff'
    ...
    >>> print(ascii(T()))
    \u20ac\udcff

I think using the backslashreplace error handler in repr() in Python 2.7 is a good solution. Here is a patch.

--
keywords: +patch
nosy: +serhiy.storchaka
stage: test needed -> patch review
Added file: http://bugs.python.org/file31439/unicode_repr.patch
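As a rough illustration of the ascii() behavior mentioned in this comment, the following Python 3 sketch compares ascii() with spelling out the escaping by hand. The by_hand name is only illustrative, and the equivalence is shown for this one input rather than claimed in general.

```python
class T:
    def __repr__(self):
        # Euro sign followed by a lone surrogate, as in the session above.
        return '\u20ac\udcff'


escaped = ascii(T())
# The same escaping written out with the backslashreplace error handler.
by_hand = repr(T()).encode('ascii', 'backslashreplace').decode('ascii')
print(escaped)
print(by_hand)
```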
[issue5876] __repr__ returning unicode doesn't work when called implicitly
STINNER Victor added the comment:

This change is going to break backward compatibility. I don't think that it can be done in Python 2.7.x, and there is no Python 2.8 (PEP 404).
[issue5876] __repr__ returning unicode doesn't work when called implicitly
Serhiy Storchaka added the comment:

How can it break backward compatibility? Currently repr() just raises UnicodeEncodeError:

    UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 0: ordinal not in range(128)

With the patch it always returns an 8-bit string. Since repr() is usually used for debugging, the second alternative looks more helpful.
[issue5876] __repr__ returning unicode doesn't work when called implicitly
STINNER Victor added the comment:

> How it can break backward compatibility? Currently repr() just raises UnicodeEncodeError.

It depends on sys.getdefaultencoding(), which can be modified in the site module (or in a PYTHONSTARTUP script) using sys.setdefaultencoding().

It should not be possible to change the default encoding, and this was fixed in Python 3.
[issue5876] __repr__ returning unicode doesn't work when called implicitly
Armin Rigo added the comment:

@Serhiy: it would certainly break a program that tries to call the repr() and catches the UnicodeEncodeError to do something else, like encode the data differently.
[issue5876] __repr__ returning unicode doesn't work when called implicitly
Marc-Andre Lemburg added the comment:

.__repr__() is not really allowed to return Unicode objects in Python 2.x. If you do this, you're on your own. The CPython internals try to convert any non-str object to a str object, but this is only done to assure that PyObject_Repr() always returns a str object.

I'd suggest closing this as won't fix.
[issue5876] __repr__ returning unicode doesn't work when called implicitly
STINNER Victor added the comment:

> I'd suggest closing this as won't fix.

Agreed, it's time to upgrade to Python 3!

--
resolution: -> fixed
status: open -> closed
[issue5876] __repr__ returning unicode doesn't work when called implicitly
Serhiy Storchaka added the comment:

> It depends on sys.getdefaultencoding() which can be modified in the site module (or in a PYTHONSTARTUP script) using sys.setdefaultencoding().

Of course. Every repr() that succeeds without the patch will give the same result with the patch. However, the patch allows you to see objects which were not repr-able before. repr() itself is used in the formatting of error messages, so it is desirable to extend its applicability as far as possible.

> @Serhiy: it would certainly break a program that tries to call the repr() and catches the UnicodeEncodeError to do something else, like encode the data differently.

Why would it break? You want to encode the data differently only because of the non-working repr(); with the proposed patch this will simply not be needed.

> .__repr__() is not really allowed to return Unicode objects in Python 2.x. If you do this, you're on your own.

PyObject_Repr() contains code which converts unicode to str and raises an exception if the __repr__() result is not str or unicode. A unicode __repr__() is expected even if it is not recommended.
[issue5876] __repr__ returning unicode doesn't work when called implicitly
Marc-Andre Lemburg added the comment:

Serhiy Storchaka wrote:
>> .__repr__() is not really allowed to return Unicode objects in Python 2.x. If you do this, you're on your own.
>
> PyObject_Repr() contains a code which converts unicode to str and raise an exception if __repr__() result is not str or unicode. Unicode __repr__() is expected even if it is not recommended.

True, but the code is not intended to support non-ASCII Unicode, otherwise we would have taken care to introduce support for this much earlier in the 2.x series.
[issue5876] __repr__ returning unicode doesn't work when called implicitly
Armin Rigo ar...@users.sourceforge.net added the comment:

A __repr__() that returns unicode can, in CPython 2.7, be used in "%s" % x or in u"%s" % x --- both expressions then return a unicode without doing any encoding --- but it cannot be used anywhere else, e.g. in "%r" % x or in repr(x). See also the PyPy issue https://bugs.pypy.org/issue857 .

--
nosy: +arigo
[issue5876] __repr__ returning unicode doesn't work when called implicitly
Éric Araujo mer...@netwok.org added the comment:

I think it's not an implicit vs. explicit call problem, rather repr vs. str. IIRC, in 2.x it is allowed that __str__ returns a unicode object, and str will convert it to a str. To do that, it will use the default encoding, which is ASCII in 2.5+, so your example cannot work.

Ideas for work-arounds:
- write a displayhook (http://docs.python.org/dev/library/sys#sys.displayhook) that converts unicode objects using sys.stdout.encoding
- for 2.6+, test if setting PYTHONIOENCODING changes something

--
nosy: +eric.araujo, lemburg
versions: -Python 2.6
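The displayhook idea could look roughly like the sketch below. It is written for Python 3 so that it can be run today, whereas the thread is about Python 2, so treat it as an illustration of the approach only. The name safe_displayhook and its optional stream parameter are assumptions introduced here; only sys.displayhook itself comes from the comment above.

```python
import builtins
import sys


def safe_displayhook(value, stream=None):
    """Print repr(value), escaping characters the stream cannot encode."""
    if value is None:
        return
    stream = sys.stdout if stream is None else stream
    # Fall back to ASCII when the stream does not report an encoding.
    encoding = getattr(stream, 'encoding', None) or 'ascii'
    text = repr(value)
    # Round-trip through the target encoding so unencodable characters
    # become backslash escapes instead of raising UnicodeEncodeError.
    safe = text.encode(encoding, 'backslashreplace').decode(encoding)
    stream.write(safe + '\n')
    builtins._ = value  # mimic the default hook's "_" convention
```

It would be installed in an interactive session with sys.displayhook = safe_displayhook. The interpreter calls the hook with a single argument, so the stream parameter keeps its default there.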
[issue5876] __repr__ returning unicode doesn't work when called implicitly
STINNER Victor victor.stin...@haypocalc.com added the comment:

I think that this issue is a duplicate of #4947 which has been fixed in Python 2.7.1. Can you retry with Python 2.7.2 (or 2.7.1)?

--
nosy: +haypo
[issue5876] __repr__ returning unicode doesn't work when called implicitly
Tomasz Melcer li...@o2.pl added the comment:

Debian SID. No, it wasn't.

    Python 2.7.2+ (default, Aug 16 2011, 09:23:59)
    [GCC 4.6.1] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> class T(object):
    ...     def __repr__(self): return u'あみご'
    ...
    >>> T().__repr__()
    u'\u3042\u307f\u3054'
    >>> print T().__repr__()
    あみご
    >>> T()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
    >>> print T()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
    >>> import sys
    >>> sys.stdin.encoding
    'UTF-8'
    >>> sys.stdout.encoding
    'UTF-8'
[issue5876] __repr__ returning unicode doesn't work when called implicitly
STINNER Victor victor.stin...@haypocalc.com added the comment:

> Debian SID. No, it wasn't.

Oh ok, gotcha: repr() always returns a str string. If obj.__repr__() returns a Unicode string, the string is encoded to the default encoding. By default, the default encoding is ASCII.

    $ ./python -S
    Python 2.7.2+ (2.7:85a12278de69, Sep 2 2011, 00:21:57)
    [GCC 4.6.0 20110603 (Red Hat 4.6.0-10)] on linux2
    >>> import sys
    >>> sys.setdefaultencoding('ISO-8859-1')
    >>> class A(object):
    ...     def __repr__(self): return u'\xe9'
    ...
    >>> repr(A())
    '\xe9'

Don't do that at home! Changing the default encoding is not a good idea.

I don't think that repr(obj) can be changed to return Unicode if obj.__repr__() returns Unicode. It is too late to change such a thing in Python 2.
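The encoding step described in this comment can be reproduced explicitly, without touching the default encoding. The sketch below runs on Python 3 and only models the implicit conversion with explicit .encode() calls. The variable names are illustrative.

```python
text = u'\xe9'  # the value returned by A.__repr__ in the session above

# With an ASCII default encoding the conversion fails ...
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# ... while ISO-8859-1 can represent U+00E9 as the single byte 0xE9.
latin1 = text.encode('iso-8859-1')
print(ascii_ok, latin1)
```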
[issue5876] __repr__ returning unicode doesn't work when called implicitly
Changes by Nam Nguyen bits...@gmail.com:

--
nosy: +Nam.Nguyen
[issue5876] __repr__ returning unicode doesn't work when called implicitly
New submission from Tomasz Melcer li...@o2.pl:

Invitation... (Debian Sid, gnome-terminal with pl_PL.UTF8 locales)

    Python 2.5.4 (r254:67916, Feb 17 2009, 20:16:45)
    [GCC 4.3.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.

Let's create some class...

    >>> class T(object):
    ...     def __repr__(self): return u'あみご'
    ...

Does its repr() work?

    >>> T().__repr__()
    u'\u3042\u307f\u3054'
    >>> print T().__repr__()
    あみご

But when it is implicitly called, it doesn't?!

    >>> T()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
    >>> print T()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

Encoding:

    >>> import sys
    >>> sys.stdin.encoding
    'UTF-8'
    >>> sys.stdout.encoding
    'UTF-8'

Workaround for now:

    >>> class T(object):
    ...     def __repr__(self): return u'あみご'.encode('utf-8')
    ...

--
components: Extension Modules
messages: 86798
nosy: liori
severity: normal
status: open
title: __repr__ returning unicode doesn't work when called implicitly
type: behavior
versions: Python 2.5
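The workaround in this report can be written so that the same class also behaves sensibly on Python 3. The following is a minimal sketch under that assumption: the _text_repr helper is hypothetical, and the choice of UTF-8 simply follows the report.

```python
import sys


class T(object):
    def _text_repr(self):
        # Hypothetical helper holding the text form of the representation.
        return u'\u3042\u307f\u3054'

    def __repr__(self):
        text = self._text_repr()
        if sys.version_info[0] == 2:
            # Python 2: repr() must yield a byte str, so encode explicitly.
            return text.encode('utf-8')
        return text  # Python 3: str is already Unicode


shown = repr(T())
```

On Python 2 the bytes only display correctly on a UTF-8 terminal, which matches the environment described in the report.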
[issue5876] __repr__ returning unicode doesn't work when called implicitly
Changes by Ezio Melotti ezio.melo...@gmail.com:

--
nosy: +ezio.melotti
[issue5876] __repr__ returning unicode doesn't work when called implicitly
R. David Murray rdmur...@bitdance.com added the comment:

This worked in 2.4 and stopped working in 2.5. It's not a problem in 3.x.

(2.5 is in security-fix-only mode, so I'm removing it from versions).

--
components: +Interpreter Core -Extension Modules
nosy: +r.david.murray
priority: -> normal
stage: -> test needed
versions: +Python 2.6, Python 2.7 -Python 2.5