Bugs item #1564763, was opened at 2006-09-24 23:43
Message generated for change (Comment added) made by arigo
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1564763&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Unicode
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Submitted By: Joe Wreschnig (piman)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Unicode comparison change in 2.4 vs. 2.5

Initial Comment:
Python 2.5 changed the behavior of unicode comparisons in a significant way from Python 2.4, causing a test case failure in a module of mine. All tests passed with an earlier version of 2.5, though unfortunately I don't know what version in particular it started failing with.

The following code prints out all True on Python 2.4; the strings are compared case-insensitively, whether they are my lowerstr class, real strs, or unicodes. On Python 2.5, the comparison between lowerstr and unicode is false, but only in one direction.

If I make lowerstr inherit from unicode rather than str, all comparisons are true again. So at the very least, this is internally inconsistent. I also think changing the behavior between 2.4 and 2.5 constitutes a serious bug.

----------------------------------------------------------------------

>Comment By: Armin Rigo (arigo)
Date: 2006-09-25 21:11

Message:
Logged In: YES
user_id=4771

This is an artifact of the change in the unicode class, which now has the proper __eq__, __ne__, __lt__, etc. methods instead of the semi-deprecated __cmp__. The mixture of __cmp__ and the other methods is not very well-defined. This is why your code worked in 2.4: a bit by chance. Indeed, in theory it should not, according to the language reference.

So what I am saying is that although it is a behavior change from 2.4 to 2.5, I would argue that it is not a bug but a bug fix...

The reason is that if we ignore the __eq__ vs __cmp__ issues, the operation 'a == b' is defined as: Python tries a.__eq__(b); if this returns NotImplemented, then Python tries b.__eq__(a).
As an exception, if type(b) is a strict subclass of type(a), then Python tries in the other order. This is why you get the 2.5 behavior: if lowerstr inherits from str, it is not a subclass of unicode, so u'abc' == lowerstr() tries u'abc'.__eq__(), which works immediately. On the other hand, if lowerstr inherits from unicode, then Python tries first lowerstr().__eq__(u'abc').

This part of the Python object model - when to reverse the order or not - is a bit obscure and not completely helpful... Subclassing built-in types generally only works a bit. In your situation you should use a regular class that behaves in a string-like fashion, with an __eq__() method doing the case-insensitive comparison... if you can at all - there are places where you need a real string, so this "solution" might not be one either, but I don't see a better one :-(

----------------------------------------------------------------------
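
The lookup order described in the comment above can be sketched in a few lines. This is only an illustration under stated assumptions: it is written for Python 3, which has no separate unicode type, so the hypothetical class Exact stands in for the built-in string type and the hypothetical classes LowerUnrelated and LowerSub stand in for the two variants of lowerstr (one not derived from the left-hand type, one derived from it). None of these names come from the original report.

```python
class Exact:
    """Hypothetical stand-in for a built-in string type: exact comparison."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not hasattr(other, "value"):
            return NotImplemented
        return self.value == other.value


class LowerUnrelated:
    """Case-insensitive wrapper that is NOT a subclass of Exact."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not hasattr(other, "value"):
            return NotImplemented
        return self.value.lower() == other.value.lower()


class LowerSub(Exact):
    """Case-insensitive wrapper that IS a strict subclass of Exact."""

    def __eq__(self, other):
        if not hasattr(other, "value"):
            return NotImplemented
        return self.value.lower() == other.value.lower()


results = {
    # Left operand's __eq__ runs first: case-insensitive, so True.
    "unrelated_left": LowerUnrelated("abc") == Exact("ABC"),
    # Exact.__eq__ runs first and answers immediately: exact compare, False.
    "unrelated_right": Exact("ABC") == LowerUnrelated("abc"),
    # Left operand's __eq__ runs first: True.
    "subclass_left": LowerSub("abc") == Exact("ABC"),
    # Right operand is a strict subclass, so LowerSub.__eq__ runs first: True.
    "subclass_right": Exact("ABC") == LowerSub("abc"),
}

for name, value in results.items():
    print(name, value)
```

Only the "unrelated_right" comparison is false, which mirrors the one-directional failure in the report: the left-hand type answers on its own before the case-insensitive method is ever consulted.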