On Dec 13, 2007 6:03 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
>
> > In Python 2.x, having the byte string and unicode hash equally was
> > desirable, since u'' == ''. But since the bytes and str are always
> > considered unequal, in Python 3k, I think would be good idea to make
> > their hash unequal too. So, what do you think?
>
> To phrase Adam Olsen's observation in a different way: *Why* do you
> think it would be good idea? Do you think it would make things more
> correct, or more efficient? If neither, what other desirable effect
> would that change have?
>

I first thought that it would avoid the somewhat odd behavior that appears
when mixing unicode and byte strings in dictionaries:

    >>> d = {}
    >>> d = {'spam': 0}
    >>> d[u'spam'] = 1
    >>> d
    {'spam': 1}

But then I realized this is no longer a problem in Python 3k, since
unicode strings (str) and byte strings (bytes) are always unequal.

However, that is not why I proposed to make the hashes unequal. I was
worried that people would be tempted to use this equality property as an
easy (but wrong) way to compare strings:

    >>> hash('hello') == hash(b'hello')
    True

I realize now that this is a rather weak argument, and I no longer think
it justifies changing the hashing functions.

-- Alexandre
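
As a small illustration of the two points in this message, here is a minimal sketch, assuming a Python 3 interpreter run without the -b flag. The names text, raw, d, and hashes_match are made up for this example. It shows that a str key and a bytes key stay separate in a dictionary, and that whether their hashes happen to match is an implementation detail that says nothing about equality:

```python
# Assumption: Python 3, where str and bytes never compare equal.
text = 'spam'   # unicode string (str)
raw = b'spam'   # byte string (bytes)

# Dictionary lookup checks equality, not just the hash, so these are
# two distinct keys regardless of what their hashes are.
d = {text: 0}
d[raw] = 1

# Whether this is True or False depends on the interpreter; it must
# not be used as a way to compare strings.
hashes_match = hash(text) == hash(raw)

# The correct comparison is on the values themselves.
values_equal = text == raw

# Comparing like with like requires an explicit encode or decode.
same_after_decoding = text == raw.decode('ascii')
```

Here values_equal is False and same_after_decoding is True, while hashes_match may go either way, which is why comparing hashes is not a substitute for comparing the strings.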