Am 18.01.2012 17:01, schrieb PJ Eby: > On Tue, Jan 17, 2012 at 7:58 PM, "Martin v. Löwis" <mar...@v.loewis.de > <mailto:mar...@v.loewis.de>> wrote: > > Am 17.01.2012 22:26, schrieb Antoine Pitrou: > > Only 2 bits are used in ob_sstate, meaning 30 are left. These 30 bits > > could cache a "hash perturbation" computed from the string and the > > random bits: > > > > - hash() would use ob_shash > > - dict_lookup() would use ((ob_shash * 1000003) ^ (ob_sstate & ~3)) > > > > This way, you cache almost all computations, adding only a computation > > and a couple logical ops when looking up a string in a dict. > > That's a good idea. For Unicode, it might be best to add another slot > into the object, even though this increases the object size. > > > Wouldn't that break the ABI in 2.x?
I was thinking about adding the field at the end, so I thought it shouldn't. However, if somebody inherits from PyUnicodeObject, it still might - so my new proposal is to add the extra hash into the str block, either at str[-1], or after the terminating 0. This would cause an average increase of four bytes of the storage (0 bytes in 50% of the cases, 8 bytes because of padding in the other 50%). What do you think? Regards, Martin _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com