On Sun, Jan 1, 2012 at 10:00 PM, PJ Eby <p...@telecommunity.com> wrote: > On Sun, Jan 1, 2012 at 7:37 PM, Jim Jewett <jimjjew...@gmail.com> wrote:
>> Well, there is nothing wrong with switching to a different hash function >> after N >> collisions, rather than "in the first place". The perturbation >> effectively does by >> shoving the high-order bits through the part of the hash that survives the >> mask. > Since these are true hash collisions, they will all have the same high order > bits. So, the usefulness of the perturbation is limited mainly to the > common case where true collisions are rare. That is only because the perturb is based solely on the hash. Switching to an entirely new hash after the 5th collision (for a given lookup) would resolve that (after the 5th collision); the question is whether or not the cost is worthwhile. >> > (Well, technically, you could use trees or some other O log n data >> > structure as a fallback once you have too many collisions, for some >> > value >> > of "too many". Seems a bit wasteful for the purpose, though.) >> >> Your WSGI specification < http://www.python.org/dev/peps/pep-0333/ > >> requires >> using a real dictionary for compatibility; storing some of the values >> outside the >> values array would violate that. > When I said "use some other data structure", I was referring to the internal > implementation of the dict type, not to user code. The only user-visible > difference (even at C API level) would be the order of keys() et al. Given the wording requiring a real dictionary, I would have assumed that it was OK (if perhaps not sensible) to do pointer arithmetic and access the keys/values/hashes directly. (Though if the breakage was between python versions, I would feel guilty about griping too loudly.) -jJ _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com