Re: [Python-3000] Performance Notes - new hash algorithm

Thomas Wouters Sun, 09 Sep 2007 02:13:08 -0700

On 9/9/07, Larry Hastings <[EMAIL PROTECTED]> wrote:

> One goal of Jenkin's hashes is uniform distribution, so these functions
> presumably lack the serendipitous "similar inputs hash to similar values"
> behavior of Python's current hash function.  But why is that a feature?
> (Not that I doubt Tim Peters!)
>


Because (relatively) small dicts with (broadly speaking) similar keys are
quite common in Python. Module and class and instance __dict__s, for
instance ;) As Tim mentioned, the dict implementation only looks at part of
the actual hash value (depending on the size of the dict) and having hash
values close but not the same greatly decreases the chance of collisions in
(relatively) small dicts. It's less of a problem for massive dicts with
(almost) completely arbitrary keys, but it doesn't exactly hurt there,
either.

-- 
Thomas Wouters <[EMAIL PROTECTED]>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

_______________________________________________
Python-3000 mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-3000
Unsubscribe: 
http://mail.python.org/mailman/options/python-3000/archive%40mail-archive.com

Re: [Python-3000] Performance Notes - new hash algorithm

Reply via email to