Re: [Python-Dev] Hash collision security issue (now public)

Christian Heimes Fri, 30 Dec 2011 19:26:12 -0800

Am 31.12.2011 03:31, schrieb Victor Stinner:
> Le 29/12/2011 14:19, Christian Heimes a écrit :
>> Perhaps the dict code is a better place for randomization.
> 
> The problem is the creation of a dict with keys all having the same hash 
> value. The current implementation of dict uses a linked-list. Adding a 
> new item requires to compare the new key to all existing keys (compare 
> the value, not the hash, which is much slower).
> 
> We had to change completly how dict is implemented to be able to fix 
> this issue. I don't think that we can change the dict implementation 
> without breaking backward compatibility or breaking applications. Change 
> the implementation would change dict properties, and applications rely 
> on the properties of the current implementation.
> 
> Tell me if I am wrong.


You are right and I was wrong. We can't do the randomization inside the
dict code. The randomization factor must used as initialization factor
(IV) for the hashing algorithm. At first I thought the attack used the
unique property of Python's dict implementation (perturbed hash instead
of buckets for equal hashes) but I was wrong. It took me several hours
of reading and digging into the math until I figured out my mistake.
Sorry! :)

This means we can't implement a salted dict unless the salted dict
re-implemention the hash algorithm for unicode, bytes and memoryview. I
doubt that a wise idea.

I've checked my first draft of a possible solution:
http://hg.python.org/features/randomhash/ . The pseudo RNG has to be
replaced with something better and it's missing an option to feed a
seed, too.

Christian
_______________________________________________
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Hash collision security issue (now public)

Reply via email to