Re: [Python-Dev] Status of the fix for the hash collision vulnerability

Steven D'Aprano Fri, 13 Jan 2012 18:57:22 -0800

On 14/01/12 12:58, Gregory P. Smith wrote:

> I do like *randomly seeding the hash*. *+1*. This is easy. It can easily be
> back ported to any Python version.
>
> It is perfectly okay to break existing users who had anything depending on
> ordering of internal hash tables. Their code was already broken.


For the record:

steve@runes:~$ python -c "print(hash('spam ham'))"
-376510515
steve@runes:~$ jython -c "print(hash('spam ham'))"
2054637885

So it is already the case that Python code that assumes stable hashing is 
broken.
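
To make the point concrete, here is a minimal sketch of code that leans on
hash ordering next to code that does not. The function names and the sample
data are made up for illustration; nothing here comes from a real code base.

```python
# Hypothetical helpers, named only for this illustration.

def fragile_first(items):
    # Relies on set iteration order, which follows the hash values and is
    # not promised by the language. The result may differ between
    # implementations, versions, or runs.
    return next(iter(set(items)))


def stable_first(items):
    # Sorting imposes an order that does not depend on hash values at all.
    return sorted(set(items))[0]


words = ["spam", "ham", "eggs"]
print(stable_first(words))   # always the lexicographically smallest word
print(fragile_first(words))  # some element of words; which one is unspecified
```

Only the second helper gives the same answer everywhere, which is all that
"don't depend on hash ordering" amounts to in practice.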

For what it's worth, I'm not convinced that we should be overly-concerned by
"poor saps" (Guido's words) who rely on accidents of implementation regarding
hash. We shouldn't break their code unless we have a good reason, but this
strikes me as a good reason. The documentation for hash certainly makes no
promise about stability, and relying on it strikes me as about as sensible as
relying on the stability of error messages.

I'm also not convinced that the option to raise an exception after 1000
collisions actually solves the problem. That relies on the application being
re-written to catch the exception and recover from it (how?). Otherwise, all
it does is change the attack vector from "cause an indefinite number of hash
collisions" to "cause 999 hash collisions followed by crashing the application
with an exception", which doesn't strike me as much of an improvement.
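
A toy model of the collision-limit idea, to show where the burden lands. This
is not the proposed patch: the table, the exception name TooManyCollisions,
and handle_request are all assumptions invented for this sketch, and the
limit is made small so the effect is easy to see.

```python
class TooManyCollisions(Exception):
    """Hypothetical exception raised when one bucket grows past the limit."""


class LimitedTable:
    """Toy chained hash table that refuses overly long collision chains."""

    def __init__(self, nbuckets=8, limit=1000):
        self.buckets = [[] for _ in range(nbuckets)]
        self.limit = limit

    def insert(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (existing, _) in enumerate(bucket):
            if existing == key:
                bucket[i] = (key, value)  # overwrite, no new collision
                return
        if len(bucket) >= self.limit:
            raise TooManyCollisions(key)
        bucket.append((key, value))


def handle_request(keys, limit):
    # The application has to be written to catch the exception; all it can
    # realistically do here is give up on the whole request.
    table = LimitedTable(limit=limit)
    try:
        for key in keys:
            table.insert(key, None)
    except TooManyCollisions:
        return "rejected"
    return "ok"


# Multiples of 8 all land in bucket 0 of an 8-bucket table.
print(handle_request(range(0, 40, 8), limit=5))  # 5 colliding keys
print(handle_request(range(0, 48, 8), limit=5))  # 6 colliding keys
```

Without the try/except in handle_request, the sixth key simply propagates an
exception out of whatever code happened to be building the table.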

+1 on random seeding. Default to on in 3.3+ and default to off in older
versions, which allows people to avoid breaking their code until they're ready
for it to be broken.
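
For anyone who wants to see the shape of the idea, here is a rough sketch of
a seeded string hash with an on/off switch. It is a toy, not CPython's
algorithm; seeded_hash, new_seed, and the randomize parameter are names
assumed for this example only.

```python
import os


def seeded_hash(text, seed):
    # Toy 32-bit multiplicative hash that starts from a seed instead of a
    # fixed constant. Illustrative only; not the real string hash.
    h = seed & 0xFFFFFFFF
    for ch in text:
        h = ((h * 1000003) ^ ord(ch)) & 0xFFFFFFFF
    return h


def new_seed(randomize=True):
    # With randomization off, a fixed seed keeps the old, repeatable values.
    # With it on, each process picks an unpredictable 32-bit seed.
    if not randomize:
        return 0
    return int.from_bytes(os.urandom(4), "little")


fixed = new_seed(randomize=False)
print(seeded_hash("spam ham", fixed))       # same value on every run
print(seeded_hash("spam ham", new_seed()))  # varies from process to process
```

An attacker who cannot guess the seed cannot precompute a set of colliding
keys, and the randomize switch is the "default to off in older versions"
part: existing code sees no change until someone turns it on.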




--
Steven
