Paul McMillan <p...@mcmillan.ws> added the comment:

This is not something that can be fixed by limiting the size of POST/GET. 

Parsing documents (even offline) can generate these problems. I can create 
books that calibre (a Python-based ebook format shifting tool) can't convert, 
but are otherwise perfectly valid for non-Python devices. If I'm allowed to 
insert usernames into a database and you ever retrieve those in a dict, you're 
vulnerable. If I can post things one at a time that eventually get parsed into 
a dict (like the tag example), you're vulnerable. I can generate web traffic 
that creates log files that are unparsable (even offline) in Python if dicts 
are used anywhere. Any application that accepts data from users needs to be 
considered.
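
To make the cost concrete, here is a minimal sketch of what colliding keys do 
to a dict. CollidingKey and comparisons_for are hypothetical names invented 
for this illustration, and the constant hash stands in for attacker-chosen 
strings that happen to share one hash value. It counts equality checks 
instead of measuring time so the result does not depend on the machine.

```python
class CollidingKey:
    """Hypothetical key type: every instance reports the same hash value."""

    eq_calls = 0  # running count of equality checks made by the dict

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        # Stand-in for many distinct strings that share one hash value.
        return 0

    def __eq__(self, other):
        CollidingKey.eq_calls += 1
        return self.value == other.value


def comparisons_for(n):
    """Insert n distinct colliding keys and return the number of __eq__ calls."""
    CollidingKey.eq_calls = 0
    table = {}
    for i in range(n):
        # Each insert has to be compared against the colliding keys
        # that are already stored.
        table[CollidingKey(i)] = None
    return CollidingKey.eq_calls


if __name__ == "__main__":
    for n in (200, 400, 800):
        print(n, comparisons_for(n))
```

Doubling the number of keys should roughly quadruple the comparison count, 
which is why a modest amount of hostile input can tie up a parser no matter 
where that input came from.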

Even if the web framework has a dictionary implementation that randomizes the 
hashes so it's not vulnerable, the entire Python standard library uses dicts 
all over the place. If this is a problem which must be fixed by the framework, 
they must reinvent every standard library function they hope to use.

Any non-trivial Python application which parses data needs the fix. The entire 
standard library needs the fix if it is to be relied upon by applications which 
accept data. It makes sense to fix Python.

Of course we must fix all the basic hashing functions in Python, not just the 
string hash. There aren't that many. 

Marc-Andre:
If you look at my proposed code, you'll notice that we do more than simply 
shift the period of the hash. It's not trivial for an attacker to create 
colliding inputs without knowing the key.

Since speed is a concern, I think that the proposal to avoid using the random 
hash for short strings is a good idea. Additionally, randomizing only some of 
the characters in longer strings will allow us to improve security without 
compromising speed significantly.

I suggest that we don't randomize strings shorter than 6 characters. For longer 
strings, we randomize the first and last 5 characters. This means we're only 
adding additional work to a max of 10 rounds of the hash, and only for longer 
strings. Collisions with the hash from short strings should be minimal.
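
A rough sketch of which rounds would pay the extra cost under this suggestion. 
This is not the proposed patch. key_index, sketch_hash and SECRET are 
hypothetical names, the multiplier 1000003 is borrowed from the classic 
CPython string hash, and XORing a secret byte into a round is only an 
assumption made to mark where the key would enter. It is not a vetted 
construction.

```python
import os

MASK = (1 << 64) - 1        # keep the running value in 64 bits
SECRET = os.urandom(10)     # hypothetical per-process key, one byte per keyed round


def key_index(i, n):
    """Return the index of the secret byte used in round i of an n-byte
    string, or None if that round is left unkeyed."""
    if n < 6:
        return None                 # short strings are not randomized
    if i < 5:
        return i                    # first five characters
    if i >= n - 5:
        return 5 + i - (n - 5)      # last five characters
    return None                     # middle of a long string


def sketch_hash(text, secret=SECRET):
    """Toy multiplicative hash with keyed rounds chosen by key_index."""
    data = text.encode("utf-8")
    n = len(data)
    if n == 0:
        return 0
    x = data[0] << 7
    for i, byte in enumerate(data):
        k = key_index(i, n)
        if k is not None:
            byte ^= secret[k]       # assumed mixing step, for illustration only
        x = ((1000003 * x) ^ byte) & MASK
    return x ^ n


if __name__ == "__main__":
    for word in ("abc", "abcdefgh", "x" * 40):
        keyed = sum(key_index(i, len(word)) is not None for i in range(len(word)))
        print(len(word), keyed, hex(sketch_hash(word)))
```

However long the string is, at most 10 rounds touch the key, and strings 
shorter than 6 characters hash the same regardless of the key.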

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue13703>
_______________________________________