On Thu, Jan 19, 2012 at 7:32 PM, Ivan Kozik <i...@ludios.org> wrote:
> On Fri, Jan 20, 2012 at 00:48, Victor Stinner
> <victor.stin...@haypocalc.com> wrote:
> > I propose to solve the hash collision vulnerability by counting
> > collisions, because that fixes the vulnerability with minor or no
> > impact on applications or backward compatibility. I don't see why
> > we should use a different fix for Python 3.3. If counting
> > collisions solves the issue for the stable versions, it is also
> > enough for Python 3.3. We now know all the issues of the
> > randomized-hash solution, and I think it has more drawbacks than
> > advantages. IMO the randomized hash is overkill for fixing the
> > hash collision issue.
>
> I'd like to point out that an attacker is not limited to sending just
> one dict full of colliding keys. Given a 22 ms stall for a dict full
> of 1000 colliding keys, and 100 such objects inside a parent object
> (perhaps JSON), you can stall a server for 2.2+ seconds. Going with
> the raise-at-1000 approach doesn't solve the problem for everyone.
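(For anyone who wants to see the effect Ivan describes, here is a toy
demonstration -- not code from any patch under discussion -- in which
keys that all share one hash value degrade dict insertion to roughly
quadratic time:

    import time

    class Collider:
        """Toy key type: every instance hashes to the same value."""
        def __init__(self, n):
            self.n = n
        def __hash__(self):
            return 42                  # all instances collide
        def __eq__(self, other):
            return isinstance(other, Collider) and self.n == other.n

    def insert_time(count):
        keys = [Collider(i) for i in range(count)]
        start = time.time()
        d = {}
        for k in keys:
            d[k] = None                # each insert probes all prior keys
        return time.time() - start

    # Doubling the number of keys roughly quadruples the time:
    print("%.3f %.3f" % (insert_time(1000), insert_time(2000)))

With 1000 such keys per dict and 100 such dicts per request, the
per-request cost multiplies accordingly.)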
It's "just" a DoS attack. Those won't go away. We just need to raise the effort needed for the attacker. The original attack would cause something like 5 minutes of CPU usage per request (with a set of colliding keys that could be computed once and used to attack every Python-run website in the world). That's at least 2 orders of magnitude worse. In addition, because the raise-at-N-collisions approach raises an > exception, everyone who wants to handle this error condition properly > has to change their code to catch a previously-unexpected exception. > (I know they're usually still better off with the fix, but why force > many people to change code when you can actually fix the hashing > problem?) > Why would anybody need to change their code? Every web framework worth its salt has a top-level error catcher that logs the error, serves a 500 response, and possibly does other things like email the admin. > Another issue is that even with a configurable limit, different > modules can't have their own limits. One module might want a > relatively safe raise-at-100, and another module creating massive > dicts might want raise-at-1000. How does a developer know whether > they can raise or lower the limit, given that they use a bunch of > different modules? > I don't think it needs to be configurable. There just needs to be a way to turn it off. > I actually went with this stop-at-N-collisions approach by patching my > CPython a few years ago, where I limiting dictobject and setobject's > critical `for` loop to 100 iterations (I realize this might handle > fewer than 100 collisions.) This worked fine until I tried to compile > PyPy, where the translator blew up due to a massive dict. I think that's because your collision-counting algorithm was much more primitive than MAL's. > This, > combined with the second problem (needing to catch an exception), led > me to abandon this approach and write Securetypes, which has a > securedict that uses SHA-1. Not that I like this either; I think I'm > happy with the randomize-hash() approach. > Why did you need to catch the exception? Were you not happy with the program simply terminating with a traceback when it got attacked? -- --Guido van Rossum (python.org/~guido)
--
--Guido van Rossum (python.org/~guido)