[Python-Dev] Counting collisions for the win

2012-02-16 Thread Jim J. Jewett
In http://mail.python.org/pipermail/python-dev/2012-January/115715.html Frank Sievertsen wrote: Am 20.01.2012 13:08, schrieb Victor Stinner: I'm surprised we haven't seen bug reports about it from users of 64-bit Pythons long ago A Python dictionary only uses the lower bits of a hash value.

Re: [Python-Dev] Counting collisions for the win

2012-01-24 Thread Frank Sievertsen
Interesting idea, and I see it would avoid conversions. What happens if the data area also removed from the hash? So you enter 20 colliding keys, then 20 more that get randomized, then delete the 18 of the first 20. Can you still find the second 20 keys? Takes two sets of probes, somehow?

Re: [Python-Dev] Counting collisions for the win

2012-01-23 Thread Frank Sievertsen
Hello, I'd still prefer to see a randomized hash()-function (at least for 3.3). But to protect against the attacks it would be sufficient to use randomization for collision resolution in dicts (and sets). What if we use a second (randomized) hash-function in case there are many collisions in

Re: [Python-Dev] Counting collisions for the win

2012-01-23 Thread Glenn Linderman
On 1/23/2012 12:53 AM, Frank Sievertsen wrote: What if we use a second (randomized) hash-function in case there are many collisions in ONE lookup. This hash-function is used only for collision resolution and is not cached. So this sounds like SafeDict, but putting it under the covers and

Re: [Python-Dev] Counting collisions for the win

2012-01-23 Thread Frank Sievertsen
On 23.01.2012 19:25, Glenn Linderman wrote: So this sounds like SafeDict, but putting it under the covers and automatically converting from dict to SafeDict after a collision threshold has been reached. Let's call it fallback-dict. and costs: * converting the dict from one hash to the

Re: [Python-Dev] Counting collisions for the win

2012-01-23 Thread Glenn Linderman
On 1/23/2012 10:58 AM, Frank Sievertsen wrote: On 23.01.2012 19:25, Glenn Linderman wrote: So this sounds like SafeDict, but putting it under the covers and automatically converting from dict to SafeDict after a collision threshold has been reached. Let's call it fallback-dict. and costs:

Re: [Python-Dev] Counting collisions for the win

2012-01-23 Thread M.-A. Lemburg
Frank Sievertsen wrote: Hello, I'd still prefer to see a randomized hash()-function (at least for 3.3). But to protect against the attacks it would be sufficient to use randomization for collision resolution in dicts (and sets). What if we use a second (randomized) hash-function in case

Re: [Python-Dev] Counting collisions for the win

2012-01-23 Thread Janzert
On 1/23/2012 1:25 PM, Glenn Linderman wrote: On 1/23/2012 12:53 AM, Frank Sievertsen wrote: What if we use a second (randomized) hash-function in case there are many collisions in ONE lookup. This hash-function is used only for collision resolution and is not cached. So this sounds like

Re: [Python-Dev] Counting collisions for the win

2012-01-22 Thread Victor Stinner
This seed is chosen randomly at runtime, but cannot change once chosen. The hash is used to compare objects: if hash(obj1) != hash(obj2), objects are considered different. So two strings must have the same hash if their value is the same. Salt could also be an appropriate term here, but since

Re: [Python-Dev] Counting collisions for the win

2012-01-22 Thread Lennart Regebro
On Sun, Jan 22, 2012 at 11:11, Victor Stinner victor.stin...@haypocalc.com wrote: This seed is chosen randomly at runtime, but cannot change once chosen. The hash is used to compare objects: if hash(obj1) != hash(obj2), objects are considered different. So two strings must have the same hash

Re: [Python-Dev] Counting collisions for the win

2012-01-22 Thread Antoine Pitrou
I think this thread is approaching the recursion limit. Be careful not to blow the stack :) Regards Antoine. On Sun, 22 Jan 2012 20:53:41 +0100 Lennart Regebro rege...@gmail.com wrote: On Sun, Jan 22, 2012 at 11:11, Victor Stinner victor.stin...@haypocalc.com wrote: This seed is chosen

Re: [Python-Dev] Counting collisions for the win

2012-01-22 Thread Paul McMillan
We may use a different salt per dictionary. If we're willing to re-hash everything on a per-dictionary basis. That doesn't seem reasonable given our existing usage. ___ Python-Dev mailing list Python-Dev@python.org

Re: [Python-Dev] Counting collisions for the win

2012-01-22 Thread Lennart Regebro
On Mon, Jan 23, 2012 at 06:02, Paul McMillan p...@mcmillan.ws wrote: We may use a different salt per dictionary. If we're willing to re-hash everything on a per-dictionary basis. That doesn't seem reasonable given our existing usage. Well, if we get crazy amounts of collisions, re-hashing

Re: [Python-Dev] Counting collisions for the win

2012-01-22 Thread Stephen J. Turnbull
Lennart Regebro writes: On Mon, Jan 23, 2012 at 06:02, Paul McMillan p...@mcmillan.ws wrote: We may use a different salt per dictionary. If we're willing to re-hash everything on a per-dictionary basis. That doesn't seem reasonable given our existing usage. Well, if we get crazy

Re: [Python-Dev] Counting collisions for the win

2012-01-22 Thread Tim Delaney
On 23 January 2012 16:49, Lennart Regebro rege...@gmail.com wrote: On Mon, Jan 23, 2012 at 06:02, Paul McMillan p...@mcmillan.ws wrote: We may use a different salt per dictionary. If we're willing to re-hash everything on a per-dictionary basis. That doesn't seem reasonable given our

Re: [Python-Dev] Counting collisions for the win

2012-01-21 Thread Matthew Woodcraft
Victor Stinner victor.stin...@haypocalc.com wrote: I propose to solve the hash collision vulnerability by counting collisions [...] We now know all issues of the randomized hash solution, and I think that there are more drawbacks than advantages. IMO the randomized hash is overkill to fix

Re: [Python-Dev] Counting collisions for the win

2012-01-21 Thread Guido van Rossum
On Sat, Jan 21, 2012 at 7:50 AM, Matthew Woodcraft matt...@woodcraft.me.uk wrote: Victor Stinner  victor.stin...@haypocalc.com wrote: I propose to solve the hash collision vulnerability by counting collisions [...] We now know all issues of the randomized hash solution, and I think that

Re: [Python-Dev] Counting collisions for the win

2012-01-21 Thread David Malcolm
On Fri, 2012-01-20 at 16:55 +0100, Frank Sievertsen wrote: Hello, I still see at least two ways to create a DOS attack even with the collison-counting-patch. [snip description of two types of attack on the collision counting approach] What to do now? I think it's not smart to reduce the

Re: [Python-Dev] Counting collisions for the win

2012-01-21 Thread Jared Grubb
On 20 Jan 2012, at 10:49, Brett Cannon wrote: Why can't we have our cake and eat it too? Can we do hash randomization in 3.3 and use the hash count solution for bugfix releases? That way we get a basic fix into the bugfix releases that won't break people's code (hopefully) but we go with a

Re: [Python-Dev] Counting collisions for the win

2012-01-21 Thread Paul McMillan
On Sat, Jan 21, 2012 at 4:19 PM, Jared Grubb jared.gr...@gmail.com wrote: I agree; it sounds really odd to throw an exception since nothing is actually wrong and there's nothing the caller would do about it to recover anyway. Rather than throwing an exception, maybe you just reseed the random

Re: [Python-Dev] Counting collisions for the win

2012-01-21 Thread Steven D'Aprano
Paul McMillan wrote: On Sat, Jan 21, 2012 at 4:19 PM, Jared Grubb jared.gr...@gmail.com wrote: I agree; it sounds really odd to throw an exception since nothing is actually wrong and there's nothing the caller would do about it to recover anyway. Rather than throwing an exception, maybe you

Re: [Python-Dev] Counting collisions for the win

2012-01-21 Thread Paul McMillan
I may have a terminology problem here. I expect that a random seed must change every time it is used, otherwise the pseudorandom number generator using it just returns the same value each time. Should we be talking about a salt rather than a seed? You should read the several other threads,

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Martin v. Löwis
The last solution is very simple: count collision and raise an exception if it hits a limit. The path is something like 10 lines whereas the randomized hash is more close to 500 lines, add a new file, change Visual Studio project file, etc. First I thaught that it would break more

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Frank Sievertsen
The main issue with that approach is that it allows a new kind of attack. Indeed, I posted another example: http://bugs.python.org/msg151677 This kind of fix can be used in a specific application or maybe in a special-purpose framework, but not on the level of a general-purpose language.

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Nick Coghlan
On Fri, Jan 20, 2012 at 7:34 PM, Martin v. Löwis mar...@v.loewis.de wrote: The main issue with that approach is that it allows a new kind of attack. An attacker now needs to find 1000 colliding keys, and submit them one-by-one into a database. The limit will not trigger, as those are just

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Victor Stinner
2012/1/20 Ivan Kozik i...@ludios.org: I'd like to point out that an attacker is not limited to sending just one dict full of colliding keys.  Given a 22ms stall ... The presented attack produces a stall of at least 30 seconds (5 minutes or more if there is no time limit in the application), not

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Victor Stinner
I'm surprised we haven't seen bug reports about it from users of 64-bit Pythons long ago A Python dictionary only uses the lower bits of a hash value. If your dictionary has less than 2**32 items, the dictionary order is exactly the same on 32 and 64 bits system: hash32(str) mask ==

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Frank Sievertsen
No, that's not true. Whenever a collision happens, other bits are mixed in very fast. Frank Am 20.01.2012 13:08, schrieb Victor Stinner: I'm surprised we haven't seen bug reports about it from users of 64-bit Pythons long ago A Python dictionary only uses the lower bits of a hash value. If

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Victor Stinner
2012/1/20 Frank Sievertsen fr...@sievertsen.de: No, that's not true. Whenever a collision happens, other bits are mixed in very fast. Oh, I didn't know that. So the dict order is only the same if there is no collision. Victor ___ Python-Dev mailing

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Victor Stinner
The main issue with that approach is that it allows a new kind of attack. An attacker now needs to find 1000 colliding keys, and submit them one-by-one into a database. The limit will not trigger, as those are just database insertions. Now, if the applications also as a need to read the

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Barry Warsaw
On Jan 20, 2012, at 01:50 PM, Victor Stinner wrote: Counting collision doesn't solve this case, but it doesn't make the situation worse than before. Raising quickly an exception is better than stalling for minutes, even if I agree than it is not the best behaviour. ISTM that adding the

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Barry Warsaw
On Jan 20, 2012, at 03:18 PM, Nick Coghlan wrote: On Fri, Jan 20, 2012 at 2:54 PM, Carl Meyer c...@oddbird.net wrote: I don't have the expertise to speak otherwise to the alternatives for fixing the collisions vulnerability, but I don't believe it's accurate to presume that Django would not

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Barry Warsaw
On Jan 20, 2012, at 03:15 PM, Nick Coghlan wrote: With the 1000 collision limit in place, the attacker sends their massive request, the affected dict quickly hits the limit, throws an unhandled exception which is then caught by the web framework and turned into a 500 Error response (or whatever's

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Antoine Pitrou
On Fri, 20 Jan 2012 13:50:18 +0100 Victor Stinner victor.stin...@haypocalc.com wrote: The main issue with that approach is that it allows a new kind of attack. An attacker now needs to find 1000 colliding keys, and submit them one-by-one into a database. The limit will not trigger, as

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Guido van Rossum
On Fri, Jan 20, 2012 at 1:34 AM, Martin v. Löwis mar...@v.loewis.dewrote: The last solution is very simple: count collision and raise an exception if it hits a limit. The path is something like 10 lines whereas the randomized hash is more close to 500 lines, add a new file, change Visual

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Frank Sievertsen
Hello, I still see at least two ways to create a DOS attack even with the collison-counting-patch. I assumed that it's possible to send ~500KB of payload to the application. 1. It's fully deterministic which slots the dict will lookup. Since we don't count slot-collisions, but only

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Victor Stinner
(I'm thinking that the original attack is trivial once the set of 65000 colliding keys is public knowledge, which must be only a matter of time.) I have a program able to generate collisions: it takes 1 second to compute 60,000 colliding strings on a desktop computer. So the security of the

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Victor Stinner
So I still think we should ditch the paranoia about dictionary order changing, and fix this without counting. The randomized hash has other issues: - its security is based on its secret, whereas it looks to be easy to compute it (see more details in the issue) - my patch only changes

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Antoine Pitrou
On Fri, 20 Jan 2012 17:17:24 +0100 Victor Stinner victor.stin...@haypocalc.com wrote: So I still think we should ditch the paranoia about dictionary order changing, and fix this without counting. The randomized hash has other issues: - its security is based on its secret, whereas it

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Lennart Regebro
On Fri, Jan 20, 2012 at 01:48, Victor Stinner victor.stin...@haypocalc.com wrote:  - the limit should be configurable: a new function in the sys module should be enough. It may be private (or replaced by an environment variable?) in stable versions I'd like to see both. I would like both the

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Guido van Rossum
On Thu, Jan 19, 2012 at 8:06 PM, Ivan Kozik i...@ludios.org wrote: No, I wasn't happy with termination. I wanted to treat it just like a JSON decoding error, and send the appropriate response. So was this attack actually being mounted on your service regularly? I'd think it would be

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Guido van Rossum
On Fri, Jan 20, 2012 at 1:57 AM, Frank Sievertsen py...@sievertsen.dewrote: The main issue with that approach is that it allows a new kind of attack. Indeed, I posted another example: http://bugs.python.org/msg151677 This kind of fix can be used in a specific application or maybe in a

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Guido van Rossum
On Fri, Jan 20, 2012 at 5:10 AM, Barry Warsaw ba...@python.org wrote: On Jan 20, 2012, at 01:50 PM, Victor Stinner wrote: Counting collision doesn't solve this case, but it doesn't make the situation worse than before. Raising quickly an exception is better than stalling for minutes, even if

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Guido van Rossum
On Fri, Jan 20, 2012 at 5:20 AM, Barry Warsaw ba...@python.org wrote: Let's just be clear about it: this exception is new public API. Changing dictionary order is not. Not if you raise MemoryError or BaseException. -- --Guido van Rossum (python.org/~guido)

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Guido van Rossum
This is the first objection I have seen against collision-counting that might stand. On Fri, Jan 20, 2012 at 7:55 AM, Frank Sievertsen py...@sievertsen.dewrote: Hello, I still see at least two ways to create a DOS attack even with the collison-counting-patch. I assumed that it's possible

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Georg Brandl
Am 20.01.2012 19:15, schrieb Guido van Rossum: OTOH, if you change dictionary order and *that* breaks the application, then the bugs submitted to the application's tracker will be legitimate bugs that have to be fixed even if nothing else changed. There are lots of things

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Brett Cannon
On Fri, Jan 20, 2012 at 13:15, Guido van Rossum gu...@python.org wrote: On Fri, Jan 20, 2012 at 5:10 AM, Barry Warsaw ba...@python.org wrote: On Jan 20, 2012, at 01:50 PM, Victor Stinner wrote: Counting collision doesn't solve this case, but it doesn't make the situation worse than before.

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Terry Reedy
On 1/20/2012 11:17 AM, Victor Stinner wrote: There is no perfect solutions, drawbacks of each solution should be compared. Amen. One possible attack that has been described for a collision counting dict depends on knowing precisely the trigger point. So let MAXCOLLISIONS either be

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Donald Stufft
Even if a MemoryException is raised I believe that is still a fundamental change in the documented contract of dictionary API. I don't believe there is a way to fix this without breaking someones application. The major differences I see between the two solutions is that counting will break

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Case Van Horsen
On Fri, Jan 20, 2012 at 8:17 AM, Victor Stinner victor.stin...@haypocalc.com wrote: So I still think we should ditch the paranoia about dictionary order changing, and fix this without counting. The randomized hash has other issues:  - its security is based on its secret, whereas it looks

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Terry Reedy
On 1/20/2012 10:55 AM, Frank Sievertsen wrote: Hello, I still see at least two ways to create a DOS attack even with the collison-counting-patch. 2. The second attack actually attacks that 1000 allowed string comparisons are still a lot of work. First I added 999 strings that collide with a

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Tres Seaver
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 01/20/2012 02:04 PM, Donald Stufft wrote: Even if a MemoryException is raised I believe that is still a fundamental change in the documented contract of dictionary API. How so? Dictionary inserts can *already* raise that error. I don't

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Donald Stufft
On Friday, January 20, 2012 at 2:36 PM, Tres Seaver wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 01/20/2012 02:04 PM, Donald Stufft wrote: Even if a MemoryException is raised I believe that is still a fundamental change in the documented contract of dictionary API.

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Ethan Furman
Donald Stufft wrote: Even if a MemoryException is raised I believe that is still a fundamental change in the documented contract of dictionary API. I don't believe there is a way to fix this without breaking someones application. The major differences I see between the two solutions is that

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Guido van Rossum
On Fri, Jan 20, 2012 at 11:51 AM, Donald Stufft donald.stu...@gmail.comwrote: On Friday, January 20, 2012 at 2:36 PM, Tres Seaver wrote: On 01/20/2012 02:04 PM, Donald Stufft wrote: Even if a MemoryException is raised I believe that is still a fundamental change in the documented contract

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Terry Reedy
On 1/20/2012 2:51 PM, Donald Stufft wrote: I think the counting collision is at best a bandaid and not a proper fix stemmed from a desire to not break existing applications on a bugfix release ... My opinion of counting is better than yours, but even conceding the theoretical, purity

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Ben Wolfson
On Fri, Jan 20, 2012 at 2:11 PM, Terry Reedy tjre...@udel.edu wrote: On 1/20/2012 2:51 PM, Donald Stufft wrote: I think the counting collision is at best a bandaid and not a proper fix stemmed from a desire to not break existing applications on a bugfix release ... My opinion of counting is

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Donald Stufft
I believe that either solution has the potential to break existing applications so to ensure that no applications are broken there will need to be a method of disabling it. I also believe that to maintain the backwards compatibility that Python has traditionally had in bug fix releases that

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Frank Sievertsen
Am 20.01.2012 16:33, schrieb Guido van Rossum: (I'm thinking that the original attack is trivial once the set of 65000 colliding keys is public knowledge, which must be only a matter of time. I think it's very likely that this will happen soon. For ASP and PHP there is attack-payload

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Guido van Rossum
On Fri, Jan 20, 2012 at 2:33 PM, Ben Wolfson wolf...@gmail.com wrote: On Fri, Jan 20, 2012 at 2:11 PM, Terry Reedy tjre...@udel.edu wrote: On 1/20/2012 2:51 PM, Donald Stufft wrote: I think the counting collision is at best a bandaid and not a proper fix stemmed from a desire to not break

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Guido van Rossum
On Fri, Jan 20, 2012 at 2:35 PM, Frank Sievertsen py...@sievertsen.de wrote: Am 20.01.2012 16:33, schrieb Guido van Rossum: (I'm thinking that the original attack is trivial once the set of 65000 colliding keys is public knowledge, which must be only a matter of time. I think it's very

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Steven D'Aprano
Guido van Rossum wrote: On Fri, Jan 20, 2012 at 11:51 AM, Donald Stufft donald.stu...@gmail.comwrote: On Friday, January 20, 2012 at 2:36 PM, Tres Seaver wrote: On 01/20/2012 02:04 PM, Donald Stufft wrote: Even if a MemoryException is raised I believe that is still a fundamental change in

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Guido van Rossum
It should derive from BaseException. --Guido van Rossum (sent from Android phone) On Jan 20, 2012 4:59 PM, Steven Dapos;Aprano st...@pearwood.info wrote: Guido van Rossum wrote: On Fri, Jan 20, 2012 at 11:51 AM, Donald Stufft donald.stu...@gmail.com **wrote: On Friday, January 20, 2012

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Steven D'Aprano
Terry Reedy wrote: On 1/20/2012 11:17 AM, Victor Stinner wrote: There is no perfect solutions, drawbacks of each solution should be compared. Amen. One possible attack that has been described for a collision counting dict depends on knowing precisely the trigger point. So let

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Steven D'Aprano
Guido van Rossum wrote: It should derive from BaseException. RuntimeError meets that requirement, and it is an existing exception so there are no issues with introducing a new built-in exception to a point release. py issubclass(RuntimeError, BaseException) True -- Steven

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Benjamin Peterson
2012/1/20 Steven D'Aprano st...@pearwood.info: Guido van Rossum wrote: It should derive from BaseException. RuntimeError meets that requirement, and it is an existing exception so there are no issues with introducing a new built-in exception to a point release. py

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Guido van Rossum
On Fri, Jan 20, 2012 at 6:33 PM, Steven D'Aprano st...@pearwood.info wrote: Guido van Rossum wrote: It should derive from BaseException. RuntimeError meets that requirement, and it is an existing exception so there are no issues with introducing a new built-in exception to a point release.

[Python-Dev] Counting collisions for the win

2012-01-19 Thread Victor Stinner
Hi, I'm working on the hash collision issue since 2 or 3 weeks. I evaluated all solutions and I think that I have now a good knowledge of the problem and how it should be solved. The major issue is to have a minor or no impact on applications (don't break backward compatibility). I saw three

Re: [Python-Dev] Counting collisions for the win

2012-01-19 Thread Guido van Rossum
On Thu, Jan 19, 2012 at 4:48 PM, Victor Stinner victor.stin...@haypocalc.com wrote: Hi, I'm working on the hash collision issue since 2 or 3 weeks. I evaluated all solutions and I think that I have now a good knowledge of the problem and how it should be solved. The major issue is to have

Re: [Python-Dev] Counting collisions for the win

2012-01-19 Thread Ivan Kozik
On Fri, Jan 20, 2012 at 00:48, Victor Stinner victor.stin...@haypocalc.com wrote: I propose to solve the hash collision vulnerability by counting collisions because it does fix the vulnerability with a minor or no impact on applications or backward compatibility. I don't see why we should use

Re: [Python-Dev] Counting collisions for the win

2012-01-19 Thread Guido van Rossum
On Thu, Jan 19, 2012 at 7:32 PM, Ivan Kozik i...@ludios.org wrote: On Fri, Jan 20, 2012 at 00:48, Victor Stinner victor.stin...@haypocalc.com wrote: I propose to solve the hash collision vulnerability by counting collisions because it does fix the vulnerability with a minor or no impact

Re: [Python-Dev] Counting collisions for the win

2012-01-19 Thread Steven D'Aprano
Victor Stinner wrote: The last solution is very simple: count collision and raise an exception if it hits a limit. ... According to my basic tests, a limit of 35 collisions requires a dictionary with more than 10,000,000 integer keys to raise an error. I am not talking about the attack, but

Re: [Python-Dev] Counting collisions for the win

2012-01-19 Thread Ivan Kozik
On Fri, Jan 20, 2012 at 03:48, Guido van Rossum gu...@python.org wrote: I think that's because your collision-counting algorithm was much more primitive than MAL's. Conceded. This, combined with the second problem (needing to catch an exception), led me to abandon this approach and write

Re: [Python-Dev] Counting collisions for the win

2012-01-19 Thread Carl Meyer
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi Victor, On 01/19/2012 05:48 PM, Victor Stinner wrote: [snip] Using a randomized hash may also break (indirectly) real applications because the application output is also somehow randomized. For example, in the Django test suite, the HTML

Re: [Python-Dev] Counting collisions for the win

2012-01-19 Thread Nick Coghlan
On Fri, Jan 20, 2012 at 2:00 PM, Steven D'Aprano st...@pearwood.info wrote: With a limit of 35 collisions, it only takes 35 keys to to force a dict to raise an exception, if you are an attacker able to select colliding keys. We're trying to defend against an attacker who is able to force

Re: [Python-Dev] Counting collisions for the win

2012-01-19 Thread Nick Coghlan
On Fri, Jan 20, 2012 at 2:54 PM, Carl Meyer c...@oddbird.net wrote: I don't have the expertise to speak otherwise to the alternatives for fixing the collisions vulnerability, but I don't believe it's accurate to presume that Django would not want to fix a dict-ordering dependency, and use that

Re: [Python-Dev] Counting collisions for the win

2012-01-19 Thread Glenn Linderman
On 1/19/2012 8:54 PM, Carl Meyer wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi Victor, On 01/19/2012 05:48 PM, Victor Stinner wrote: [snip] Using a randomized hash may also break (indirectly) real applications because the application output is also somehow randomized. For example,