2012/1/18 Martin v. Löwis mar...@v.loewis.de:
For 3.3 onwards, I'm skeptical whether all this configuration support is
really necessary. I think a much smaller patch which leaves no choice
would be more appropriate.
The configuration helps unit testing: see changes on Lib/test/*.py in
my last
On 01/17/2012 09:29 PM, Martin v. Löwis wrote:
I(0) = H MASK
PERTURB(0) = H
I(n+1) = (5*I(n) + 1 + PERTURB(n)) MASK
PERTURN(n+1) = PERTURB(n) 5
So if two objects O1 and O2 have the same hash value H, the sequence of
probed indices is the same for any MASK value. It will be a
It doesn't change anything, you will still get collisions.
That depends right? If the collision is because they all have the same
hash(), yes. It might be different if it is because the secondary hashing
(or whatever it's called :-) causes collisions.
But Python deals with the latter case
mar...@v.loewis.de writes:
It doesn't change anything, you will still get collisions.
That depends right? If the collision is because they all have the same
hash(), yes. It might be different if it is because the secondary hashing
(or whatever it's called :-) causes collisions.
I thought that the original problem was that with N insertions in the
dictionary, by repeatedly inserting different keys generating the same
hash value an attacker could arrange that the cost of finding an open
slot is O(N), and thus the cost of N insertions is O(N^2).
If so, frequent
I finished my patch transforming hash(str) to a randomized hash
function, see random-8.patch attached to the issue:
http://bugs.python.org/issue13703
The remaining question is which random number generator should be used
on Windows to initialize the hash secret (CryptoGen adds an overhead
of 10%,
Victor Stinner wrote:
If hash(str1)DICT_MASK == hash(str2)DICT_MASK but
hash(str1)!=hash(str2), strings are not compared (this is a common
optimization in Python), and the so the attack would not be successful
(it would be slow, but not as slow as comparing two strings).
It's a shame the
Jeremy Sanders wrote:
Victor Stinner wrote:
If hash(str1)DICT_MASK == hash(str2)DICT_MASK but
hash(str1)!=hash(str2), strings are not compared (this is a common
optimization in Python), and the so the attack would not be successful
(it would be slow, but not as slow as comparing two
I thought that the original problem was that with N insertions in the
dictionary, by repeatedly inserting different keys generating the same
hash value an attacker could arrange that the cost of finding an open
slot is O(N), and thus the cost of N insertions is O(N^2).
If so, frequent
I plan to commit my fix to Python 3.3 if it is accepted. Then write a
simplified version to Python 3.2 and backport it to 3.1.
I'm opposed to any change to the hash values of strings in maintenance
releases, so I guess I'm opposed to your patch in principle.
See my next message for an
I plan to commit my fix to Python 3.3 if it is accepted. Then write a
simplified version to Python 3.2 and backport it to 3.1.
I'm opposed to any change to the hash values of strings in maintenance
releases, so I guess I'm opposed to your patch in principle.
If randomized hash cannot be
If randomized hash cannot be turned on by default, an alternative is
to switch them off by default, and add an option (command line option,
environment variable, etc.) to enable it.
That won't really fix the problem. If people install a new release because
it fixes a vulnerability, it better
On Tue, Jan 17, 2012 at 12:52 PM, Martin v. Löwis mar...@v.loewis.dewrote:
I plan to commit my fix to Python 3.3 if it is accepted. Then write a
simplified version to Python 3.2 and backport it to 3.1.
I'm opposed to any change to the hash values of strings in maintenance
releases, so I
Am 18.01.2012 07:06, schrieb Gregory P. Smith:
On Tue, Jan 17, 2012 at 12:52 PM, Martin v. Löwis mar...@v.loewis.de
mailto:mar...@v.loewis.de wrote:
I plan to commit my fix to Python 3.3 if it is accepted. Then write a
simplified version to Python 3.2 and backport it to 3.1.
On Sun, Jan 15, 2012 at 9:44 AM, Guido van Rossum gu...@python.org wrote:
On Sun, Jan 15, 2012 at 8:46 AM, Stefan Behnel stefan...@behnel.dewrote:
Guido van Rossum, 15.01.2012 17:10:
On Sun, Jan 15, 2012 at 6:30 AM, Stefan Behnel wrote:
Terry Reedy, 14.01.2012 06:43:
On 1/13/2012 8:58
As I understand it, the way the attack works is that a *single*
malicious request from the attacker can DoS the server by eating CPU
resources while evaluating a massive collision chain induced in a dict
by attacker supplied data. Explicitly truncating the collision chain
boots them out
On 17 January 2012 09:23, Paul McMillan p...@mcmillan.ws wrote:
This is why the simply throw an error solution isn't a complete fix.
Making portions of an interface unusable for regular users is clearly
a bad thing, and is clearly applicable to other types of poisoned data
as well. We need to
On 17 January 2012 10:14, Tim Delaney timothy.c.dela...@gmail.com wrote:
On 17 January 2012 09:23, Paul McMillan p...@mcmillan.ws wrote:
This is why the simply throw an error solution isn't a complete fix.
Making portions of an interface unusable for regular users is clearly
a bad thing, and
2012/1/17 Tim Delaney timothy.c.dela...@gmail.com:
What if in a pathological collision (e.g. 1000 collisions), we increased
the size of a dict by a small but random amount?
It doesn't change anything, you will still get collisions.
Victor
___
On Mon, Jan 16, 2012 at 4:16 PM, Victor Stinner
victor.stin...@haypocalc.com wrote:
2012/1/17 Tim Delaney timothy.c.dela...@gmail.com:
What if in a pathological collision (e.g. 1000 collisions), we
increased
the size of a dict by a small but random amount?
It doesn't change anything,
Terry Reedy, 14.01.2012 06:43:
On 1/13/2012 8:58 PM, Gregory P. Smith wrote:
It is perfectly okay to break existing users who had anything depending
on ordering of internal hash tables. Their code was already broken.
Given that the doc says Return the hash value of the object, I do not
On Sun, Jan 15, 2012 at 6:30 AM, Stefan Behnel stefan...@behnel.de wrote:
Terry Reedy, 14.01.2012 06:43:
On 1/13/2012 8:58 PM, Gregory P. Smith wrote:
It is perfectly okay to break existing users who had anything depending
on ordering of internal hash tables. Their code was already
Guido van Rossum, 15.01.2012 17:10:
On Sun, Jan 15, 2012 at 6:30 AM, Stefan Behnel wrote:
Terry Reedy, 14.01.2012 06:43:
On 1/13/2012 8:58 PM, Gregory P. Smith wrote:
It is perfectly okay to break existing users who had anything depending
on ordering of internal hash tables. Their code was
On Sun, Jan 15, 2012 at 8:46 AM, Stefan Behnel stefan...@behnel.de wrote:
It also seems to me that the wording has a hash value which never changes
during its lifetime makes it pretty clear that the lifetime of the hash
value is not guaranteed to supersede the lifetime of the object (although
On Sun, 15 Jan 2012 17:46:36 +0100
Stefan Behnel stefan...@behnel.de wrote:
Guido van Rossum, 15.01.2012 17:10:
On Sun, Jan 15, 2012 at 6:30 AM, Stefan Behnel wrote:
Terry Reedy, 14.01.2012 06:43:
On 1/13/2012 8:58 PM, Gregory P. Smith wrote:
It is perfectly okay to break existing users
On Sun, Jan 15, 2012 at 8:46 AM, Stefan Behnel stefan...@behnel.de wrote:
Guido van Rossum, 15.01.2012 17:10:
On Sun, Jan 15, 2012 at 6:30 AM, Stefan Behnel wrote:
Terry Reedy, 14.01.2012 06:43:
On 1/13/2012 8:58 PM, Gregory P. Smith wrote:
It is perfectly okay to break existing users
Jack Diederich writes:
On Thu, Jan 12, 2012 at 9:57 PM, Guido van Rossum gu...@python.org wrote:
Hm... I started out as a big fan of the randomized hash, but thinking more
about it, I actually believe that the chances of some legitimate app having
1000 collisions are way smaller than the
On Sat, 14 Jan 2012 04:45:57 +0100
mar...@v.loewis.de wrote:
What an implementation looks like:
http://pastebin.com/9ydETTag
some stuff to be filled in, but this is all that is really required.
I think this statement (and the patch) is wrong. You also need to change
the byte string
On Sat, 14 Jan 2012 13:55:22 +1100
Steven D'Aprano st...@pearwood.info wrote:
On 14/01/12 12:58, Gregory P. Smith wrote:
I do like *randomly seeding the hash*. *+1*. This is easy. It can easily be
back ported to any Python version.
It is perfectly okay to break existing users who had
I think this statement (and the patch) is wrong. You also need to change
the byte string hashing, at least for 2.x. This I consider the biggest
flaw in that approach - other people may have written string-like objects
which continue to compare equal to a string but now hash different.
They're
Am 13.01.2012 18:08, schrieb Mark Dickinson:
On Fri, Jan 13, 2012 at 2:57 AM, Guido van Rossum gu...@python.org wrote:
How
pathological the data needs to be before the collision counter triggers? I'd
expect *very* pathological.
How pathological do you consider the set
{1 n for n in
Am 14.01.2012 01:37, schrieb Benjamin Peterson:
2012/1/13 Guido van Rossum gu...@python.org:
Really? Even though you came up with specifically to prove me wrong?
Coming up with a counterexample now invalidates it?
There are two concerns here:
- is it possible to come up with an example of
My patch example does change the bytes object hash as well as Unicode.
On Jan 13, 2012 7:46 PM, mar...@v.loewis.de wrote:
What an implementation looks like:
http://pastebin.com/9ydETTag
some stuff to be filled in, but this is all that is really required.
I think this statement (and the
FWIW the quick change i pastebin'ed is basically covered by the change
already under review in http://bugs.python.org/review/13704/show. I've
made my comments and suggestions there.
I looked into Modules/expat/xmlparse.c and it has an odd copy of the old
string hash algorithm entirely for its
Victor Stinner wrote:
- Marc Andre Lemburg proposes to fix the vulnerability directly in
dict (for any key type). The patch raises an exception if a lookup
causes more than 1000 collisions.
Am I missing something? How does this fix the vulnerability? It seems to me
that the only thing this
Guido van Rossum wrote:
On Fri, Jan 13, 2012 at 5:58 PM, Gregory P. Smith g...@krypto.org wrote:
It is perfectly okay to break existing users who had anything depending on
ordering of internal hash tables. Their code was already broken. We
*will*provide a flag and/or environment variable that
On Sun, Jan 15, 2012 at 2:42 PM, Steven D'Aprano st...@pearwood.info wrote:
Victor Stinner wrote:
- Marc Andre Lemburg proposes to fix the vulnerability directly in
dict (for any key type). The patch raises an exception if a lookup
causes more than 1000 collisions.
Am I missing something?
Am 13.01.2012 02:24, schrieb Victor Stinner:
My patch doesn't fix the DoS, it just make the attack more complex.
The attacker cannot pregenerate data for an attack: (s)he has first to
compute the hash secret, and then compute hash collisions using the
secret. The hash secret is a least 64 bits
Unfortunately it requires only a few seconds to compute enough 32bit
collisions on one core with no precomputed data.
Are you running the hash function backward to generate strings with
the same value, or you are more trying something like brute forcing?
And how do you get the hash secret? You
On Fri, Jan 13, 2012 at 02:24, Victor Stinner
victor.stin...@haypocalc.com wrote:
- Glenn Linderman proposes to fix the vulnerability by adding a new
safe dict type (only accepting string keys). His proof-of-concept
(SafeDict.py) uses a secret of 64 random bits and uses it to compute
the hash
Unfortunately it requires only a few seconds to compute enough 32bit
collisions on one core with no precomputed data.
Are you running the hash function backward to generate strings with
the same value, or you are more trying something like brute forcing?
If you try it brute force to hit a
On 2012-01-13 11:20, Lennart Regebro wrote:
The vulnerability is basically only in the dictionary you keep the
form data you get from a request.
I'd have to disagree with this statement. The vulnerability is anywhere
that creates a dictionary (or set) from attacker-provided keys. That
would
On Fri, Jan 13, 2012 at 2:57 AM, Guido van Rossum gu...@python.org wrote:
How
pathological the data needs to be before the collision counter triggers? I'd
expect *very* pathological.
How pathological do you consider the set
{1 n for n in range(2000)}
to be? What about the set:
On Fri, Jan 13, 2012 at 9:08 AM, Mark Dickinson dicki...@gmail.com wrote:
On Fri, Jan 13, 2012 at 2:57 AM, Guido van Rossum gu...@python.org
wrote:
How
pathological the data needs to be before the collision counter triggers?
I'd
expect *very* pathological.
How pathological do you
On Fri, Jan 13, 2012 at 5:43 PM, Guido van Rossum gu...@python.org wrote:
How pathological do you consider the set
{1 n for n in range(2000)}
to be? What about the set:
ieee754_powers_of_two = {2.0**n for n in range(-1074, 1024)}
? The 2000 elements of the latter set have only 61
On Fri, Jan 13, 2012 at 10:13 AM, Mark Dickinson dicki...@gmail.com wrote:
On Fri, Jan 13, 2012 at 5:43 PM, Guido van Rossum gu...@python.org
wrote:
How pathological do you consider the set
{1 n for n in range(2000)}
to be? What about the set:
ieee754_powers_of_two = {2.0**n
2012/1/13 Guido van Rossum gu...@python.org:
Really? Even though you came up with specifically to prove me wrong?
Coming up with a counterexample now invalidates it?
--
Regards,
Benjamin
___
Python-Dev mailing list
Python-Dev@python.org
On Thu, 12 Jan 2012 18:57:42 -0800
Guido van Rossum gu...@python.org wrote:
Hm... I started out as a big fan of the randomized hash, but thinking more
about it, I actually believe that the chances of some legitimate app having
1000 collisions are way smaller than the chances that somebody's
- Glenn Linderman proposes to fix the vulnerability by adding a new
safe dict type (only accepting string keys). His proof-of-concept
(SafeDict.py) uses a secret of 64 random bits and uses it to compute
the hash of a key.
We could mix Marc's collision counter with SafeDict idea (being able
to
On Fri, Jan 13, 2012 at 5:17 PM, Antoine Pitrou solip...@pitrou.net wrote:
On Thu, 12 Jan 2012 18:57:42 -0800
Guido van Rossum gu...@python.org wrote:
Hm... I started out as a big fan of the randomized hash, but thinking
more
about it, I actually believe that the chances of some legitimate
On Fri, Jan 13, 2012 at 5:38 PM, Guido van Rossum gu...@python.org wrote:
On Fri, Jan 13, 2012 at 5:17 PM, Antoine Pitrou solip...@pitrou.netwrote:
On Thu, 12 Jan 2012 18:57:42 -0800
Guido van Rossum gu...@python.org wrote:
Hm... I started out as a big fan of the randomized hash, but
On 1/13/2012 5:35 PM, Victor Stinner wrote:
- Glenn Linderman proposes to fix the vulnerability by adding a new
safe dict type (only accepting string keys). His proof-of-concept
(SafeDict.py) uses a secret of 64 random bits and uses it to compute
the hash of a key.
We could mix Marc's collision
Clearly these ideas are more complex than adding randomization, but adding
randomization doesn't seem to be produce immunity from attack, when data
about the randomness is leaked.
Which will not normally happen.
I'm firmly in the camp that believes the random seed can be probed and
btw, Tim's commit message on this one is amusingly relevant. :)
http://hg.python.org/cpython/diff/8d2f2cb9/Objects/dictobject.c
On Fri, Jan 13, 2012 at 6:25 PM, Gregory P. Smith g...@krypto.org wrote:
Clearly these ideas are more complex than adding randomization, but
adding
On 14/01/12 12:58, Gregory P. Smith wrote:
I do like *randomly seeding the hash*. *+1*. This is easy. It can easily be
back ported to any Python version.
It is perfectly okay to break existing users who had anything depending on
ordering of internal hash tables. Their code was already broken.
On Fri, Jan 13, 2012 at 5:58 PM, Gregory P. Smith g...@krypto.org wrote:
On Fri, Jan 13, 2012 at 5:38 PM, Guido van Rossum gu...@python.orgwrote:
On Fri, Jan 13, 2012 at 5:17 PM, Antoine Pitrou solip...@pitrou.netwrote:
On Thu, 12 Jan 2012 18:57:42 -0800
Guido van Rossum gu...@python.org
What an implementation looks like:
http://pastebin.com/9ydETTag
some stuff to be filled in, but this is all that is really required.
I think this statement (and the patch) is wrong. You also need to change
the byte string hashing, at least for 2.x. This I consider the biggest
flaw in that
On Fri, Jan 13, 2012 at 5:58 PM, Gregory P. Smith g...@krypto.org wrote:
It is perfectly okay to break existing users who had anything depending on
ordering of internal hash tables. Their code was already broken. We
*will*provide a flag and/or environment variable that can be set to turn the
On 1/13/2012 8:58 PM, Gregory P. Smith wrote:
It is perfectly okay to break existing users who had anything depending
on ordering of internal hash tables. Their code was already broken.
Given that the doc says Return the hash value of the object, I do not
think we should be so hard-nosed.
On Thu, Jan 12, 2012 at 9:57 PM, Guido van Rossum gu...@python.org wrote:
Hm... I started out as a big fan of the randomized hash, but thinking more
about it, I actually believe that the chances of some legitimate app having
1000 collisions are way smaller than the chances that somebody's code
On Sat, Jan 14, 2012 at 4:24 PM, Jack Diederich jackd...@gmail.com wrote:
This is depending on how the counting is done (I didn't look at MAL's
patch), and assuming that increasing the hash table size will generally
reduce collisions if items collide but their hashes are different.
The patch
Many people proposed their own idea to fix the vulnerability, but only
3 wrote a patch:
- Glenn Linderman proposes to fix the vulnerability by adding a new
safe dict type (only accepting string keys). His proof-of-concept
(SafeDict.py) uses a secret of 64 random bits and uses it to compute
the
Hm... I started out as a big fan of the randomized hash, but thinking more
about it, I actually believe that the chances of some legitimate app having
1000 collisions are way smaller than the chances that somebody's code will
break due to the variable hashing. In fact we know for a fact that the
63 matches
Mail list logo