(Well, technically, you could use trees or some other O(log n) data
structure as a fallback once you have too many collisions, for some value
of too many. Seems a bit wasteful for the purpose, though.)
I don't think that would be wasteful. You wouldn't just use the tree for
the case of too many
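The tree-fallback aside can be sketched in pure Python (everything here is illustrative: the class, the threshold, and using a sorted array as a stand-in for a balanced tree are my choices, not anything proposed in the thread). A bucket scans linearly until it holds "too many" entries, then switches to O(log n) lookups:

```python
import bisect

TREEIFY_THRESHOLD = 8  # one possible value of "too many"

class FallbackBucket:
    """One hash bucket: linear scan until it grows past the threshold,
    then a sorted key array with O(log n) lookups (a stand-in for a
    balanced tree; assumes keys are mutually orderable)."""

    def __init__(self):
        self._keys = []   # kept sorted once treeified
        self._vals = []
        self._treeified = False

    def put(self, key, value):
        if not self._treeified:
            if key in self._keys:
                self._vals[self._keys.index(key)] = value
                return
            self._keys.append(key)
            self._vals.append(value)
            if len(self._keys) > TREEIFY_THRESHOLD:
                # Too many collisions: sort once, switch to bisect mode.
                order = sorted(range(len(self._keys)),
                               key=self._keys.__getitem__)
                self._keys = [self._keys[i] for i in order]
                self._vals = [self._vals[i] for i in order]
                self._treeified = True
            return
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._vals[i] = value
        else:
            self._keys.insert(i, key)
            self._vals.insert(i, value)

    def get(self, key):
        if self._treeified:
            i = bisect.bisect_left(self._keys, key)
            if i < len(self._keys) and self._keys[i] == key:
                return self._vals[i]
            raise KeyError(key)
        try:
            return self._vals[self._keys.index(key)]
        except ValueError:
            raise KeyError(key)
```

An attacker who forces every key into one bucket then gets O(log n) lookups instead of O(n), which is the point of the fallback.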
On Sat, 31 Dec 2011 16:56:00 -0700
Guido van Rossum gu...@python.org wrote:
ISTM the only reasonable thing is to have a random seed picked very early
in the process, to be used to change the hash() function of
str/bytes/unicode (in a way that they are still compatible with each other).
Do str
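The seed idea can be sketched in pure Python (illustrative only; the function name and 64-bit mask are my choices, not part of the proposal). It uses the classic CPython string hash, `x = (1000003*x) ^ c`, with a random seed picked once at startup and mixed in from the beginning:

```python
import os

# Hypothetical: seed picked once, very early in process startup.
_HASH_SEED = int.from_bytes(os.urandom(8), "little")

def salted_str_hash(s: str) -> int:
    """Variant of the classic CPython string hash, with the random
    seed folded into the initial value rather than tacked on at the end."""
    if not s:
        return 0
    x = _HASH_SEED ^ (ord(s[0]) << 7)
    for ch in s:
        x = ((1000003 * x) ^ ord(ch)) & 0xFFFFFFFFFFFFFFFF
    x ^= len(s)
    return x
```

Keeping str/bytes/unicode compatible with each other, as Guido notes, would mean feeding the same code-point sequence through the same salted function for all three types.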
On Sat, Dec 31, 2011 at 9:11 PM, Antoine Pitrou solip...@pitrou.net wrote:
On Sat, 31 Dec 2011 16:56:00 -0700
Guido van Rossum gu...@python.org wrote:
ISTM the only reasonable thing is to have a random seed picked very early
in the process, to be used to change the hash() function of
On Sat, Dec 31, 2011 at 8:29 PM, Paul McMillan p...@mcmillan.ws wrote:
I'm not too concerned about a 3rd party being able to guess the random
seed
-- this would require much more effort on their part, since they would
have
to generate a new set of colliding keys each time they think they
Still, it would represent an effort for the attacker of a much greater
magnitude than the current attack. It's all a trade-off -- at some point
it'll just be easier for the attacker to use some other vulnerability. Also
the class of vulnerable servers would be greatly reduced.
I agree that
In http://mail.python.org/pipermail/python-dev/2011-December/115138.html,
Christian Heimes
pointed out that
... we don't have to alter the outcome of hash ... We just need to reduce the
chance that
an attacker can produce collisions in the dict (and set?)
I'll state it more strongly. hash
Le 29/12/2011 02:28, Michael Foord a écrit :
A paper (well, presentation) has been published highlighting security problems
with the hashing algorithm (exploiting collisions) in many programming
languages Python included:
Jim Jewett wrote:
My personal opinion is that accepting *and parsing* enough data for
this to be a problem
is enough of an edge case that I don't want normal dicts slowed down
at all for this; I would
therefore prefer that the change be restricted to such a compile-time
switch, with current
Le 29/12/2011 14:19, Christian Heimes a écrit :
Perhaps the dict code is a better place for randomization.
The problem is the creation of a dict with keys all having the same hash
value. The current implementation of dict uses a linked-list. Adding a
new item requires comparing the new key
In case the watchdog is not a viable solution as I had assumed it was, I
think it's more reasonable to indeed consider adding a flag to Python
that allows randomization of hashes optionally before startup.
A flag will only be needed if the overhead of the fix is too high.
However as it was
Am 31.12.2011 03:31, schrieb Victor Stinner:
Le 29/12/2011 14:19, Christian Heimes a écrit :
Perhaps the dict code is a better place for randomization.
The problem is the creation of a dict with keys all having the same hash
value. The current implementation of dict uses a linked-list.
Am 31.12.2011 03:19, schrieb Steven D'Aprano:
How about using a similar strategy to the current dict behaviour with
__missing__ and defaultdict? Here's my suggestion:
- If a dict subclass defines __salt__, then it is called to salt the hash
value before lookups. If __salt__ is
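Pure Python can't hook the C-level lookup, but the `__salt__` suggestion can be sketched with a subclass that salts keys before they reach the underlying dict (all names below except `__salt__` are hypothetical, and a real implementation would live in the C lookup path):

```python
class _SaltedKey:
    """Wraps a key so its hash is salted before the dict sees it."""
    __slots__ = ("key", "salt")

    def __init__(self, key, salt):
        self.key = key
        self.salt = salt

    def __hash__(self):
        return hash((self.salt, self.key))

    def __eq__(self, other):
        return isinstance(other, _SaltedKey) and self.key == other.key


class SaltedDict(dict):
    """Sketch of the proposal: subclasses define __salt__()."""

    def __salt__(self):
        return 0x9E3779B9  # hypothetical per-class salt

    def __setitem__(self, key, value):
        super().__setitem__(_SaltedKey(key, self.__salt__()), value)

    def __getitem__(self, key):
        return super().__getitem__(_SaltedKey(key, self.__salt__()))
```

Only `__setitem__`/`__getitem__` are wrapped here; a full version would also cover `__contains__`, `get`, deletion, and iteration.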
On 12/30/2011 8:04 PM, Jim Jewett wrote:
I'll state it more strongly. hash probably should not change (at
least for this),
I agree, especially since the vulnerability can be avoided by using 64
bit servers and will generally abate as more switch anyway.
but we may
want to consider a
On Thu, Dec 29, 2011 at 8:19 AM, Christian Heimes li...@cheimes.de wrote:
Persistence layers like ZODB and cross interpreter communication
channels used by multiprocessing may (!) rely on the fact that the hash
of a string is fixed.
ZODB does not rely on a fixed hash function for strings; for
Hi,
Just some extra thoughts about the whole topic in the light of web
applications (since this was hinted in the talk) running on Python:
Yes, you can limit the number of maximum allowed parameters for post
data but really there are so many places where data is parsed into
hashing
Hi,
Something I should add to this now that I thought about it a bit more:
Assuming this should be fixed on a language level the solution would
probably be to salt hashes. The most common hash to salt here is the
PyUnicode hash for obvious reasons.
- Option a: Compiled in Salt
+ Easy to
On 12/28/2011 9:09 PM, Raymond Hettinger wrote:
Also, randomizing the hash wreaks havoc on doctests, book examples
not matching actual dict reprs, and on efforts by users to optimize
the insertion order into dicts with frequent lookups.
I don't have a strong opinion about what to do about this
Michael Foord wrote:
Hello all,
A paper (well, presentation) has been published highlighting security problems
with the hashing algorithm (exploiting collisions) in many programming
languages Python included:
On Wed, Dec 28, 2011 at 8:49 PM, Eric Snow ericsnowcurren...@gmail.com wrote:
On Wed, Dec 28, 2011 at 6:28 PM, Michael Foord
fuzzy...@voidspace.org.uk wrote:
Hello all,
A paper (well, presentation) has been published highlighting security
problems with the hashing algorithm (exploiting
On Thu, 29 Dec 2011 03:55:22 +0100
Christian Heimes li...@cheimes.de wrote:
I've been dealing with web stuff and security for almost a decade. I've
seen far worse attack vectors. This one can easily be solved with a
couple of lines of Python code. For example Application developers can
limit
On Thu, 29 Dec 2011 04:04:17 +0100
Christian Heimes li...@cheimes.de wrote:
Am 29.12.2011 02:37, schrieb Jesse Noller:
Back up link for the PDF:
http://dl.dropbox.com/u/1374/2007_28C3_Effective_DoS_on_web_application_platforms.pdf
Ocert disclosure:
Raymond Hettinger wrote:
FWIW, Uncle Timmy considers the non-randomized hashes to be a virtue.
It is believed that they give us better-than-random results for commonly
encountered datasets. A change to randomized hashes would have a
negative performance impact on those cases.
Tim Peters'
On Thu, 29 Dec 2011 14:32:21 +0100
Christian Heimes li...@cheimes.de wrote:
Am 29.12.2011 13:57, schrieb Armin Ronacher:
Hi,
Something I should add to this now that I thought about it a bit more:
Assuming this should be fixed on a language level the solution would
probably be to
Hi,
how about
Option d: Host based salt
+ Easy-ish to implement – how about basing it on the hostname for example?
+ transparent for all processes on the same host
- probably unit test breakage
In fact, we could use host based as default with the option to specify own
which would solve
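A host-based salt could be derived like this (a sketch under the assumption that hashing the lowercased canonical hostname is an acceptable way to get every process on a machine to agree; function name is mine):

```python
import hashlib
import socket

def host_salt() -> int:
    """Same value for every process on this host, different across
    hosts. Canonicalize the name so all processes agree on the input."""
    name = socket.getfqdn().lower().encode("utf-8")
    return int.from_bytes(hashlib.sha256(name).digest()[:8], "little")
```

This keeps the salt stable across restarts, which helps the synchronization concern but also gives an attacker more time to probe it.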
On Thu, 29 Dec 2011 11:25:03 +
Mark Shannon m...@hotpy.org wrote:
Also, randomizing the hash wreaks havoc on doctests, book examples
not matching actual dict reprs, and on efforts by users to optimize
the insertion order into dicts with frequent lookups.
The docs clearly state that
Am 29.12.2011 12:13, schrieb Mark Shannon:
The attack relies on being able to predict the hash value for a given
string. Randomising the string hash function is quite straightforward.
There is no need to change the dictionary code.
A possible (*untested*) patch is attached. I'll leave it for
Am 29.12.2011 12:10, schrieb Antoine Pitrou:
I've been dealing with web stuff and security for almost a decade. I've
seen far worse attack vectors. This one can easily be solved with a
couple of lines of Python code. For example Application developers can
limit the maximum amount of POST
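The "couple of lines of Python" mitigation Christian describes might look like this (a sketch; the function name and the cap of 1000 fields are arbitrary choices of mine):

```python
from urllib.parse import parse_qsl

MAX_FIELDS = 1000  # arbitrary cap on form fields per request

def safe_parse_post(body: str) -> dict:
    """Refuse to build a dict from an unreasonable number of fields,
    so an attacker can't feed us thousands of colliding keys."""
    pairs = parse_qsl(body, keep_blank_values=True)
    if len(pairs) > MAX_FIELDS:
        raise ValueError("too many form fields")
    return dict(pairs)
```

The cap is applied before the dict is constructed, since the dict insertion is where the quadratic blow-up happens.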
Mark Shannon wrote:
Michael Foord wrote:
Hello all,
A paper (well, presentation) has been published highlighting security
problems with the hashing
algorithm (exploiting collisions) in many programming languages Python
included:
On Thu, 29 Dec 2011 13:57:07 +0100
Armin Ronacher armin.ronac...@active-4.com wrote:
- Option c: Random salt at runtime
+ Easy to implement
- impossible to synchronize
- breaks unittests in the same way as a compiled in salt would do
This option would have my preference. I don't think
Am 29.12.2011 11:32, schrieb Antoine Pitrou:
If I remember correctly CPython uses the C long type for its hash, so
64-bit Windows uses a 32-bit hash.
Not anymore, Py_hash_t is currently aligned with Py_ssize_t.
Thanks for the update. Python 2.x still uses long and several large
frameworks
Am 29.12.2011 13:57, schrieb Armin Ronacher:
Hi,
Something I should add to this now that I thought about it a bit more:
Assuming this should be fixed on a language level the solution would
probably be to salt hashes. The most common hash to salt here is the
PyUnicode hash for obvious
+1 to option d (host-based salt) but would need to consistently order the
hostnames/addresses to guarantee that all processes on the same machine got
the same salt by default.
+1 to option c (environment variable) as an override. And/or maybe an
override on the command line.
+1 to implementing
+1 to option c (environment variable) as an override. And/or maybe an
override on the command line.
That obviously should have said option b (environment variable) ...
Tim Delaney
___
Python-Dev mailing list
Python-Dev@python.org
On Thu, Dec 29, 2011 at 8:32 AM, Christian Heimes li...@cheimes.de wrote:
IMHO we don't have to alter the outcome of hash(some string), hash(1)
and all other related types. We just need to reduce the chance that an
attacker can produce collisions in the dict (and set?) code that looks
up the
It's worth pointing out that if the salt is somehow exposed to an
attacker, or is guessable, much of the benefit goes away. It's likely
that a timing attack could be used to discover the salt if it is fixed
per machine or process over a long period of time.
If a salt is generally fixed per
Am 29.12.2011 21:07, schrieb PJ Eby:
I don't understand how that helps a collision attack. If you can still
generate two strings with the same (pre-randomized) hash, what
difference does it make that the dict adds a random number? The
post-randomized number will still be the same, no?
Or
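PJ Eby's objection can be demonstrated with a toy hash (everything here is illustrative, not any real Python hash): if two keys already collide before the salt is applied, adding the salt afterwards preserves the collision.

```python
def toy_hash(s: str) -> int:
    # Deliberately weak stand-in for hash(): sums code points,
    # so any two anagrams collide.
    return sum(map(ord, s)) & 0xFFFF

SALT = 0x5A17  # stand-in for the random per-process value

a, b = "ab", "ba"
assert toy_hash(a) == toy_hash(b)                # pre-randomized collision
assert toy_hash(a) + SALT == toy_hash(b) + SALT  # still collides after salting
```

Equal inputs to `+ SALT` give equal outputs, so salting after the fact cannot separate colliding keys; the salt has to influence the hash computation itself.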
The talk was a presentation yesterday by Alexander Klink and Julian
Wälde at the Chaos Communication Congress in Germany hash...@alech.de
I read the non-technical summary at
http://arstechnica.com/business/news/2011/12/huge-portions-of-web-vulnerable-to-hashing-denial-of-service-attack.ars
On 12/29/2011 4:31 PM, Christian Heimes wrote:
The hash randomization idea adds a salt to throw the attacker off course.
Instead of
position = hash & mask
it's now
hash = salt + hash
As I understood the talk (actually, the bit of Perl interpreter C code
shown), the randomization is to
Am 29.12.2011 23:28, schrieb Terry Reedy:
As I understood the talk (actually, the bit of Perl interpreter C code
shown), the randomization is to change hash(s) to hash(salt+s) so that
the salt is completely mixed into the hash from the beginning, rather
than just tacked on at the end.
Yes,
On 30 December 2011 06:59, Tim Delaney timothy.c.dela...@gmail.com wrote:
+0 to exposing the salt as a constant (3.3+ only) - or alternatively
expose a hash function that just takes an existing hash and returns the
salted hash. That would make it very easy for anything that wanted a salted
Hello all,
A paper (well, presentation) has been published highlighting security problems
with the hashing algorithm (exploiting collisions) in many programming
languages Python included:
On Wednesday, December 28, 2011 at 8:28 PM, Michael Foord wrote:
Hello all,
A paper (well, presentation) has been published highlighting security
problems with the hashing algorithm (exploiting collisions) in many
programming languages Python included:
On Wednesday, December 28, 2011 at 8:37 PM, Jesse Noller wrote:
On Wednesday, December 28, 2011 at 8:28 PM, Michael Foord wrote:
Hello all,
A paper (well, presentation) has been published highlighting security
problems with the hashing algorithm (exploiting collisions) in
On Wed, Dec 28, 2011 at 6:28 PM, Michael Foord
fuzzy...@voidspace.org.uk wrote:
Hello all,
A paper (well, presentation) has been published highlighting security
problems with the hashing algorithm (exploiting collisions) in many
programming languages Python included:
A few thoughts on this:
a) This is not a new issue, I'm curious what the new interest is in it.
b) Whatever the solution to this is, it is *not* CPython specific, any decision
should be reflected in the Python language spec IMO, if CPython has the semantic
that dicts aren't vulnerable to hash
FWIW, Uncle Timmy considers the non-randomized hashes to be a virtue.
It is believed that they give us better-than-random results for commonly
encountered datasets. A change to randomized hashes would have a
negative performance impact on those cases.
Also, randomizing the hash wreaks havoc on
Am 29.12.2011 02:37, schrieb Jesse Noller:
Back up link for the PDF:
http://dl.dropbox.com/u/1374/2007_28C3_Effective_DoS_on_web_application_platforms.pdf
Ocert disclosure:
http://www.ocert.org/advisories/ocert-2011-003.html
From http://www.nruns.com/_downloads/advisory28122011.pdf
---
Am 29.12.2011 03:09, schrieb Raymond Hettinger:
FWIW, Uncle Timmy considers the non-randomized hashes to be a virtue.
It is believed that they give us better-than-random results for commonly
encountered datasets. A change to randomized hashes would have a
negative performance impact on those
On Wed, Dec 28, 2011 at 19:51, Alex Gaynor alex.gay...@gmail.com wrote:
A few thoughts on this:
a) This is not a new issue, I'm curious what the new interest is in it.
Well they (the presenters of the report) had to be accepted to that
conference for *something*, otherwise we wouldn't know