Re: [Python-Dev] Hash collision security issue (now public)

2012-01-09 Thread Stefan Behnel
Jim Jewett, 08.01.2012 23:33: Stefan Behnel wrote: Admittedly, this may require some adaptation for the PEP393 unicode memory layout in order to produce identical hashes for all three representations if they represent the same content. They SHOULD NOT represent the same content; comparing

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-09 Thread Victor Stinner
That said, I don't think smallest-format is actually enforced with anything stronger than comments (such as in unicodeobject.h struct PyASCIIObject) and asserts (mostly calling _PyUnicode_CheckConsistency).  I don't have any insight on how prevalent non-conforming strings will be in practice,

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-08 Thread Brian Curtin
On Sun, Jan 8, 2012 at 16:33, Jim Jewett jimjjew...@gmail.com wrote: In http://mail.python.org/pipermail/python-dev/2012-January/115368.html Stefan Behnel wrote: Can you please configure your mail client to not create new threads like this? As if this topic wasn't already hard enough to follow,

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-07 Thread Stefan Behnel
Christian Heimes, 31.12.2011 04:59: Am 31.12.2011 03:22, schrieb Victor Stinner: The unique structure of CPython's dict implementation makes it harder to get the number of values with equal hash. The academic hash map (the one I learnt about at university) uses a bucket to store all elements

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-07 Thread Christian Heimes
Am 07.01.2012 12:02, schrieb Stefan Behnel: Wouldn't Bob Jenkins' lookup3 hash function fit in here? After all, it's portable, known to provide a very good distribution for different string values and is generally fast on both 32 and 64 bit architectures.

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-07 Thread Terry Reedy
On 1/7/2012 12:57 PM, Christian Heimes wrote: Am 07.01.2012 12:02, schrieb Stefan Behnel: Admittedly, this may require some adaptation for the PEP393 unicode memory layout in order to produce identical hashes for all three representations if they represent the same content. So it's not a

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-06 Thread Mark Shannon
Serhiy Storchaka wrote: 06.01.12 02:10, Nick Coghlan написав(ла): Not a good idea - a lot of the 3rd party tests that depend on dict ordering are going to be using those modules anyway, so scattering our solution across half the standard library is needlessly creating additional work without

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-06 Thread Steven D'Aprano
Glenn Linderman wrote: On 1/5/2012 5:52 PM, Steven D'Aprano wrote: At some point, presuming that there is no speed penalty, the behaviour will surely become not just enabled by default but mandatory. Python has never promised that hashes must be predictable or consistent, so apart from

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-06 Thread Victor Stinner
Using my patch (random-2.patch), the overhead is 0%. I cannot see a difference with and without my patch. Numbers: --- unpatched: == 3 characters == 1 loops, best of 3: 459 usec per loop == 10 characters == 1 loops, best of 3: 575 usec per loop == 500 characters == 1 loops, best of 3: 1.36 msec

[Python-Dev] Hash collision security issue (now public)

2012-01-06 Thread Jim Jewett
In http://mail.python.org/pipermail/python-dev/2012-January/115350.html, Mark Shannon wrote: The minimal proposed change of seeding the hash from a global value (a single memory read and an addition) will have such a minimal performance effect that it will be undetectable even on the most

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-06 Thread Mark Shannon
Hi, It seems to me that half the folk discussing this issue want a super-strong, resist-all-hypothetical-attacks hash with little regard to performance. The other half want no change or a change that will have no observable effect. (I may be exaggerating a little.) Can I propose the

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-06 Thread Paul Moore
On 6 January 2012 20:25, Mark Shannon m...@hotpy.org wrote: Hi, It seems to me that half the folk discussing this issue want a super-strong, resist-all-hypothetical-attacks hash with little regard to performance. The other half want no change or a change that will have no  observable effect.

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-06 Thread Glenn Linderman
On 1/5/2012 4:10 PM, Nick Coghlan wrote: On Fri, Jan 6, 2012 at 8:15 AM, Serhiy Storchakastorch...@gmail.com wrote: 05.01.12 21:14, Glenn Linderman написав(ла): So, fixing the vulnerable packages could be a sufficient response, rather than changing the hash function. How to fix? Each of

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Antoine Pitrou
On Thu, 5 Jan 2012 15:26:27 +1100 Andrew Bennetts and...@bemusement.org wrote: I don't think that's news either. http://mail.python.org/pipermail/python-dev/2003-May/035907.html and http://twistedmatrix.com/pipermail/twisted-python/2003-June/004339.html for instance show that in 2003 it was

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Maciej Fijalkowski
On Thu, Jan 5, 2012 at 3:39 PM, Antoine Pitrou solip...@pitrou.net wrote: On Thu, 5 Jan 2012 15:26:27 +1100 Andrew Bennetts and...@bemusement.org wrote: I don't think that's news either. http://mail.python.org/pipermail/python-dev/2003-May/035907.html and

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Glenn Linderman
On 1/5/2012 9:34 AM, Maciej Fijalkowski wrote: Also consider that new 2.6.x would go as a security fix to old ubuntu, but all other packages won't, because they'll not contain security fixes. Just so you know Why should CPython by constrained by broken policies of Ubuntu? If the other

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Antoine Pitrou
On Thu, 5 Jan 2012 19:34:13 +0200 Maciej Fijalkowski fij...@gmail.com wrote: Just to make things clear - stdlib itself has 1/64 of tests relying on dict order. Changing dict order in *older* pythons will break everyone's tests and some peoples code. Breaking tests is not a problem: they are

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread David Malcolm
On Thu, 2012-01-05 at 19:34 +0200, Maciej Fijalkowski wrote: On Thu, Jan 5, 2012 at 3:39 PM, Antoine Pitrou solip...@pitrou.net wrote: On Thu, 5 Jan 2012 15:26:27 +1100 Andrew Bennetts and...@bemusement.org wrote: I don't think that's news either.

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Tres Seaver
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 01/05/2012 02:14 PM, Glenn Linderman wrote: 1) the security problem is not in CPython, but rather in web servers that use dict inappropriately. Most webapp vulnerabilities are due to their use of Python's cgi module, which it uses a dict to

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Glenn Linderman
On 1/5/2012 11:49 AM, Tres Seaver wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 01/05/2012 02:14 PM, Glenn Linderman wrote: 1) the security problem is not in CPython, but rather in web servers that use dict inappropriately. Most webapp vulnerabilities are due to their use of

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Ethan Furman
Tres Seaver wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 01/05/2012 02:14 PM, Glenn Linderman wrote: 1) the security problem is not in CPython, but rather in web servers that use dict inappropriately. Most webapp vulnerabilities are due to their use of Python's cgi module, which

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Paul Moore
On 5 January 2012 19:33, David Malcolm dmalc...@redhat.com wrote: We have similar issues in RHEL, with the Python versions going much further back (e.g. 2.3) When backporting the fix to ancient python versions, I'm inclined to turn the change *off* by default, requiring the change to be

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Barry Warsaw
On Jan 05, 2012, at 02:33 PM, David Malcolm wrote: We have similar issues in RHEL, with the Python versions going much further back (e.g. 2.3) When backporting the fix to ancient python versions, I'm inclined to turn the change *off* by default, requiring the change to be enabled via an

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Barry Warsaw
On Jan 05, 2012, at 08:35 PM, Paul Moore wrote: Uh, surely no-one is suggesting backporting to ancient versions? I couldn't find the statement quickly on the python.org website (so this is via google), but isn't it true that 2.6 is in security-only mode and 2.5 and earlier will never get the fix?

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Toshio Kuratomi
On Thu, Jan 05, 2012 at 08:35:57PM +, Paul Moore wrote: On 5 January 2012 19:33, David Malcolm dmalc...@redhat.com wrote: We have similar issues in RHEL, with the Python versions going much further back (e.g. 2.3) When backporting the fix to ancient python versions, I'm inclined to

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread David Malcolm
On Thu, 2012-01-05 at 20:35 +, Paul Moore wrote: On 5 January 2012 19:33, David Malcolm dmalc...@redhat.com wrote: We have similar issues in RHEL, with the Python versions going much further back (e.g. 2.3) When backporting the fix to ancient python versions, I'm inclined to turn

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Georg Brandl
On 01/05/2012 09:45 PM, Barry Warsaw wrote: On Jan 05, 2012, at 02:33 PM, David Malcolm wrote: We have similar issues in RHEL, with the Python versions going much further back (e.g. 2.3) When backporting the fix to ancient python versions, I'm inclined to turn the change *off* by default,

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Christian Heimes
Am 05.01.2012 21:45, schrieb Barry Warsaw: This sounds like a reasonable compromise for all stable Python releases. It can be turned on by default for Python 3.3. If you also make the default setting easy to change (i.e. parameterized in one place), then distros can make their own decision

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Christian Heimes
Am 05.01.2012 21:10, schrieb Ethan Furman: Tres Seaver wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 01/05/2012 02:14 PM, Glenn Linderman wrote: 1) the security problem is not in CPython, but rather in web servers that use dict inappropriately. Most webapp vulnerabilities are

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Antoine Pitrou
On Thu, 05 Jan 2012 22:40:58 +0100 Christian Heimes li...@cheimes.de wrote: Am 05.01.2012 21:45, schrieb Barry Warsaw: This sounds like a reasonable compromise for all stable Python releases. It can be turned on by default for Python 3.3. If you also make the default setting easy to

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Christian Heimes
Am 05.01.2012 22:59, schrieb Antoine Pitrou: I don't think we (python-dev) are really concerned with 2.3, 2.4, 2.5 and 3.0. They're all unsupported, and people do what they want with their local source trees. Let me reply with a quote from Barry: Correct, although there's no reason why a

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Serhiy Storchaka
05.01.12 21:14, Glenn Linderman написав(ла): So, fixing the vulnerable packages could be a sufficient response, rather than changing the hash function. How to fix? Each of those above allocates and returns a dict. Simply have each of those allocate and return and wrapped dict, which has the

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Nick Coghlan
On Fri, Jan 6, 2012 at 8:15 AM, Serhiy Storchaka storch...@gmail.com wrote: 05.01.12 21:14, Glenn Linderman написав(ла): So, fixing the vulnerable packages could be a sufficient response, rather than changing the hash function.  How to fix?  Each of those above allocates and returns a dict.  

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Terry Reedy
On 1/5/2012 3:10 PM, Ethan Furman wrote: Tres Seaver wrote: 1) the security problem is not in CPython, but rather in web servers that use dict inappropriately. Most webapp vulnerabilities are due to their use of Python's cgi module, which it uses a dict to hold the form / query string data

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Steven D'Aprano
David Malcolm wrote: When backporting the fix to ancient python versions, I'm inclined to turn the change *off* by default, requiring the change to be enabled via an environment variable: I want to avoid breaking existing code, even if such code is technically relying on non-guaranteed

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Barry Warsaw
On Jan 05, 2012, at 10:40 PM, Christian Heimes wrote: Hey Barry, stop stealing my ideas! :) I've argued for these default settings for days. :) I've suggested the env var PYRANDOMHASH. It's easy to set env vars in Apache. For example Debian/Ubuntu has /etc/apache2/envvars. For consistency, it

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Nick Coghlan
On Fri, Jan 6, 2012 at 10:07 AM, Steven D'Aprano st...@pearwood.info wrote: Surely the way to verify the behaviour is to run this from the shell: python -c print(hash(abcde)) twice, and see that the calls return different values. (Or have I misunderstood the way the fix is going to work?)

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Victor Stinner
2012/1/6 Barry Warsaw ba...@python.org: Settings for PYRANDOMHASH: PYRANDOMHASH=1   enable randomized hashing function PYRANDOMHASH=/path/to/seed   enable randomized hashing function and read seed from 'seed' PYRANDOMHASH=0   disable randomed hashing function Why not PYTHONHASHSEED

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Christian Heimes
Am 06.01.2012 01:34, schrieb Nick Coghlan: On Fri, Jan 6, 2012 at 10:07 AM, Steven D'Aprano st...@pearwood.info wrote: Surely the way to verify the behaviour is to run this from the shell: python -c print(hash(abcde)) twice, and see that the calls return different values. (Or have I

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Benjamin Peterson
2012/1/5 Nick Coghlan ncogh...@gmail.com: On Fri, Jan 6, 2012 at 10:07 AM, Steven D'Aprano st...@pearwood.info wrote: Surely the way to verify the behaviour is to run this from the shell: python -c print(hash(abcde)) twice, and see that the calls return different values. (Or have I

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Antoine Pitrou
On Fri, 06 Jan 2012 01:50:00 +0100 Christian Heimes li...@cheimes.de wrote: Am 06.01.2012 01:34, schrieb Nick Coghlan: On Fri, Jan 6, 2012 at 10:07 AM, Steven D'Aprano st...@pearwood.info wrote: Surely the way to verify the behaviour is to run this from the shell: python -c

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Nick Coghlan
On Fri, Jan 6, 2012 at 10:59 AM, Benjamin Peterson benja...@python.org wrote: What exactly is the disadvantage of a sys attribute? That would seem preferable to an obscure incarnation like that. Adding sys attributes in maintenance (or security) releases makes me nervous. However, Victor and

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Steven D'Aprano
Benjamin Peterson wrote: 2012/1/5 Nick Coghlan ncogh...@gmail.com: On Fri, Jan 6, 2012 at 10:07 AM, Steven D'Aprano st...@pearwood.info wrote: Surely the way to verify the behaviour is to run this from the shell: python -c print(hash(abcde)) twice, and see that the calls return different

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Benjamin Peterson
2012/1/5 Steven D'Aprano st...@pearwood.info: Benjamin Peterson wrote: 2012/1/5 Nick Coghlan ncogh...@gmail.com: On Fri, Jan 6, 2012 at 10:07 AM, Steven D'Aprano st...@pearwood.info wrote: Surely the way to verify the behaviour is to run this from the shell: python -c print(hash(abcde))

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Christian Heimes
Am 06.01.2012 03:04, schrieb Benjamin Peterson: It's obscure because hash('') != 0 doesn't necessarily mean the hashes are randomized. A different hashing algorithm could be in effect. Also in 1 of 2**32 or 2**64 tries hash('') is 0 although randomizing is active. Christian

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Steven D'Aprano
Benjamin Peterson wrote: 2012/1/5 Steven D'Aprano st...@pearwood.info: [...] There's nothing obscure about directly testing the hash. That's about as far from obscure as it is possible to get: you are directly testing the presence of a feature by testing the feature. It's obscure because

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Glenn Linderman
On 1/5/2012 5:52 PM, Steven D'Aprano wrote: At some point, presuming that there is no speed penalty, the behaviour will surely become not just enabled by default but mandatory. Python has never promised that hashes must be predictable or consistent, so apart from backwards compatibility

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Serhiy Storchaka
06.01.12 02:10, Nick Coghlan написав(ла): Not a good idea - a lot of the 3rd party tests that depend on dict ordering are going to be using those modules anyway, so scattering our solution across half the standard library is needlessly creating additional work without really reducing the

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-04 Thread Maciej Fijalkowski
On Wed, Jan 4, 2012 at 12:02 AM, Bill Janssen jans...@parc.com wrote: Christian Heimes li...@cheimes.de wrote: Am 29.12.2011 12:13, schrieb Mark Shannon: The attack relies on being able to predict the hash value for a given string. Randomising the string hash function is quite

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-04 Thread Antoine Pitrou
On Wed, 4 Jan 2012 09:59:15 +0200 Maciej Fijalkowski fij...@gmail.com wrote: Is it *really* a security issue? We knew all along that dicts are O(n^2) in worst case scenario, how is this suddenly a security problem? Because it has been shown to be exploitable for malicious purposes? Regards

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-04 Thread Christian Heimes
Am 04.01.2012 08:59, schrieb Maciej Fijalkowski: Is it *really* a security issue? We knew all along that dicts are O(n^2) in worst case scenario, how is this suddenly a security problem? For example Microsoft has released an extraordinary and unscheduled security patch for the issue between

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-04 Thread Eric Snow
On Wed, Jan 4, 2012 at 12:59 AM, Maciej Fijalkowski fij...@gmail.com wrote: On Wed, Jan 4, 2012 at 12:02 AM, Bill Janssen jans...@parc.com wrote: Christian Heimes li...@cheimes.de wrote: Am 29.12.2011 12:13, schrieb Mark Shannon: The attack relies on being able to predict the hash value for

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-04 Thread Andrew Bennetts
On Wed, Jan 04, 2012 at 11:55:13AM +0100, Antoine Pitrou wrote: On Wed, 4 Jan 2012 09:59:15 +0200 Maciej Fijalkowski fij...@gmail.com wrote: Is it *really* a security issue? We knew all along that dicts are O(n^2) in worst case scenario, how is this suddenly a security problem?

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-03 Thread Barry Warsaw
On Dec 31, 2011, at 04:56 PM, Guido van Rossum wrote: Is there a tracker issue yet? The discussion should probably move there. I think the answer to this was no... until now. http://bugs.python.org/issue13703 Proposed patches should be linked to this issue now. Please nosy yourself if you

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-03 Thread Bill Janssen
Christian Heimes li...@cheimes.de wrote: Am 29.12.2011 12:13, schrieb Mark Shannon: The attack relies on being able to predict the hash value for a given string. Randomising the string hash function is quite straightforward. There is no need to change the dictionary code. A possible

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-03 Thread Terry Reedy
On 1/3/2012 5:02 PM, Bill Janssen wrote: Software that depends on an undefined hash function for synchronization and persistence deserves to break, IMO. There are plenty of well-defined hash functions available for this purpose. The doc for id() now says This is an integer which is

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Terry Reedy
On 1/2/2012 12:55 AM, Paul McMillan wrote: Terry Reedy said: I understood Alexander Klink and Julian Wälde, hash...@alech.de, as saying that they consider that using a random non-zero start value is sufficient to make the hash non-vulnerable. I've been talking to them. They're happy to look

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Antoine Pitrou
On Sun, 1 Jan 2012 21:55:52 -0800 Paul McMillan p...@mcmillan.ws wrote: This is similar to the change proposed by Christian Heimes. Most importantly, I moved the xor with r[x % len_r] down a line. Before, it wasn't being applied to the last character. Shouldn't it be r[i % len(r)] instead?

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Stephen J. Turnbull
Christian Heimes writes: Am 31.12.2011 13:03, schrieb Stephen J. Turnbull: I don't know the implementation issues well enough to claim it is a solution, but this hasn't been mentioned before AFAICS: While the dictionary probe has to start with a hash for backward compatibility

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Christian Heimes
Am 02.01.2012 06:55, schrieb Paul McMillan: I think Ruby uses FNV-1 with a salt, making it less vulnerable to this. FNV is otherwise similar to our existing hash function. For the record, cryptographically strong hash functions are in the neighborhood of 400% slower than our existing hash

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Anders J. Munch
On 1/1/2012 12:28 PM, Christian Heimes wrote: I understood Alexander Klink and Julian Wälde, hash...@alech.de, as saying that they consider that using a random non-zero start value is sufficient to make the hash non-vulnerable. Sufficient against their current attack. But will it last? For

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Christian Heimes
Am 01.01.2012 19:45, schrieb Terry Reedy: On 1/1/2012 10:13 AM, Guido van Rossum wrote: PS. Is the collision-generator used in the attack code open source? As I posted before, Alexander Klink and Julian Wälde gave their project email as hash...@alech.de. Since they indicated disappointment

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Georg Brandl
On 01/02/2012 04:47 PM, Christian Heimes wrote: Am 01.01.2012 19:45, schrieb Terry Reedy: On 1/1/2012 10:13 AM, Guido van Rossum wrote: PS. Is the collision-generator used in the attack code open source? As I posted before, Alexander Klink and Julian Wälde gave their project email as

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Guido van Rossum
On Mon, Jan 2, 2012 at 7:47 AM, Christian Heimes li...@cheimes.de wrote: Am 01.01.2012 19:45, schrieb Terry Reedy: On 1/1/2012 10:13 AM, Guido van Rossum wrote: PS. Is the collision-generator used in the attack code open source? As I posted before, Alexander Klink and Julian Wälde gave

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Alexey Borzenkov
On Mon, Jan 2, 2012 at 7:18 PM, Christian Heimes li...@cheimes.de wrote: Am 02.01.2012 06:55, schrieb Paul McMillan: I think Ruby uses FNV-1 with a salt, making it less vulnerable to this. FNV is otherwise similar to our existing hash function. For the record, cryptographically strong hash

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Alexey Borzenkov
On Mon, Jan 2, 2012 at 10:53 PM, Alexey Borzenkov sna...@gmail.com wrote: On Mon, Jan 2, 2012 at 7:18 PM, Christian Heimes li...@cheimes.de wrote: Am 02.01.2012 06:55, schrieb Paul McMillan: I think Ruby uses FNV-1 with a salt, making it less vulnerable to this. FNV is otherwise similar to our

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Barry Warsaw
On Jan 02, 2012, at 06:38 PM, Georg Brandl wrote: I wouldn't expect too much -- they seem rather keen on cheap laughs: http://twitter.com/#!/bk3n/status/152068096448921600/photo/1/large Heh, so yeah, it won't be me contacting them. -Barry ___

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Guido van Rossum
On Sat, Dec 31, 2011 at 10:57 PM, Paul McMillan p...@mcmillan.ws wrote: Still, it would represent an effort for the attacker of a much greater magnitude than the current attack. It's all a trade-off -- at some point it'll just be easier for the attacker to use some other vulnerability.

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Guido van Rossum
Different concern. What if someone were to have code implementing an external, persistent hash table, using Python's hash() function? They might have a way to rehash everything when a new version of Python comes along, but they would not be happy if hash() is different in each process. I somehow

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Guido van Rossum
PS. Is the collision-generator used in the attack code open source? -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe:

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Christian Heimes
Am 01.01.2012 16:13, schrieb Guido van Rossum: Different concern. What if someone were to have code implementing an external, persistent hash table, using Python's hash() function? They might have a way to rehash everything when a new version of Python comes along, but they would not be happy

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Christian Heimes
Am 31.12.2011 23:38, schrieb Terry Reedy: On 12/31/2011 4:43 PM, PJ Eby wrote: Here's an idea. Suppose we add a sys.hash_seed or some such, that's settable to an int, and defaults to whatever we're using now. Then programs that want a fix can just set it to a random number, I do not

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Christian Heimes
Am 01.01.2012 00:56, schrieb Guido van Rossum: ISTM the only reasonable thing is to have a random seed picked very early in the process, to be used to change the hash() function of str/bytes/unicode (in a way that they are still compatible with each other). The seed should be unique per

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Christian Heimes
Am 01.01.2012 05:11, schrieb Antoine Pitrou: On Sat, 31 Dec 2011 16:56:00 -0700 Guido van Rossum gu...@python.org wrote: ISTM the only reasonable thing is to have a random seed picked very early in the process, to be used to change the hash() function of str/bytes/unicode (in a way that they

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Antoine Pitrou
On Sun, 01 Jan 2012 16:48:32 +0100 Christian Heimes li...@cheimes.de wrote: The talkers claim and have shown that it's too easy to pre-calculate collisions with hashing algorithms similar to DJBX33X / DJBX33A. It might be a good idea to change the hashing algorithm, too. Paul as listed some

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Antoine Pitrou
On Sun, 01 Jan 2012 16:56:19 +0100 Christian Heimes li...@cheimes.de wrote: Am 01.01.2012 05:11, schrieb Antoine Pitrou: On Sat, 31 Dec 2011 16:56:00 -0700 Guido van Rossum gu...@python.org wrote: ISTM the only reasonable thing is to have a random seed picked very early in the process, to

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Christian Heimes
Am 01.01.2012 06:57, schrieb Paul McMillan: I agree that doing anything is better than doing nothing. If we use the earlier suggestion and prepend everything with a fixed-length seed, we need quite a bit of entropy (and so a fairly long string) to make a lasting difference. Your code at

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Christian Heimes
Am 01.01.2012 17:09, schrieb Antoine Pitrou: On Sun, 01 Jan 2012 16:48:32 +0100 Christian Heimes li...@cheimes.de wrote: The talkers claim and have shown that it's too easy to pre-calculate collisions with hashing algorithms similar to DJBX33X / DJBX33A. It might be a good idea to change the

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Antoine Pitrou
Le dimanche 01 janvier 2012 à 17:34 +0100, Christian Heimes a écrit : Am 01.01.2012 17:09, schrieb Antoine Pitrou: On Sun, 01 Jan 2012 16:48:32 +0100 Christian Heimes li...@cheimes.de wrote: The talkers claim and have shown that it's too easy to pre-calculate collisions with hashing

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Christian Heimes
Am 01.01.2012 17:54, schrieb Antoine Pitrou: I don't understand. FNV-1 multiplies the current running result with a prime and then xors it with the following byte. This is also what we do. (I'm assuming 103 is prime) There must be a major difference somewhere inside the algorithm. The talk

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Christian Heimes
Am 01.01.2012 01:11, schrieb Guido van Rossum: FWIW I managed to build Python 2.6, and a trivial mutation of the string/unicode hash function (add 1 initially) made only three tests fail; test_symtable and test_json both have a dependency on dictionary order, test_ctypes I can't quite figure

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Victor Stinner
Le 01/01/2012 04:29, Paul McMillan a écrit : This is incorrect. Once an attacker has guessed the random seed, any operation which reveals the ordering of hashed objects can be used to verify the answer. JSON responses would be ideal. In fact, an attacker can do a brute-force attack of the random

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Terry Reedy
On 1/1/2012 10:13 AM, Guido van Rossum wrote: PS. Is the collision-generator used in the attack code open source? As I posted before, Alexander Klink and Julian Wälde gave their project email as hash...@alech.de. Since they indicated disappointment in not hearing from Python, I presume they

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Terry Reedy
On 1/1/2012 12:28 PM, Christian Heimes wrote: Am 01.01.2012 17:54, schrieb Antoine Pitrou: I don't understand. FNV-1 multiplies the current running result with a prime and then xors it with the following byte. This is also what we do. (I'm assuming 103 is prime) There must be a major

[Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Jim Jewett
Steven D'Aprano (in http://mail.python.org/pipermail/python-dev/2011-December/115162.html) wrote: By compile-time, do you mean when the byte-code is compilated, i.e. just before runtime, rather than a switch when compiling the Python executable from source? No. I really mean when the C code

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Paul McMillan
But how about precomputing the intermediate value (x)? The hash is (mostly) doing x = f(x, c) for each c in the input. That's a fair point. If we go down that avenue, I think simply choosing a random fixed starting value for x is the correct choice, rather than computing an intermediate value.

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Paul McMillan
Different concern. What if someone were to have code implementing an external, persistent hash table, using Python's hash() function? They might have a way to rehash everything when a new version of Python comes along, but they would not be happy if hash() is different in each process. I

[Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Jim Jewett
Paul McMillan in http://mail.python.org/pipermail/python-dev/2012-January/115183.html wrote: Guido van Rossum wrote: Hm. I'm not sure I like the idea of extra arithmetic for every character being hashed. the collision generator doesn't necessarily vary the length of the string.

[Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Jim Jewett
Victor Stinner wrote in http://mail.python.org/pipermail/python-dev/2012-January/115198.html If we want to protect a website against this attack for example, we must suppose that the attacker can inject arbitrary data and can get (indirectly) the result of hash(str) (e.g. with the

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Paul McMillan
I fixed a couple things in my proposed algorithm: https://gist.github.com/0a91e52efa74f61858b5 I had a typo, and used 21 instead of 12 for the size multiplier. We definitely don't need 2MB random data. The initialization of r was broken. Now it is an array of ints, so there's no conversion when

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Stephen J. Turnbull
Victor Stinner writes: Let's try to summarize this vulnerability. The creation of a Python dictionary has a complexity of O(n) in most cases, but O(n^2) in the *worst* case. The attack tries to go into the worst case. It requires to compute a set of N keys having the same hash

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Christian Heimes
Am 31.12.2011 13:03, schrieb Stephen J. Turnbull: I don't know the implementation issues well enough to claim it is a solution, but this hasn't been mentioned before AFAICS: While the dictionary probe has to start with a hash for backward compatibility reasons, is there a reason the overflow

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread martin
Zitat von Victor Stinner victor.stin...@haypocalc.com: The current implementation of dict uses a linked-list. [...] Tell me if I am wrong. At least with this statement, you are wrong: the current implementation does *not* use a linked-list. Instead, it uses open addressing. Regards,

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread PJ Eby
On Sat, Dec 31, 2011 at 7:03 AM, Stephen J. Turnbull step...@xemacs.orgwrote: While the dictionary probe has to start with a hash for backward compatibility reasons, is there a reason the overflow strategy for insertion has to be buckets containing lists? How about double-hashing, etc?

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Jeffrey Yasskin
On Wed, Dec 28, 2011 at 5:37 PM, Jesse Noller jnol...@gmail.com wrote: On Wednesday, December 28, 2011 at 8:28 PM, Michael Foord wrote: Hello all, A paper (well, presentation) has been published highlighting security problems with the hashing algorithm (exploiting collisions) in many

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread PJ Eby
On Sat, Dec 31, 2011 at 4:04 PM, Jeffrey Yasskin jyass...@gmail.com wrote: Hash functions are already unstable across Python versions. Making them unstable across interpreter processes (multiprocessing doesn't share dicts, right?) doesn't sound like a big additional problem. Users who want a

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Terry Reedy
On 12/31/2011 4:43 PM, PJ Eby wrote: Here's an idea. Suppose we add a sys.hash_seed or some such, that's settable to an int, and defaults to whatever we're using now. Then programs that want a fix can just set it to a random number, I do not think we can allow that to change once there are

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Guido van Rossum
ISTM the only reasonable thing is to have a random seed picked very early in the process, to be used to change the hash() function of str/bytes/unicode (in a way that they are still compatible with each other). The seed should be unique per process except it should survive fork() (but not

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Guido van Rossum
On Sat, Dec 31, 2011 at 4:56 PM, Guido van Rossum gu...@python.org wrote: PS. I would propose a specific fix but I can't seem to build a working CPython from the trunk on my laptop (OS X 10.6, Xcode 4.1). I get this error late in the build: ./python.exe -SE -m sysconfig --generate-posix-vars

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Paul McMillan
I'm not too concerned about a 3rd party being able to guess the random seed -- this would require much more effort on their part, since they would have to generate a new set of colliding keys each time they think they have guessed the hash This is incorrect. Once an attacker has guessed the

  1   2   >