[issue13703] Hash collision security issue

2012-01-04 Thread Huzaifa Sidhpurwala
Huzaifa Sidhpurwala added the comment: I am wondering if a CVE id has been assigned to this security issue yet? -- nosy: +Huzaifa.Sidhpurwala ___ Python tracker ___

[issue13714] Methods of ftplib never ends if the ip address changes

2012-01-04 Thread Sworddragon
New submission from Sworddragon: If a client gets a reconnect and a new IP from the provider, the methods of ftplib can't handle this and hang in an infinite loop. For example, if a file is transferred with storbinary() and the client gets a new IP address, the script will never end. I'm u

[issue13712] pysetup create should not convert package_data to extra_files

2012-01-04 Thread Éric Araujo
Éric Araujo added the comment: The bug is caused by code in packaging.create that iterates over a dict (package_data) to extend a list (extra_files). Instead of just calling sorted to make output deterministic, I’d prefer to fix that more serious behavior bug (see also #13463, #11805 and #53

[issue13703] Hash collision security issue

2012-01-04 Thread Christian Heimes
Christian Heimes added the comment: In reply to MAL's message http://bugs.python.org/issue13703#msg150616 > 2. Changing the semantics of hashing in a dot release is not allowed. I concur with Marc. The change is too intrusive and may cause too much trouble for the issue. Also it seems to be u

[issue13703] Hash collision security issue

2012-01-04 Thread STINNER Victor
STINNER Victor added the comment: Patch version 2: - hash("") is always 0 - Remove a debug message -- Added file: http://bugs.python.org/file24143/random-2.patch
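
The prefix/suffix idea discussed in this thread, with the empty string pinned to 0, can be sketched roughly as follows. This is only an illustration, not the contents of random-2.patch: the names `toy_hash`, `make_secret` and `MASK` are hypothetical, the 64-bit width is an assumption, and the multiplier 1000003 is borrowed from CPython's classic string hash.

```python
import os

MASK = (1 << 64) - 1  # assumption: emulate a 64-bit unsigned hash value


def make_secret():
    """Draw a hypothetical (prefix, suffix) pair from the OS random source."""
    raw = os.urandom(16)
    return int.from_bytes(raw[:8], "little"), int.from_bytes(raw[8:], "little")


def toy_hash(s, prefix, suffix):
    """Multiplicative string hash wrapped by a secret prefix and suffix."""
    if not s:
        return 0  # the empty string is special-cased and never reveals the secret
    x = prefix ^ (ord(s[0]) << 7)  # the prefix is mixed in before the loop
    for ch in s:
        x = ((1000003 * x) ^ ord(ch)) & MASK  # the critical loop is unchanged
    x ^= len(s)
    return (x ^ suffix) & MASK  # the suffix is mixed in after the loop
```

With a zero prefix and suffix the function degenerates to the unseeded algorithm, which makes it easy to compare seeded and unseeded results for the same input.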

[issue13703] Hash collision security issue

2012-01-04 Thread STINNER Victor
STINNER Victor added the comment: > At least for Python 2.x hash(str) and hash(unicode) have to yield > the same result for ASCII only strings. Ah yes, I forgot Python 2: I wrote my patch for Python 3.3. The two hash functions should be modified to be randomized. > hash("") should always ret

[issue13703] Hash collision security issue

2012-01-04 Thread STINNER Victor
STINNER Victor added the comment: "Calculating the hash of a null byte gives you the xor of your two seeds." Not directly because prefix is first multiplied by 103. So hash("\0") gives you (prefix*103) % 2^32 ^ suffix. Example: $ ./python secret={b7abfbbf, db6cbb4d} Python 3.3.0a0 (

[issue13703] Hash collision security issue

2012-01-04 Thread Christian Heimes
Christian Heimes added the comment: Paul wrote: > I suggest that we don't randomize strings shorter than 6 characters. For > longer strings, we randomize the first and last 5 characters. This means > we're only adding additional work to a max of 10 rounds of the hash, and only > for longer st

[issue13703] Hash collision security issue

2012-01-04 Thread Paul McMillan
Paul McMillan added the comment: > - for small strings we could use a different seed than for larger strings Or just leave them unseeded with our existing algorithm. Shifting them into a different part of the hash space doesn't really gain us much. > - for larger strings we could use Paul's al

[issue13703] Hash collision security issue

2012-01-04 Thread Christian Heimes
Christian Heimes added the comment: Given that a user has an application with an oracle function that returns the hash of a unicode string, an attacker can probe tens of thousands of one- and two-character unicode strings. That should give him/her enough data to calculate both seeds. hash("") alr

[issue13703] Hash collision security issue

2012-01-04 Thread STINNER Victor
STINNER Victor added the comment: > I fear that an attacker may guess the seed from several small strings hash(a) ^ hash(b) "removes" the suffix, but I don't see how to guess the prefix from this new value. It doesn't mean that it is not possible, just that I don't have a strong background in
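
The remark that hash(a) ^ hash(b) "removes" the suffix can be checked with a small model. This is a sketch under assumptions: `core`, `toy_hash` and `xor_pair` are hypothetical names, and the hash is a simplified 64-bit model of a prefix/suffix scheme rather than the patched CPython function.

```python
MASK = (1 << 64) - 1  # assumption: 64-bit unsigned hash values


def core(s, prefix):
    """Prefix-seeded multiplicative hash of a non-empty string, without the suffix."""
    x = prefix ^ (ord(s[0]) << 7)
    for ch in s:
        x = ((1000003 * x) ^ ord(ch)) & MASK
    return x ^ len(s)


def toy_hash(s, prefix, suffix):
    """Full model hash: the suffix is XORed in as the very last step."""
    return core(s, prefix) ^ suffix


def xor_pair(a, b, prefix, suffix):
    """XOR of two hashes computed under the same secret."""
    return toy_hash(a, prefix, suffix) ^ toy_hash(b, prefix, suffix)
```

Because the suffix is applied by XOR at the end, it cancels: `xor_pair("a", "b", p, s1)` equals `xor_pair("a", "b", p, s2)` for any two suffixes. What remains depends only on the prefix, which is the part the comment says is not obviously recoverable.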

[issue13703] Hash collision security issue

2012-01-04 Thread Paul McMillan
Paul McMillan added the comment: > My proposition only adds two XOR to hash(str) (outside the loop on Unicode > characters), so I expect a ridiculous overhead. I don't know yet how hard it > is to guess the secret from hash(str) output. It doesn't work much better than a single random seed. C

[issue13703] Hash collision security issue

2012-01-04 Thread Christian Heimes
Christian Heimes added the comment: Thanks Victor! > - hash(str) is now randomized using two random Py_hash_t values: > don't touch the critical loop, only add a prefix and a suffix At least for Python 2.x hash(str) and hash(unicode) have to yield the same result for ASCII only strings. >

[issue13699] test_gdb has recently started failing

2012-01-04 Thread STINNER Victor
STINNER Victor added the comment: > New changeset dfffb293f4b3 by Vinay Sajip in branch 'default' The fix should also be applied to 3.2. -- resolution: fixed -> status: closed -> open

[issue13703] Hash collision security issue

2012-01-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: > You aren't special casing small strings. I fear that an attacker may > guess the seed from several small strings. How would (s)he do?

[issue13703] Hash collision security issue

2012-01-04 Thread STINNER Victor
STINNER Victor added the comment: "Since speed is a concern, I think that the proposal to avoid using the random hash for short strings is a good idea." My proposition only adds two XOR to hash(str) (outside the loop on Unicode characters), so I expect a ridiculous overhead. I don't know yet

[issue13703] Hash collision security issue

2012-01-04 Thread Paul McMillan
Paul McMillan added the comment: This is not something that can be fixed by limiting the size of POST/GET. Parsing documents (even offline) can generate these problems. I can create books that calibre (a Python-based ebook format shifting tool) can't convert, but are otherwise perfectly vali

[issue13703] Hash collision security issue

2012-01-04 Thread STINNER Victor
STINNER Victor added the comment: > If PHP uses it, I'm confident it is secure. If I remember correctly, it is only used for the Windows version of PHP, but PHP doesn't implement it correctly because it uses all bits.

[issue13703] Hash collision security issue

2012-01-04 Thread STINNER Victor
STINNER Victor added the comment: +printf("read %i bytes\n", size); Oops, I forgot a debug message.

[issue13703] Hash collision security issue

2012-01-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: > > add PyOS_URandom() using CryptoGen, SSL (only on VMS!!) > > or /dev/urandom > > Oh, OpenSSL (RAND_pseudo_bytes) should be used on Windows, Linux, Mac > OS X, etc. if OpenSSL is available. Apart from the large dependency, the OpenSSL license is not GPL-comp

[issue13703] Hash collision security issue

2012-01-04 Thread STINNER Victor
STINNER Victor added the comment: > add PyOS_URandom() using CryptoGen, SSL (only on VMS!!) > or /dev/urandom Oh, OpenSSL (RAND_pseudo_bytes) should be used on Windows, Linux, Mac OS X, etc. if OpenSSL is available. I was just too lazy to add a define or pyconfig.h option to indicate if OpenS

[issue13703] Hash collision security issue

2012-01-04 Thread STINNER Victor
STINNER Victor added the comment: Work-in-progress patch implementing my randomized hash function (random.patch): - add PyOS_URandom() using CryptoGen, SSL (only on VMS!!) or /dev/urandom, with a fallback on a dummy LCG if the OS urandom failed - posix.urandom() is always defined and reuses P
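
The "OS random source first, dummy LCG as a last resort" strategy described above might look like this at the Python level. It is a sketch, not the C code of the patch: `lcg_bytes` and `random_secret` are hypothetical helpers, and the LCG constants and fixed seed are arbitrary illustrative choices.

```python
import os


def lcg_bytes(n, seed=12345):
    """Dummy linear congruential generator; NOT cryptographically secure."""
    out = bytearray()
    x = seed
    for _ in range(n):
        # Illustrative LCG constants; any full-period parameters would do here.
        x = (x * 214013 + 2531011) & 0xFFFFFFFF
        out.append((x >> 16) & 0xFF)  # use the better-mixed middle bits
    return bytes(out)


def random_secret(n):
    """Return n secret bytes, preferring the operating system's generator."""
    try:
        return os.urandom(n)
    except NotImplementedError:
        # os.urandom raises NotImplementedError when no OS source is available.
        return lcg_bytes(n)
```

A fallback with a fixed seed is predictable, so a real implementation would at least need to mix in varying data such as the time or process id; the sketch only shows the control flow.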

[issue13641] decoding functions in the base64 module could accept unicode strings

2012-01-04 Thread Berker Peksag
Berker Peksag added the comment: Hi Antoine, I added some tests for b64decode function. Also, I wrote some tests for b32decode and b16decode functions and failed. I think my patch is not working for b32decode and b16decode functions. I'll dig into code and try to find a way. Thanks! --

[issue10772] Several actions for argparse arguments missing from docs

2012-01-04 Thread Sandro Tosi
Sandro Tosi added the comment: Thanks Marc for the patch, I've just committed it. -- resolution: -> fixed stage: commit review -> committed/rejected status: open -> closed

[issue10772] Several actions for argparse arguments missing from docs

2012-01-04 Thread Roundup Robot
Roundup Robot added the comment: New changeset 278fbd7b9608 by Sandro Tosi in branch '2.7': Issue #10772: add count and help argparse action; patch by Marc Sibson http://hg.python.org/cpython/rev/278fbd7b9608 New changeset 326f755962e3 by Sandro Tosi in branch '3.2': Issue #10772: add count and

[issue11648] openlog()s 'logopt' keyword broken in syslog module

2012-01-04 Thread Sandro Tosi
Sandro Tosi added the comment: This has already been fixed with 71f7175e2b34 & friends. -- nosy: +sandro.tosi resolution: -> fixed stage: -> committed/rejected status: open -> closed versions: -Python 3.4

[issue7098] g formatting for decimal types should always strip trailing zeros.

2012-01-04 Thread Stefan Krah
Stefan Krah added the comment: [Mark] > So I think the current code is correct. I agree with this. Currently the 'g' format is like to_sci_string() with the added possibility of adjusting the number of significant digits. It's probably hard to come up with a better way to handle this for Decima

[issue13713] Regression for http.client read()

2012-01-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: Ok, committed! (Jon, don't worry, such things happen :-)) -- resolution: -> fixed stage: patch review -> committed/rejected status: open -> closed

[issue13713] Regression for http.client read()

2012-01-04 Thread Roundup Robot
Roundup Robot added the comment: New changeset 4b21f651 by Antoine Pitrou in branch 'default': Issue #13713: fix a regression in HTTP chunked reading after 806cfe39f729 http://hg.python.org/cpython/rev/4b21f651 -- nosy: +python-dev

[issue13464] HTTPResponse is missing an implementation of readinto

2012-01-04 Thread Roundup Robot
Roundup Robot added the comment: New changeset 4b21f651 by Antoine Pitrou in branch 'default': Issue #13713: fix a regression in HTTP chunked reading after 806cfe39f729 http://hg.python.org/cpython/rev/4b21f651

[issue13703] Hash collision security issue

2012-01-04 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Marc-Andre Lemburg wrote: > > 1. The security implications in all this are being somewhat overemphasized. > > There are many ways you can do a DoS attack on web servers. It's the > responsibility of the used web frameworks and servers to deal with > the pos

[issue13713] Regression for http.client read()

2012-01-04 Thread Ross Lagerwall
Ross Lagerwall added the comment: The patch looks right and seems to fix the issue. Thanks :-)

[issue13713] Regression for http.client read()

2012-01-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: The fix is quite trivial. Here is a patch + tests. -- keywords: +patch stage: needs patch -> patch review Added file: http://bugs.python.org/file24140/readinto_chunked.patch

[issue13703] Hash collision security issue

2012-01-04 Thread Alex Gaynor
Alex Gaynor added the comment: Except, it's a totally non-scalable approach. People have vulnerabilities all over their sites which they don't realize. Some examples: django-taggit (an application I wrote for handling tags) parses tags out of an input, it stores these in a set to check for dup

[issue13703] Hash collision security issue

2012-01-04 Thread Terry J. Reedy
Terry J. Reedy added the comment: To expand on Marc-Andre's point 1: the DOS attack on web servers is possible because servers are generally dumb at the first stage. Upon receiving a post request, all key=value pairs are mindlessly packaged into a hash table that is then passed on to a page h

[issue13703] Hash collision security issue

2012-01-04 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: The email interface ate part of my reply:
>>> g = ((x*(2**64 - 1), hash(x*(2**64 - 1))) for x in xrange(1, 100))
>>> s = ''.join(str(x) for x in g)
>>> len(s)
32397634
>>> g = ((x*(2**64 - 1), hash(x*(2**64 - 1))) for x in xrange(1, 100))
>>> d = di

[issue13703] Hash collision security issue

2012-01-04 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Marc-Andre Lemburg wrote: > > 3. Changing the way strings are hashed doesn't solve the problem. > > Hash values of other types can easily be guessed as well, e.g. > take integers which use a trivial hash function. Here's an example for integers on a 64-bi
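
The point about integer hashes being trivially guessable can be reproduced on a current CPython 3 interpreter. The sketch below is an adaptation, not the example from the message: in Python 3 the hash of an int is its value reduced modulo `sys.hash_info.modulus`, so multiples of that modulus take the place of the multiples of 2**64 - 1 used with Python 2. The helper name `colliding_ints` is hypothetical.

```python
import sys

# Modulus used by CPython 3 for numeric hashes (2**61 - 1 on typical 64-bit builds).
M = sys.hash_info.modulus


def colliding_ints(count):
    """Return `count` distinct positive integers that all hash to 0."""
    return [k * M for k in range(1, count + 1)]


keys = colliding_ints(100)
hashes = {hash(k) for k in keys}  # collapses to a single hash value
```

Every key lands in the same bucket chain, so a dict built from such keys degrades to quadratic insertion time regardless of how string hashing is randomized.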

[issue13713] Regression for http.client read()

2012-01-04 Thread Antoine Pitrou
Changes by Antoine Pitrou: -- nosy: +Jon.Kuhn priority: normal -> critical

[issue13712] test_packaging depends on hash order

2012-01-04 Thread Éric Araujo
Éric Araujo added the comment: Thanks, I will check this. -- versions: +3rd party

[issue12660] test_gdb fails when installed

2012-01-04 Thread Vinay Sajip
Vinay Sajip added the comment: Pending the real fix, I've attached a patch to skip the test if it's not a source build. -- keywords: +patch nosy: +vinay.sajip stage: needs patch -> patch review Added file: http://bugs.python.org/file24139/test-gdb-patch.diff

[issue13703] Hash collision security issue

2012-01-04 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Some comments: 1. The security implications in all this are being somewhat overemphasized. There are many ways you can do a DoS attack on web servers. It's the responsibility of the used web frameworks and servers to deal with the possible cases. It's a go

[issue13713] Regression for http.client read()

2012-01-04 Thread Ross Lagerwall
New submission from Ross Lagerwall: 806cfe39f729 introduced a regression for http.client read(len). To see this: $ ./python test.py $ wget http://archives.fedoraproject.org/pub/archive/fedora/linux/core/1/SRPMS/ $ diff index.html index2.html This is a difference in the files (which there shoul

[issue13711] html.parser.HTMLParser doesn't parse tags in comments in scripts correctly

2012-01-04 Thread Manuel Bärenz
Manuel Bärenz added the comment: Great! Thank you!

[issue13703] Hash collision security issue

2012-01-04 Thread Barry A. Warsaw
Barry A. Warsaw added the comment: On Jan 04, 2012, at 06:00 AM, Paul McMillan wrote: >Developers would be startled to find that ordering stays consistent on a 64 >bit build but varies on 32 bit builds. Well, one positive outcome of this issue is that users will finally viscerally understand t

[issue13704] Random number generator in Python core

2012-01-04 Thread Barry A. Warsaw
Barry A. Warsaw added the comment: On Jan 04, 2012, at 07:30 AM, Raymond Hettinger wrote: >Why is this listed as a release blocker? It is questionable whether it >should be done at all? It is a very aggressive change. It's a release blocker so that the issue won't get ignored before the next

[issue13711] html.parser.HTMLParser doesn't parse tags in comments in scripts correctly

2012-01-04 Thread Ezio Melotti
Ezio Melotti added the comment: Yep, this was fixed in #670664. With the development version of Python (AFAIK the fix has not been released yet) and the example parser found in the doc[0] I get this: >>> parser = MyHTMLParser() >>> parser.feed('') Encountered a start tag: script Encount

[issue13712] test_packaging depends on hash order

2012-01-04 Thread Christian Heimes
New submission from Christian Heimes: As requested in http://bugs.python.org/issue13703#msg150609 ./python Lib/test/regrtest.py test_packaging [1/1] test_packaging Warning -- threading._dangling was modified by test_packaging Warning -- sysconfig._SCHEMES was modified by test_packaging test tes

[issue13703] Hash collision security issue

2012-01-04 Thread Éric Araujo
Éric Araujo added the comment: If test_packaging fails because it relies on dict order / hash details, that’s a bug. Can you copy the full tb (possibly in another report, I can fix it independently of this issue)? -- nosy: +eric.araujo

[issue13711] html.parser.HTMLParser doesn't parse tags in comments in scripts correctly

2012-01-04 Thread R. David Murray
R. David Murray added the comment: I believe this was fixed recently as part of issue 670664. Ezio will know for sure.

[issue13711] html.parser.HTMLParser doesn't parse tags in comments in scripts correctly

2012-01-04 Thread Manuel Bärenz
Manuel Bärenz added the comment: To clarify this even further: Consider parser_instance.feed("") It should call: parser_instance.handle_starttag("script", []) parser_instance.handle_data("") parser_instance.handle_endtag("script", []) Instead, it calls: parser_instance.handle_starttag
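
The handler sequence under discussion can be observed with a small recording parser. This is a sketch: the class `EventRecorder`, the helper `record` and the sample markup are made up for illustration, since the markup from the original report did not survive in this archive.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Collects (kind, value) tuples for every handler call."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def record(markup):
    """Feed markup to a fresh parser and return the recorded events."""
    parser = EventRecorder()
    parser.feed(markup)
    parser.close()
    return parser.events


# Hypothetical input: a tag inside a comment inside a script element.
events = record("<script><!-- <b>bold</b> --></script>")
```

On an interpreter that includes the fix, the script body should arrive only through `handle_data` (possibly split over several calls), with no start or end event for the inner `b` tag.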

[issue13711] html.parser.HTMLParser doesn't parse tags in comments in scripts correctly

2012-01-04 Thread Manuel Bärenz
Manuel Bärenz added the comment: Oh, I wasn't aware of that. Then, the bug is actually calling handle_endtag.

[issue13711] html.parser.HTMLParser doesn't parse tags in comments in scripts correctly

2012-01-04 Thread R. David Murray
R. David Murray added the comment: The content of a script tag is CDATA. Why would you expect it to be parsed? -- nosy: +ezio.melotti, r.david.murray

[issue13711] html.parser.HTMLParser doesn't parse tags in comments in scripts correctly

2012-01-04 Thread Manuel Bärenz
Manuel Bärenz added the comment: I forgot to say, I'm using python version 3.2.2.

[issue13711] html.parser.HTMLParser doesn't parse tags in comments in scripts correctly

2012-01-04 Thread Manuel Bärenz
New submission from Manuel Bärenz: I've attached a script which demonstrates the bug. When feeding a

[issue13699] test_gdb has recently started failing

2012-01-04 Thread Roundup Robot
Roundup Robot added the comment: New changeset dfffb293f4b3 by Vinay Sajip in branch 'default': Closes #13699. Skipped two tests if Python is optimised. http://hg.python.org/cpython/rev/dfffb293f4b3 -- nosy: +python-dev resolution: -> fixed stage: -> committed/rejected status: open ->

[issue13703] Hash collision security issue

2012-01-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: > Using a fairly small value (4k) should not make the results much worse > from a security perspective, but might be problematic from a > collision/distribution standpoint. Keep in mind the average L1 data cache size is between 16KB and 64KB. 4KB is already a

[issue13697] python RLock implementation unsafe with signals

2012-01-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: > > That sounds like a good solution in the middle-term. Are there any > > drawbacks? (apart from launching a thread) > > Just to be clear: the approach I was suggesting is to have a resident > thread dedicated to signal management, not to spawn a new one when

[issue13703] Hash collision security issue

2012-01-04 Thread Mark Shannon
Changes by Mark Shannon: -- nosy: +Mark.Shannon

[issue13707] Clarify hash() constancy period

2012-01-04 Thread Raymond Hettinger
Raymond Hettinger added the comment: [Antoine] > Suggest closing as invalid/rejected. [Martin] > -1. The hash has nothing to do with the lifetime, > but with the value of an object. -- resolution: -> invalid status: open -> closed

[issue13704] Random number generator in Python core

2012-01-04 Thread Raymond Hettinger
Changes by Raymond Hettinger: -- assignee: rhettinger -> christian.heimes

[issue13704] Random number generator in Python core

2012-01-04 Thread Christian Heimes
Christian Heimes added the comment: Release blocker: I was following the example in #13703. A RNG (PRNG or CSPRNG) is required for randomized hashing function. The patch contains more than just the RNG changes. Only Include/pyrandom.h, Modules/_randommodule.c, Modules/posixmodule.c, Python/ha

[issue8416] python 2.6.5 documentation can't search

2012-01-04 Thread Georg Brandl
Georg Brandl added the comment: The continually updated docs are built from the stable branches, whose version remains at (e.g.) 2.7.2 until 2.7.3a1 is released, at which point the continuous updating stops until 2.7.3 is final. I don't think presenting docs with an alpha version on the http

[issue13707] Clarify hash() constancy period

2012-01-04 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Terry J. Reedy wrote: > > Terry J. Reedy added the comment: > > Martin, I do not understand. The default hash is based on id (as is default > equality comparison), not value. Are you OK with hash values changing if the > 'value' changes? My understandin

[issue13707] Clarify hash() constancy period

2012-01-04 Thread Martin v . Löwis
Martin v. Löwis added the comment: > Martin, I do not understand. The default hash is based on id (as is > default equality comparison), not value. In the default implementation, the id *is* the object's value (i.e. objects, by default, only compare equal if they are identical). So the default
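
The distinction drawn here, identity acting as the value by default versus an explicitly defined value, can be illustrated with two small classes. The names `Plain` and `Point` are hypothetical and serve only as a sketch of the argument.

```python
class Plain:
    """No __eq__ or __hash__: equality and hash fall back to identity."""


class Point:
    """Value-based equality, so the hash must be derived from the same value."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


a, b = Plain(), Plain()          # two distinct objects, hence two distinct "values"
p, q = Point(1, 2), Point(1, 2)  # two distinct objects sharing one value
```

`a` equals only itself, while `p` and `q` compare equal and therefore must hash equally. In both cases the hash follows whatever equality treats as the value and stays fixed for as long as that value does.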