[issue13707] Clarify hash() constancy period
Martin v. Löwis mar...@v.loewis.de added the comment:

> Martin, I do not understand. The default hash is based on id (as is default equality comparison), not value.

In the default implementation, the id *is* the object's value (i.e. objects, by default, only compare equal if they are identical). So the default implementation is just a special case of the more general rule that hashes need to be consistent with equality.

> Are you OK with hash values changing if the 'value' changes?

An object that can change its value (i.e. a mutable object) should fail to hash.

-- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13707 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13707] Clarify hash() constancy period
Marc-Andre Lemburg m...@egenix.com added the comment:

Terry J. Reedy wrote:
> Terry J. Reedy tjre...@udel.edu added the comment:
>
> Martin, I do not understand. The default hash is based on id (as is default equality comparison), not value. Are you OK with hash values changing if the 'value' changes? My understanding is that changing hash values for objects in sets and dicts is bad, which is why mutable builtins with value-based equality do not have hash values.

Hash values are based on the object values, not their id(). See the various type implementations as reference. The id() is only used as hash for objects which don't have a value (and thus cannot be compared).

Given that we have the invariant a==b => hash(a)==hash(b) in Python, it immediately follows that hash values for objects with comparison method cannot have a lifetime - at least not within the same process and, depending how you look at it, also not in multi-process applications.

--
nosy: +lemburg
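The invariant discussed in this message can be illustrated with a small sketch. The class names `Point` and `Plain` are made up for illustration and do not come from the tracker discussion; the point is only that a value-based `__eq__` should be paired with a `__hash__` computed from the same fields, while an object without either falls back to identity.

```python
class Point:
    """Value object: equality and hash are derived from the same fields."""

    def __init__(self, x, y):
        self._x = x
        self._y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self._x, self._y) == (other._x, other._y)

    def __hash__(self):
        # Same fields as __eq__, so a == b implies hash(a) == hash(b).
        return hash((self._x, self._y))


class Plain:
    """Defines neither __eq__ nor __hash__: identity-based defaults apply."""


a, b = Point(1, 2), Point(1, 2)
equal_points_share_hash = (a == b) and (hash(a) == hash(b))

p, q = Plain(), Plain()
plain_objects_differ = p != q  # distinct objects are unequal by default

# A mutable builtin with value-based equality refuses to hash at all.
try:
    hash([1, 2])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
```

Because `a` and `b` are equal and hash alike, a set built from both keeps a single element.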
[issue8416] python 2.6.5 documentation can't search
Georg Brandl ge...@python.org added the comment:

The continually updated docs are built from the stable branches, whose version remains at (e.g.) 2.7.2 until 2.7.3a1 is released, at which point the continuous updating stops until 2.7.3 is final.

I don't think presenting docs with an alpha version on the http://docs.python.org/ frontpage is useful. On the other hand, I do think it is important to have doc fixes reflected (more or less) instantly somewhere, so that e.g. people reporting typos can see the fixes. The status quo is a compromise between these two needs.

When we do make backwards incompatible changes or additions during a stable cycle, they need to be marked with new/changed in version 2.7.X+1 anyway. So the SequenceMatcher change would alert users itself. If not, that's a bug.

About the obsolete snapshots: I don't know what you're referring to there: if it's the released docs for specific versions, then I think that's standard practice to have a doc version released for a specific Python version; and I wouldn't change that.
[issue13704] Random number generator in Python core
Christian Heimes li...@cheimes.de added the comment: Release blocker: I was following the example in #13703. A RNG (PRNG or CSPRNG) is required for randomized hashing function. The patch contains more than just the RNG changes. Only Include/pyrandom.h, Modules/_randommodule.c, Modules/posixmodule.c, Python/hash.c and parts of Makefile.pre.in are relevant for this tracker item. Sorry for the inconvenience! -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13704 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13704] Random number generator in Python core
Changes by Raymond Hettinger raymond.hettin...@gmail.com:

--
assignee: rhettinger -> christian.heimes
[issue13707] Clarify hash() constancy period
Raymond Hettinger raymond.hettin...@gmail.com added the comment:

[Antoine]
> Suggest closing as invalid/rejected.

[Martin]
> -1. The hash has nothing to do with the lifetime, but with the value of an object.

--
resolution: -> invalid
status: open -> closed
[issue13703] Hash collision security issue
Changes by Mark Shannon m...@hotpy.org: -- nosy: +Mark.Shannon ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13703 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13697] python RLock implementation unsafe with signals
Antoine Pitrou pit...@free.fr added the comment: That sounds like a good solution in the middle-term. Are there any drawbacks? (apart from launching a thread) Just to be clear: the approach I was suggesting is to have a resident thread dedicated to signal management, not to spawn a new one when needed. Another advantage is that we could mask signals in all threads except this one, and have a consistent cross-platform behavior with respect to signals+threads. Hmm, but that would break single-threaded programs which expect their select() (or other) to return EINTR when a signal is received (which is a perfectly valid expectation in that case). However I see two drawbacks: - it seems that we want to allow building Python without threads support. In that case, this wouldn't work, or we would need the current implementation as a fallback, but this would complicate the code somewhat. I don't know if that's still useful to build Python without threads. I would expect most platforms to have a compatible threads implementation (and Python probably can't run on very small embedded platforms). Perhaps you can ask on python-dev. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13697 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13703] Hash collision security issue
Antoine Pitrou pit...@free.fr added the comment:

> Using a fairly small value (4k) should not make the results much worse from a security perspective, but might be problematic from a collision/distribution standpoint.

Keep in mind the average L1 data cache size is between 16KB and 64KB. 4KB is already a significant chunk of that. Given a hash function's typical loop is to feed back the current result into the next computation, I don't see why a small value (e.g. 256 bytes) would be detrimental.
[issue13699] test_gdb has recently started failing
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset dfffb293f4b3 by Vinay Sajip in branch 'default':
Closes #13699. Skipped two tests if Python is optimised.
http://hg.python.org/cpython/rev/dfffb293f4b3

--
nosy: +python-dev
resolution: -> fixed
stage: -> committed/rejected
status: open -> closed
[issue13711] html.parser.HTMLParser doesn't parse tags in comments in scripts correctly
New submission from Manuel Bärenz man...@enigmage.de:

I've attached a script which demonstrates the bug. When feeding a script that contains a comment tag with the actual script and the script containing tags itself (e.g. a 'document.write("<td></td>")'), the parser doesn't call handle_comment and handle_starttag.

--
components: Library (Lib)
files: htmlparserbug.py
messages: 150603
nosy: turion
priority: normal
severity: normal
status: open
title: html.parser.HTMLParser doesn't parse tags in comments in scripts correctly
type: behavior
versions: Python 3.2
Added file: http://bugs.python.org/file24137/htmlparserbug.py
[issue13711] html.parser.HTMLParser doesn't parse tags in comments in scripts correctly
Manuel Bärenz man...@enigmage.de added the comment: I forgot to say, I'm using python version 3.2.2. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13711 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13711] html.parser.HTMLParser doesn't parse tags in comments in scripts correctly
R. David Murray rdmur...@bitdance.com added the comment: The content of a script tag is CDATA. Why would you expect it to be parsed? -- nosy: +ezio.melotti, r.david.murray ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13711 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13711] html.parser.HTMLParser doesn't parse tags in comments in scripts correctly
Manuel Bärenz man...@enigmage.de added the comment: Oh, I wasn't aware of that. Then, the bug is actually calling handle_endtag. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13711 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13711] html.parser.HTMLParser doesn't parse tags in comments in scripts correctly
Manuel Bärenz man...@enigmage.de added the comment:

To clarify this even further: Consider

parser_instance.feed("<script><td></td></script>")

It should call:
parser_instance.handle_starttag("script", [])
parser_instance.handle_data("<td></td>")
parser_instance.handle_endtag("script", [])

Instead, it calls:
parser_instance.handle_starttag("script", [])
parser_instance.handle_data("<td>")
parser_instance.handle_endtag("td", [])
parser_instance.handle_endtag("script", [])
[issue13711] html.parser.HTMLParser doesn't parse tags in comments in scripts correctly
R. David Murray rdmur...@bitdance.com added the comment: I believe this was fixed recently as part of issue 670664. Ezio will know for sure. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13711 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13703] Hash collision security issue
Éric Araujo mer...@netwok.org added the comment: If test_packaging fails because it relies on dict order / hash details, that’s a bug. Can you copy the full tb (possibly in another report, I can fix it independently of this issue)? -- nosy: +eric.araujo ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13703 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13712] test_packaging depends on hash order
New submission from Christian Heimes li...@cheimes.de: As requested in http://bugs.python.org/issue13703#msg150609 ./python Lib/test/regrtest.py test_packaging [1/1] test_packaging Warning -- threading._dangling was modified by test_packaging Warning -- sysconfig._SCHEMES was modified by test_packaging test test_packaging failed -- Traceback (most recent call last): File /home/heimes/dev/python/randomhash/Lib/packaging/tests/test_create.py, line 168, in test_convert_setup_py_to_cfg )) AssertionError: '[metadata]\nname = pyxfoil\nversion = 0.2\nsummary = Python bindings for the Xf [truncated]... != '[metadata]\nname = pyxfoil\nversion = 0.2\nsummary = Python bindings for the Xf [truncated]... [metadata] name = pyxfoil version = 0.2 summary = Python bindings for the Xfoil engine download_url = UNKNOWN home_page = http://www.python-science.org/project/pyxfoil maintainer = André Espaze maintainer_email = andre.esp...@logilab.fr description = My super Death-scription |barbar is now on the public domain, |ho, baby ! [files] packages = pyxfoil babar me modules = my_lib mymodule scripts = my_script bin/run - extra_files = setup.py + extra_files = Martinique/Lamentin/dady + Martinique/Lamentin/mumy + Martinique/Lamentin/sys + Martinique/Lamentin/bro + setup.py README - pyxfoil/fengine.so Pom Flora Alexander + pyxfoil/fengine.so - Martinique/Lamentin/dady - Martinique/Lamentin/mumy - Martinique/Lamentin/sys - Martinique/Lamentin/bro resources = README.rst = {doc} pyxfoil.1 = {man} 1 test failed: test_packaging -- assignee: eric.araujo components: Distutils2 messages: 150610 nosy: alexis, christian.heimes, eric.araujo priority: normal severity: normal status: open title: test_packaging depends on hash order type: behavior versions: Python 3.3 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13712 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13711] html.parser.HTMLParser doesn't parse tags in comments in scripts correctly
Ezio Melotti ezio.melo...@gmail.com added the comment:

Yep, this was fixed in #670664. With the development version of Python (AFAIK the fix has not been released yet) and the example parser found in the doc[0] I get this:

>>> parser = MyHTMLParser()
>>> parser.feed('<script><td></td></script>')
Encountered a start tag: script
Encountered some data: <td></td>
Encountered an end tag: script

[0]: http://docs.python.org/dev/library/html.parser.html#example-html-parser-application

--
assignee: -> ezio.melotti
resolution: -> duplicate
stage: -> committed/rejected
status: open -> closed
superseder: -> HTMLParser.py - more robust SCRIPT tag parsing
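The behaviour reported here can be checked on a Python that includes the fix with a small recording parser. This is only a sketch: `RecordingParser` is a made-up name rather than the `MyHTMLParser` from the docs, and the text is joined before comparison because the parser may deliver the script body in more than one `handle_data` call.

```python
from html.parser import HTMLParser


class RecordingParser(HTMLParser):
    """Records tag events and text instead of printing them."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.data = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(("start", tag))

    def handle_endtag(self, tag):
        self.tags.append(("end", tag))

    def handle_data(self, data):
        self.data.append(data)


parser = RecordingParser()
parser.feed("<script><td></td></script>")
parser.close()

# The content of <script> is treated as raw text, not as markup.
script_text = "".join(parser.data)
```

With the fix in place, only the `script` start and end tags are reported as tags, and the `td` markup shows up as data.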
[issue13704] Random number generator in Python core
Barry A. Warsaw ba...@python.org added the comment:

On Jan 04, 2012, at 07:30 AM, Raymond Hettinger wrote:
> Why is this listed as a release blocker? It is questionable whether it should be done at all? It is a very aggressive change.

It's a release blocker so that the issue won't get ignored before the next release. That doesn't necessarily mean it must be fixed.
[issue13703] Hash collision security issue
Barry A. Warsaw ba...@python.org added the comment:

On Jan 04, 2012, at 06:00 AM, Paul McMillan wrote:
> Developers would be startled to find that ordering stays consistent on a 64 bit build but varies on 32 bit builds.

Well, one positive outcome of this issue is that users will finally viscerally understand that dictionary (and set) order should never be relied upon, even between successive runs of the same Python executable.
[issue13711] html.parser.HTMLParser doesn't parse tags in comments in scripts correctly
Manuel Bärenz man...@enigmage.de added the comment: Great! Thank you! -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13711 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13713] Regression for http.client read()
New submission from Ross Lagerwall rosslagerw...@gmail.com:

806cfe39f729 introduced a regression for http.client read(len).

To see this:
$ ./python test.py
$ wget http://archives.fedoraproject.org/pub/archive/fedora/linux/core/1/SRPMS/
$ diff index.html index2.html

There is a difference in the files (which there shouldn't be).

The change which introduced the problem was:
changeset: 73875:806cfe39f729
user: Antoine Pitrou solip...@pitrou.net
date: Tue Dec 06 22:33:57 2011 +0100
summary: Issue #13464: Add a readinto() method to http.client.HTTPResponse.

--
components: Library (Lib)
files: test.py
messages: 150615
nosy: orsenthil, pitrou, rosslagerwall
priority: normal
severity: normal
stage: needs patch
status: open
title: Regression for http.client read()
type: behavior
versions: Python 3.3
Added file: http://bugs.python.org/file24138/test.py
[issue13703] Hash collision security issue
Marc-Andre Lemburg m...@egenix.com added the comment:

Some comments:

1. The security implications in all this are being somewhat overemphasized.

There are many ways you can do a DoS attack on web servers. It's the responsibility of the used web frameworks and servers to deal with the possible cases.

It's a good idea to provide some way to protect against hash collision attacks, but that will only solve one possible way of causing a resource attack on a server. There are other ways you can generate lots of CPU overhead with little data input (e.g. think of targeting the search feature on many Zope/Plone sites).

In order to protect against such attacks in general, we'd have to provide a way to control CPU time and e.g. raise an exception if too much time is being spent on a simple operation such as a key insertion. This can be done using timers, signals or even under OS control.

The easiest way to protect against the hash collision attack is by limiting the POST/GET/HEAD request size. The second best way would be to limit the number of parameters that a web framework accepts for POST/GET/HEAD request.

2. Changing the semantics of hashing in a dot release is not allowed.

If randomization of the hash start vector or some other method is enabled by default in a dot release, this will change the semantics of any application switching to that dot release.

The hash values of Python objects are not only used by the Python dictionary implementation, but also by other storage mechanisms such as on-disk dictionaries, inter-process object exchange via shared memory, memcache, etc.

Hence, if changed, the hash change should be disabled per default for dot releases and enabled for 3.3.

3. Changing the way strings are hashed doesn't solve the problem.

Hash values of other types can easily be guessed as well, e.g. take integers which use a trivial hash function. We'd have to adapt all hash functions of the basic types in Python or come up with a generic solution using e.g. double-hashing in the dictionary/set implementations.

4. By just using a random start vector you change the absolute hash values for specific objects, but not the overall hash sequence or its period.

An attacker only needs to create many hash collisions, not specific ones. It's the period of the hash function that's important in such attacks and that doesn't change when moving to a different start vector.

5. Hashing needs to be fast.

It's one of the most used operations in Python. Please get experts into the boat like Tim Peters and Christian Tismer, who both have worked on the dict implementation and the hash functions, before experimenting with ad-hoc fixes.

6. Counting collisions could solve the issue without having to change hashing.

Another idea would be counting the collisions and raising an exception if the number of collisions exceeds a certain threshold. Such a change would work for all hashable Python objects and protect against the attack without changing any hash function.

Thanks,
--
Marc-Andre Lemburg
eGenix.com

--
nosy: +lemburg
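The collision-counting idea from point 6 can be sketched with a toy table. This is not how CPython's dict works and is not taken from any patch on the issue: `CountingTable`, `TooManyCollisions`, the linear probing and the threshold of 8 are all assumptions chosen to keep the illustration short.

```python
class TooManyCollisions(Exception):
    """Raised when one insertion needs more probes than the allowed limit."""


class CountingTable:
    """Toy open-addressing hash table that caps collisions per insertion."""

    def __init__(self, size=64, max_collisions=8):
        self._slots = [None] * size
        self._max_collisions = max_collisions

    def insert(self, key, value):
        size = len(self._slots)
        index = hash(key) % size
        collisions = 0
        while True:
            slot = self._slots[index]
            if slot is None or slot[0] == key:
                self._slots[index] = (key, value)
                return collisions
            collisions += 1
            if collisions > self._max_collisions:
                raise TooManyCollisions(
                    "%d collisions while inserting %r" % (collisions, key)
                )
            index = (index + 1) % size  # linear probing, for simplicity


# Small ints hash to themselves, so multiples of 64 all land in slot 0.
table = CountingTable()
rejected = None
for key in [k * 64 for k in range(10)]:
    try:
        table.insert(key, None)
    except TooManyCollisions:
        rejected = key
        break
```

The hash function is untouched; only the insertion path notices that a single key needed an unusual number of probes and gives up.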
[issue12660] test_gdb fails when installed
Vinay Sajip vinay_sa...@yahoo.co.uk added the comment:

Pending the real fix, I've attached a patch to skip the test if it's not a source build.

--
keywords: +patch
nosy: +vinay.sajip
stage: needs patch -> patch review
Added file: http://bugs.python.org/file24139/test-gdb-patch.diff
[issue13712] test_packaging depends on hash order
Éric Araujo mer...@netwok.org added the comment: Thanks, I will check this. -- versions: +3rd party ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13712 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13713] Regression for http.client read()
Changes by Antoine Pitrou pit...@free.fr:

--
nosy: +Jon.Kuhn
priority: normal -> critical
[issue13703] Hash collision security issue
Marc-Andre Lemburg m...@egenix.com added the comment: Marc-Andre Lemburg wrote: 3. Changing the way strings are hashed doesn't solve the problem. Hash values of other types can easily be guessed as well, e.g. take integers which use a trivial hash function. Here's an example for integers on a 64-bit machine: g = ((x*(2**64 - 1), hash(x*(2**64 - 1))) for x in xrange(1, 100)) d = dict(g) This takes ages to complete and only uses very little memory. The input data has some 32MB if written down in decimal numbers - not all that much data either. 32397634 -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13703 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13703] Hash collision security issue
Marc-Andre Lemburg m...@egenix.com added the comment:

The email interface ate part of my reply:

>>> g = ((x*(2**64 - 1), hash(x*(2**64 - 1))) for x in xrange(1, 100))
>>> s = ''.join(str(x) for x in g)
>>> len(s)
32397634
>>> g = ((x*(2**64 - 1), hash(x*(2**64 - 1))) for x in xrange(1, 100))
>>> d = dict(g)
... lots of time for coffee, pizza, taking a walk, etc. :-)
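The session above is Python 2 (`xrange`, the old long-integer hash). On Python 3 integers are hashed by reduction modulo `sys.hash_info.modulus`, so those particular keys would not be expected to collide there. As an adaptation that is not part of the original message, the following sketch shows that guessable integer collisions still exist: every multiple of the modulus hashes to 0.

```python
import sys

# CPython 3 reduces non-negative ints modulo this prime to get the hash.
MODULUS = sys.hash_info.modulus

# Every multiple of the modulus therefore hashes to 0.
colliding_keys = [k * MODULUS for k in range(1, 1001)]
distinct_hashes = {hash(n) for n in colliding_keys}

# Building a dict from such keys still works, but each insertion has to
# compare against the keys already stored with the same hash.
d = dict.fromkeys(colliding_keys)
```

The key count is kept small here so the dict builds quickly; the cost grows roughly quadratically with the number of colliding keys.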
[issue13703] Hash collision security issue
Terry J. Reedy tjre...@udel.edu added the comment: To expand on Marc-Andre's point 1: the DOS attack on web servers is possible because servers are generally dumb at the first stage. Upon receiving a post request, all key=value pairs are mindlessly packaged into a hash table that is then passed on to a page handler that typically ignores the invalid keys. However, most pages do not need any key,value pairs and forms that do have a pre-defined set of expected and recognized keys. If there were a possibly empty set of keys associated with each page, and the set were checked against posted keys, then a DOS post with thousands of effectively random keys could quickly (in O(1) time) be rejected as erroneous. In Python, the same effect could be accomplished by associating a class with slots with each page and having the server create an instance of the class. Attempts to create an undefined attribute would then raise an exception. Either way, checking input data for face validity before processing it in a time-consuming way is one possible solution for nearly all web pages and at least some other applications. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13703 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
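Both suggestions in this message can be sketched briefly. Everything named here (`EXPECTED_KEYS`, `BadRequest`, `validate_keys`, `LoginForm`, the example paths) is hypothetical and not taken from any real framework; the sketch only shows posted keys being checked against a small per-page set before anything is stored.

```python
class BadRequest(Exception):
    """Raised as soon as a posted key is not expected by the page."""


# Hypothetical per-page sets of recognised keys; most pages need none.
EXPECTED_KEYS = {
    "/login": frozenset({"user", "password"}),
    "/about": frozenset(),
}


def validate_keys(path, pairs):
    """Accept (key, value) pairs only if every key is expected for path."""
    allowed = EXPECTED_KEYS.get(path, frozenset())
    accepted = {}
    for key, value in pairs:
        if key not in allowed:  # lookup in a small, trusted set
            raise BadRequest("unexpected key %r for %s" % (key, path))
        accepted[key] = value
    return accepted


class LoginForm:
    """Variant using __slots__: unknown attributes cannot be created."""

    __slots__ = ("user", "password")


form = LoginForm()
form.user = "alice"
try:
    setattr(form, "injected_key", "x")
    slots_rejected = False
except AttributeError:
    slots_rejected = True
```

In both variants attacker-chosen keys are never inserted into a growing hash table; the first unknown key ends processing of the request.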
[issue13703] Hash collision security issue
Alex Gaynor alex.gay...@gmail.com added the comment:

Except, it's a totally non-scalable approach. People have vulnerabilities all over their sites which they don't realize. Some examples:

django-taggit (an application I wrote for handling tags) parses tags out of an input, it stores these in a set to check for duplicates. It's vulnerable.

Another site I'm writing accepts JSON POSTs, you can put arbitrary keys in the JSON. It's vulnerable.
[issue13713] Regression for http.client read()
Antoine Pitrou pit...@free.fr added the comment:

The fix is quite trivial. Here is a patch + tests.

--
keywords: +patch
stage: needs patch -> patch review
Added file: http://bugs.python.org/file24140/readinto_chunked.patch
[issue13713] Regression for http.client read()
Ross Lagerwall rosslagerw...@gmail.com added the comment: The patch looks right and seems to fix the issue. Thanks :-) -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13713 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13703] Hash collision security issue
Marc-Andre Lemburg m...@egenix.com added the comment:

Marc-Andre Lemburg wrote:
> 1. The security implications in all this is being somewhat overemphasized.
>
> There are many ways you can do a DoS attack on web servers. It's the responsibility of the used web frameworks and servers to deal with the possible cases.
>
> It's a good idea to provide some way to protect against hash collision attacks, but that will only solve one possible way of causing a resource attack on a server. There are other ways you can generate lots of CPU overhead with little data input (e.g. think of targeting the search feature on many Zope/Plone sites).
>
> In order to protect against such attacks in general, we'd have to provide a way to control CPU time and e.g. raise an exception if too much time is being spent on a simple operation such as a key insertion. This can be done using timers, signals or even under OS control.
>
> The easiest way to protect against the hash collision attack is by limiting the POST/GET/HEAD request size.

For GET and HEAD, web servers normally already apply such limitations at rather low levels:

http://stackoverflow.com/questions/686217/maximum-on-http-header-values

So only HTTP methods which carry data in the body part of the HTTP request are affected, e.g. POST and various WebDAV methods.

> The second best way would be to limit the number of parameters that a web framework accepts for POST/GET/HEAD request.

Depending on how parsers are implemented, applications taking XML/JSON/XML-RPC/etc. as data input may also be vulnerable, e.g. non validating XML parsers which place element attributes into a dictionary or a JSON parser that has to read the JSON version of the dict I generated earlier on.
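Limiting the number of accepted parameters can be sketched with the standard library as it exists in later Python versions, where `urllib.parse.parse_qsl` takes a `max_num_fields` argument. That argument did not exist when this thread was written, and the helper name `parse_form` and the limit of 100 are assumptions made for illustration.

```python
from urllib.parse import parse_qsl

MAX_FIELDS = 100  # hypothetical limit; a real application would tune this


def parse_form(body, max_fields=MAX_FIELDS):
    """Parse a urlencoded body, or return None if it has too many fields."""
    try:
        return parse_qsl(body, max_num_fields=max_fields)
    except ValueError:
        # parse_qsl refuses the input before building any large structure.
        return None


small = parse_form("a=1&b=2")
oversized = parse_form("&".join("k%d=v" % i for i in range(1000)))
```

The oversized body is rejected as a whole, so none of its keys reach a dictionary.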
[issue13464] HTTPResponse is missing an implementation of readinto
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset 4b21f651 by Antoine Pitrou in branch 'default': Issue #13713: fix a regression in HTTP chunked reading after 806cfe39f729 http://hg.python.org/cpython/rev/4b21f651 -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13464 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13713] Regression for http.client read()
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset 4b21f651 by Antoine Pitrou in branch 'default': Issue #13713: fix a regression in HTTP chunked reading after 806cfe39f729 http://hg.python.org/cpython/rev/4b21f651 -- nosy: +python-dev ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13713 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13713] Regression for http.client read()
Antoine Pitrou pit...@free.fr added the comment:

Ok, committed! (Jon, don't worry, such things happen :-))

--
resolution: -> fixed
stage: patch review -> committed/rejected
status: open -> closed
[issue7098] g formatting for decimal types should always strip trailing zeros.
Stefan Krah stefan-use...@bytereef.org added the comment: [Mark] So I think the current code is correct. I agree with this. Currently the 'g' format is like to_sci_string() with the added possibility of adjusting the number of significant digits. It's probably hard to come up with a better way to handle this for Decimal. -- nosy: +skrah ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue7098 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11648] openlog()s 'logopt' keyword broken in syslog module
Sandro Tosi sandro.t...@gmail.com added the comment:

This has already been fixed with 71f7175e2b34 and friends.

--
nosy: +sandro.tosi
resolution: -> fixed
stage: -> committed/rejected
status: open -> closed
versions: -Python 3.4
[issue10772] Several actions for argparse arguments missing from docs
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset 278fbd7b9608 by Sandro Tosi in branch '2.7': Issue #10772: add count and help argparse action; patch by Marc Sibson http://hg.python.org/cpython/rev/278fbd7b9608 New changeset 326f755962e3 by Sandro Tosi in branch '3.2': Issue #10772: add count and help argparse action; patch by Marc Sibson http://hg.python.org/cpython/rev/326f755962e3 -- nosy: +python-dev ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10772 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10772] Several actions for argparse arguments missing from docs
Sandro Tosi sandro.t...@gmail.com added the comment: Thanks Marc for the patch, I've just committed it. -- resolution: -> fixed stage: commit review -> committed/rejected status: open -> closed
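For readers unfamiliar with the two actions that the doc patch covers, here is a minimal sketch. The program name "demo" and the option names are made up for illustration; only the action="count" and action="help" values come from argparse itself.

```python
import argparse

# "demo", "--verbose" and "--usage" are hypothetical names for this sketch
parser = argparse.ArgumentParser(prog="demo", add_help=False)

# 'count' stores how many times the option was given
parser.add_argument("-v", "--verbose", action="count", default=0)

# 'help' prints the help text and exits; it is normally added
# automatically as -h/--help unless add_help=False is passed
parser.add_argument("--usage", action="help")

args = parser.parse_args(["-vvv"])
no_flags = parser.parse_args([])
print(args.verbose, no_flags.verbose)
```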
[issue13641] decoding functions in the base64 module could accept unicode strings
Berker Peksag berker.pek...@gmail.com added the comment: Hi Antoine, I added some tests for the b64decode function. I also wrote some tests for the b32decode and b16decode functions, and they failed. I think my patch is not working for b32decode and b16decode. I'll dig into the code and try to find a way. Thanks! -- Added file: http://bugs.python.org/file24141/issue13641_v2_with_tests.diff
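The behaviour requested in this issue can be sketched as follows. This assumes a Python version in which the decoding functions accept ASCII-only str input; on versions without that change the str calls raise TypeError instead. The payload "hello" is an arbitrary example.

```python
import base64

# The same payload ("hello") in three encodings, passed as str
from_b64 = base64.b64decode("aGVsbG8=")
from_b32 = base64.b32decode("NBSWY3DP")
from_b16 = base64.b16decode("68656C6C6F")

# Passing bytes works on every version
from_bytes = base64.b64decode(b"aGVsbG8=")

print(from_b64, from_b32, from_b16, from_bytes)
```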
[issue13703] Hash collision security issue
STINNER Victor victor.stin...@haypocalc.com added the comment:
Work-in-progress patch implementing my randomized hash function (random.patch):
- add PyOS_URandom() using CryptoGen, SSL (only on VMS!!) or /dev/urandom, with a fallback on a dummy LCG if the OS urandom failed
- posix.urandom() is always defined and reuses PyOS_URandom()
- hash(str) is now randomized using two random Py_hash_t values: don't touch the critical loop, only add a prefix and a suffix
Notes:
- PyOS_URandom() mostly reuses code from Modules/posixmodule.c, except dev_urandom() and fallback_urandom() which are new
- I removed memset(PyBytes_AS_STRING(result), 0, howMany); from win32_urandom() because it doesn't really change anything: the LCG is used if win32_urandom() fails
- Python refuses to start if the OS urandom is missing.
- Python/random.c code may be moved into Python/pythonrun.c if it is an issue to add a new file in old Python versions.
- If the OS urandom fails to generate the unicode hash secret, no warning is emitted (because the LCG is used). I don't know if a warning is needed in this case.
- os.urandom() argument is now a Py_ssize_t instead of an int
TODO:
- add an environment option to ignore the OS urandom and only use the LCG
- fix all tests broken because of the randomized hash(str)
- PyOS_URandom() raises exceptions whereas it is called before creating the interpreter state. I suppose that it cannot work like this.
- review and test PyOS_URandom()
- review and test the new randomized hash(str)
-- keywords: +patch Added file: http://bugs.python.org/file24142/random.patch
[issue13703] Hash collision security issue
STINNER Victor victor.stin...@haypocalc.com added the comment:
> add PyOS_URandom() using CryptoGen, SSL (only on VMS!!) or /dev/urandom
Oh, OpenSSL (RAND_pseudo_bytes) should be used on Windows, Linux, Mac OS X, etc. if OpenSSL is available. I was just too lazy to add a define or pyconfig.h option to indicate if OpenSSL is available or not. FYI RAND_pseudo_bytes() is now exposed in the ssl module of Python 3.3.
> with a fallback on a dummy LCG
It's the linear congruential generator (LCG) used by Microsoft Visual C++ and PHP: x(n+1) = (x(n) * 214013 + 2531011) % 2^32
I only use bits 23..16 (bits 15..0 are not really random).
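The recurrence quoted in this message can be written out in a few lines of Python. This is only a sketch of the formula as stated, not the code from random.patch; the function name lcg_bytes and the seed value are made up for illustration.

```python
def lcg_bytes(seed, count):
    """Yield `count` bytes from x(n+1) = (x(n) * 214013 + 2531011) % 2^32."""
    x = seed & 0xFFFFFFFF
    for _ in range(count):
        x = (x * 214013 + 2531011) % 2**32
        # keep bits 23..16 only; the low 16 bits are discarded
        yield (x >> 16) & 0xFF


# hypothetical seed; a real fallback would have to pick one at startup
sample = list(lcg_bytes(1, 2))
print(sample)
```

The whole output stream is determined by the seed, so such a generator can only serve as a last-resort fallback and not as a source of cryptographic randomness.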
[issue13703] Hash collision security issue
Antoine Pitrou pit...@free.fr added the comment:
>> add PyOS_URandom() using CryptoGen, SSL (only on VMS!!) or /dev/urandom
> Oh, OpenSSL (RAND_pseudo_bytes) should be used on Windows, Linux, Mac OS X, etc. if OpenSSL is available.
Apart from the large dependency, the OpenSSL license is not GPL-compatible which may be a problem for some Python-embedding applications: http://en.wikipedia.org/wiki/OpenSSL#Licensing
>> with a fallback on a dummy LCG
> It's the linear congruential generator (LCG) used by Microsoft Visual C++ and PHP: x(n+1) = (x(n) * 214013 + 2531011) % 2^32 I only use bits 23..16 (bits 15..0 are not really random).
If PHP uses it, I'm confident it is secure.
[issue13703] Hash collision security issue
STINNER Victor victor.stin...@haypocalc.com added the comment:
> +printf("read %i bytes\n", size);
Oops, I forgot a debug message.
[issue13703] Hash collision security issue
STINNER Victor victor.stin...@haypocalc.com added the comment:
> If PHP uses it, I'm confident it is secure.
If I remember correctly, it is only used for the Windows version of PHP, but PHP doesn't implement it correctly because it uses all bits.
[issue13703] Hash collision security issue
Paul McMillan p...@mcmillan.ws added the comment:
This is not something that can be fixed by limiting the size of POST/GET. Parsing documents (even offline) can generate these problems. I can create books that calibre (a Python-based ebook format shifting tool) can't convert, but are otherwise perfectly valid for non-Python devices. If I'm allowed to insert usernames into a database and you ever retrieve those in a dict, you're vulnerable. If I can post things one at a time that eventually get parsed into a dict (like the tag example), you're vulnerable. I can generate web traffic that creates log files that are unparsable (even offline) in Python if dicts are used anywhere. Any application that accepts data from users needs to be considered.
Even if the web framework has a dictionary implementation that randomizes the hashes so it's not vulnerable, the entire Python standard library uses dicts all over the place. If this is a problem which must be fixed by the framework, they must reinvent every standard library function they hope to use. Any non-trivial Python application which parses data needs the fix. The entire standard library needs the fix if it is to be relied upon by applications which accept data. It makes sense to fix Python.
Of course we must fix all the basic hashing functions in Python, not just the string hash. There aren't that many.
Marc-Andre: If you look at my proposed code, you'll notice that we do more than simply shift the period of the hash. It's not trivial for an attacker to create colliding inputs without knowing the key.
Since speed is a concern, I think that the proposal to avoid using the random hash for short strings is a good idea. Additionally, randomizing only some of the characters in longer strings will allow us to improve security without compromising speed significantly. I suggest that we don't randomize strings shorter than 6 characters. For longer strings, we randomize the first and last 5 characters. This means we're only adding additional work to a max of 10 rounds of the hash, and only for longer strings. Collisions with the hash from short strings should be minimal.
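Why colliding keys hurt any code path that builds a dict can be illustrated with a small experiment. This sketch does not craft colliding strings the way a real attack would; it swaps in a hypothetical key class whose __hash__ is constant, which forces the same worst case and makes the cost visible by counting equality comparisons.

```python
class CollidingKey:
    """Hypothetical key type whose instances all share one hash value."""

    eq_calls = 0  # class-wide counter of __eq__ invocations

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # every key follows the same probe sequence

    def __eq__(self, other):
        CollidingKey.eq_calls += 1
        return self.name == other.name


def comparisons_for(n):
    """Insert n distinct colliding keys into a dict and count __eq__ calls."""
    CollidingKey.eq_calls = 0
    table = {}
    for i in range(n):
        table[CollidingKey(i)] = None
    return CollidingKey.eq_calls


small = comparisons_for(100)
large = comparisons_for(200)
print(small, large)
```

Each new key has to be compared against every key already stored, so doubling the number of keys roughly quadruples the work. With well-distributed hashes the same inserts would need almost no comparisons.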
[issue13703] Hash collision security issue
STINNER Victor victor.stin...@haypocalc.com added the comment:
> Since speed is a concern, I think that the proposal to avoid using the random hash for short strings is a good idea.
My proposition only adds two XORs to hash(str) (outside the loop on Unicode characters), so I expect a negligible overhead. I don't know yet how hard it is to guess the secret from hash(str) output.
[issue13703] Hash collision security issue
Antoine Pitrou pit...@free.fr added the comment:
> You aren't special casing small strings. I fear that an attacker may guess the seed from several small strings.
How would (s)he do that?
[issue13699] test_gdb has recently started failing
STINNER Victor victor.stin...@haypocalc.com added the comment:
> New changeset dfffb293f4b3 by Vinay Sajip in branch 'default'
The fix should also be applied to 3.2.
-- resolution: fixed -> status: closed -> open
[issue13703] Hash collision security issue
Christian Heimes li...@cheimes.de added the comment:
Thanks Victor!
> - hash(str) is now randomized using two random Py_hash_t values: don't touch the critical loop, only add a prefix and a suffix
At least for Python 2.x hash(str) and hash(unicode) have to yield the same result for ASCII only strings.
> - PyOS_URandom() raises exceptions whereas it is called before creating the interpreter state. I suppose that it cannot work like this.
My patch compensates for the issue and calls Py_FatalError() when the random seed hasn't been initialized yet.
You aren't special casing small strings. I fear that an attacker may guess the seed from several small strings. How about using another initial seed for strings shorter than 4 code points?
[issue13703] Hash collision security issue
Paul McMillan p...@mcmillan.ws added the comment:
> My proposition only adds two XOR to hash(str) (outside the loop on Unicode characters), so I expect a ridiculous overhead. I don't know yet how hard it is to guess the secret from hash(str) output.
It doesn't work much better than a single random seed. Calculating the hash of a null byte gives you the xor of your two seeds. An attacker can still cause collisions inside the vulnerable hash function; your change doesn't negate those internal collisions. Also, strings of all null bytes collide trivially.
[issue13703] Hash collision security issue
STINNER Victor victor.stin...@haypocalc.com added the comment:
> I fear that an attacker may guess the seed from several small strings
hash(a) ^ hash(b) removes the suffix, but I don't see how to guess the prefix from this new value. It doesn't mean that it is not possible, just that I don't have a strong background in cryptography :-)
I don't expect that adding 2 XORs would change our dummy (fast but unsafe) hash function into a cryptographic hash function. We cannot have security for free. If we want a strong cryptographic hash function, it would be much slower (Paul wrote that it would be 4x slower). But we prefer speed over security, so we have to make a compromise.
I don't know if you can retrieve hash values in practice. I suppose that you can only get hash(str) & (size - 1) with size=size of the dict internal array, so only the lower bits. Using a large dict, you may be able to retrieve more bits of the hash value.
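The point about only the lower bits being observable can be sketched like this. The helper slot_index and the two hash values are made up for illustration, and the sketch models only the initial slot of a power-of-two table; CPython's real probing also mixes in higher bits on collisions, which this ignores.

```python
def slot_index(hash_value, table_size):
    """Initial slot for a hash in a table whose size is a power of two."""
    assert table_size & (table_size - 1) == 0, "size must be a power of two"
    return hash_value & (table_size - 1)


h1 = 0x12345678
h2 = 0xABCD5678  # made-up value sharing only its low 16 bits with h1

# In a small table both hashes land in the same slot, so the slot
# reveals nothing about the bits in which they differ.
same_small = slot_index(h1, 8) == slot_index(h2, 8)

# A larger table exposes more bits, and the two hashes now separate.
same_large = slot_index(h1, 2**20) == slot_index(h2, 2**20)

print(same_small, same_large)
```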
[issue13703] Hash collision security issue
Christian Heimes li...@cheimes.de added the comment:
Given that a user has an application with an oracle function that returns the hash of a unicode string, an attacker can probe tens of thousands of one- and two-character unicode strings. That should give him/her enough data to calculate both seeds. hash("") already gives away lots of information about the seeds, too.
- hash("") should always return 0
- for small strings we could use a different seed than for larger strings
- for larger strings we could use Paul's algorithm but limit the XOR op to the first and last 16 elements instead of all elements.
[issue13703] Hash collision security issue
Paul McMillan p...@mcmillan.ws added the comment:
> - for small strings we could use a different seed than for larger strings
Or just leave them unseeded with our existing algorithm. Shifting them into a different part of the hash space doesn't really gain us much.
> - for larger strings we could use Paul's algorithm but limit the XOR op to the first and last 16 elements instead of all elements.
Agreed. It does have to be both the first and the last though. We can't just do one or the other.
[issue13703] Hash collision security issue
Christian Heimes li...@cheimes.de added the comment:
Paul wrote:
> I suggest that we don't randomize strings shorter than 6 characters. For longer strings, we randomize the first and last 5 characters. This means we're only adding additional work to a max of 10 rounds of the hash, and only for longer strings. Collisions with the hash from short strings should be minimal.
It's too surprising for developers when just the strings with 6 or more chars are randomized. Barry made a good point http://bugs.python.org/issue13703#msg150613
[issue13703] Hash collision security issue
STINNER Victor victor.stin...@haypocalc.com added the comment:
> Calculating the hash of a null byte gives you the xor of your two seeds.
Not directly, because the prefix is first multiplied by 1000003. So hash("\0") gives you (prefix*1000003) % 2^32 ^ suffix. Example:
$ ./python
secret={b7abfbbf, db6cbb4d}
Python 3.3.0a0 (default:547e918d7bf5+, Jan 5 2012, 01:36:39)
>>> hash("")
1824997618
>>> hash("\0")
-227042383
>>> hash("\0"*2)
1946249080
>>> 0xb7abfbbf ^ 0xdb6cbb4d
1824997618
>>> (0xb7abfbbf * 1000003) & 0xffffffff ^ 0xdb6cbb4d
4067924912
>>> hash("\0") & 0xffffffff
4067924913
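The numbers in this session can be reproduced with a toy model of a prefix/suffix-randomized string hash. The model below is an assumption, not the code from random.patch: it follows the shape of CPython's classic string hash (multiplier 1000003, first character shifted by 7, length XORed in at the end) with a 32-bit hash width, and the function names are made up.

```python
MASK = 0xFFFFFFFF  # assumes a 32-bit Py_hash_t, as the quoted values suggest


def randomized_hash(data, prefix, suffix):
    """Toy prefix/suffix-randomized string hash; returns an unsigned value."""
    x = prefix
    if data:
        x ^= ord(data[0]) << 7
    for ch in data:
        x = ((1000003 * x) ^ ord(ch)) & MASK
    x ^= len(data)  # the length is mixed in before the suffix
    x ^= suffix
    return x & MASK


def to_signed(value):
    """Interpret an unsigned 32-bit value as a signed one, like hash() does."""
    return value - 2**32 if value >= 2**31 else value


PREFIX, SUFFIX = 0xB7ABFBBF, 0xDB6CBB4D  # the secret shown in the session

h_empty = randomized_hash("", PREFIX, SUFFIX)
h_nul = to_signed(randomized_hash("\0", PREFIX, SUFFIX))
h_nul2 = randomized_hash("\0" * 2, PREFIX, SUFFIX)
print(h_empty, h_nul, h_nul2)
```

Under these assumptions the empty string yields exactly prefix ^ suffix, and the difference of 1 between 4067924912 and 4067924913 in the session is the string length being XORed in. Because the suffix is applied last, XORing any two hash values cancels it regardless of its value.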
[issue13703] Hash collision security issue
STINNER Victor victor.stin...@haypocalc.com added the comment:
> At least for Python 2.x hash(str) and hash(unicode) have to yield the same result for ASCII only strings.
Ah yes, I forgot Python 2: I wrote my patch for Python 3.3. The two hash functions should be modified to be randomized.
> hash("") should always return 0
Ok, I can add a special case. Antoine told me that hash("") gives prefix ^ suffix, which is too much information for the attacker :-)
> for small strings we could use a different seed than for larger strings
Why? The attack doesn't work with short strings? What do you call a short string?
[issue13703] Hash collision security issue
STINNER Victor victor.stin...@haypocalc.com added the comment:
Patch version 2:
- hash("") is always 0
- Remove a debug message
-- Added file: http://bugs.python.org/file24143/random-2.patch
[issue13703] Hash collision security issue
Christian Heimes li...@cheimes.de added the comment:
In reply to MAL's message http://bugs.python.org/issue13703#msg150616
> 2. Changing the semantics of hashing in a dot release is not allowed.
I concur with Marc. The change is too intrusive and may cause too much trouble for the issue. Also it seems to be unnecessary for platforms with 64bit hash. Marc: Fred told me that ZODB isn't affected. One thing less to worry about. ;)
> 5. Hashing needs to be fast.
Good point, we should include Tim and Christian Tiesmer once we have a solution we can agree upon.
PS: I'm missing "Reply to message" and a threaded view for lengthy topics
[issue13712] pysetup create should not convert package_data to extra_files
Éric Araujo mer...@netwok.org added the comment:
The bug is caused by code in packaging.create that iterates over a dict (package_data) to extend a list (extra_files). Instead of just calling sorted to make output deterministic, I’d prefer to fix that more serious behavior bug (see also #13463, #11805 and #5302 for more !fun package_data bugs).
Problem is that the setup.cfg syntax does not define how to give more than one value. If it’s judged acceptable to disallow paths with embedded spaces, we could do something like this:
[files] package_data = spam = first second third
Otherwise we’d need to use multiple lines (requested in #5302):
[files] package_data = spam = first spam = second spam = third
We probably don’t want that. An intermediate idea:
[files] package_data = spam = first second third
Not sure this would be the nicest thing for people to write, and for us (me) to extend the setup.cfg parser for.
Anyway, attached patch fixes the code so that package_data in setup.py becomes package_data in setup.cfg and adapts the tests to check that, disabling multi-value package_data for now. I tested it with distutils2 and pypy, so it should fix the hash change in your clone.
-- keywords: +patch nosy: +erik.bray title: test_packaging depends on hash order -> pysetup create should not convert package_data to extra_files Added file: http://bugs.python.org/file24144/fix-pysetup-create-package_data.diff
[issue13714] Methods of ftplib never ends if the ip address changes
New submission from Sworddragon sworddrag...@aol.com: If a client gets reconnected and receives a new IP address from the provider, the methods of ftplib can't handle this and hang in an infinite loop. For example, if a file is transferred with storbinary() and the client gets a new IP address, the script never ends. I'm using Linux kernel 3.2 on a 64-bit system; Python 2.7 is affected too. -- components: Library (Lib) messages: 150654 nosy: Sworddragon priority: normal severity: normal status: open title: Methods of ftplib never ends if the ip address changes type: behavior versions: Python 3.2
[issue13703] Hash collision security issue
Huzaifa Sidhpurwala sidhpurwala.huza...@gmail.com added the comment: I am wondering if a CVE id has been assigned to this security issue yet? -- nosy: +Huzaifa.Sidhpurwala