[issue16047] Tools/freeze no longer works in Python 3

2012-09-25 Thread Marc-Andre Lemburg
New submission from Marc-Andre Lemburg: The freeze tool used for compiling Python binaries with frozen modules no longer works with Python 3.x. It looks like it was never updated for the various path and symbol changes introduced with PEP 3149 (ABI tags) in Python 3.2. Even with lots

[issue16027] pkgutil doesn't support frozen modules

2012-09-24 Thread Marc-Andre Lemburg
New submission from Marc-Andre Lemburg: pkgutil is used by runpy to run Python modules that are loaded via the -m command line switch. Unfortunately, this doesn't work for frozen modules, since pkgutil doesn't know how to load their code object (this can be had via imp.get_code_object

[issue16027] pkgutil doesn't support frozen modules

2012-09-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Correction: the helper function is called imp.get_frozen_object(). -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue16027

[issue15443] datetime module has no support for nanoseconds

2012-07-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Vincenzo Ampolo wrote: Vincenzo Ampolo vincenzo.amp...@gmail.com added the comment: This is a real use case I'm working with that needs nanosecond precision and led me to submit this request: most OSes let users capture network

[issue15444] Incorrectly written contributor's names

2012-07-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Thank you for taking the initiative. Regarding use of UTF-8 for text files: I think we ought to acknowledge that UTF-8 has become the de facto standard for non-ASCII text files by now and with Python 3 being all Unicode, it feels silly

[issue15443] datetime module has no support for nanoseconds

2012-07-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Marc-Andre Lemburg wrote: Alexander Belopolsky alexander.belopol...@gmail.com added the comment: On Wed, Jul 25, 2012 at 4:17 AM, Marc-Andre Lemburg rep...@bugs.python.org wrote: ... full C double precision for the time part

[issue15443] datetime module has no support for nanoseconds

2012-07-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky alexander.belopol...@gmail.com added the comment: On Wed, Jul 25, 2012 at 4:17 AM, Marc-Andre Lemburg rep...@bugs.python.org wrote: ... full C double precision for the time part

[issue15443] datetime module has no support for nanoseconds

2012-07-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: [Roundup's email interface again...]
>>> x = 86400.0
>>> x == x + 1e-9
False
>>> x == x + 1e-10
False
>>> x == x + 1e-11
False
>>> x == x + 1e-12
True
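The comparison in this comment can be reproduced and explained with a short sketch. It assumes Python 3.9 or later for `math.ulp()`, which is not mentioned in the thread and is used here only to show the spacing between adjacent doubles near 86400.0.

```python
import math

x = 86400.0  # number of seconds in a day, stored as a C double

# Spacing between adjacent doubles near 86400.0 (2**-36, roughly 1.46e-11).
spacing = math.ulp(x)
print(spacing)

for delta in (1e-9, 1e-10, 1e-11, 1e-12):
    # An increment smaller than half the spacing is rounded away entirely,
    # so the sum compares equal to x.
    print(delta, x == x + delta)
```

An increment of 1e-12 is below half the spacing and is lost, which is why the last comparison is the only one that yields True.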

[issue15443] datetime module has no support for nanoseconds

2012-07-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Vincenzo Ampolo wrote: As long as computers evolve time management becomes more precise and more granular. Unfortunately the standard datetime module is not able to deal with nanoseconds even if OSes are able to. For example if i do

[issue15369] pybench and test.pystone poorly documented

2012-07-17 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Brett Cannon wrote: Brett Cannon br...@python.org added the comment: I disagree. They are outdated benchmarks and probably should either be removed or left undocumented. Proper testing of performance is with the Unladen Swallow

[issue1294959] Problems with /usr/lib64 builds.

2012-05-15 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Éric Araujo wrote: Éric Araujo mer...@netwok.org added the comment: On Mar 29, 2011, at 10:12 PM, Matthias Klose wrote: no, it looks for headers and libraries in more directories. But really, this whole testing for paths is wrong

[issue14572] 2.7.3: sqlite module does not build on centos 5 and Mac OS X 10.4

2012-05-04 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Mac OS X 10.4 is also affected and for the same reason. SQLite builds fine for Python 2.5 and 2.6, but not for 2.7. -- nosy: +lemburg title: 2.7.3: sqlite module does not build on centos 5 - 2.7.3: sqlite module does not build

[issue14605] Make import machinery explicit

2012-04-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Brett Cannon wrote: You can see a little discussion in http://bugs.python.org/issue14642, but it has been discussed elsewhere and the automatic rebuilding was preferred (but it is not a requirement to build as importlib.h is in hg

[issue14657] Avoid two importlib copies

2012-04-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: Code to detect whether you're running off a checkout vs. a normal installation by looking at even more directories ? I don't see any in getpath.c (and that's good

[issue14657] Avoid two importlib copies

2012-04-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: Look for pybuilddir.txt. Oh dear. Another one of those hacks... why wasn't this done using constants passed in by the configure script and simple string

[issue14657] Avoid two importlib copies

2012-04-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: The question pybuilddir.txt apparently tries to solve is whether Python is running from the build dir or not. It's not whether Python was installed

[issue14657] Avoid two importlib copies

2012-04-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Nick Coghlan wrote: Nick Coghlan ncogh...@gmail.com added the comment: At the very least, failing to regenerate importlib.h shouldn't be a fatal build error. It should just run with what it's got, and hopefully you will get a working

[issue14657] Avoid two importlib copies

2012-04-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Marc-Andre Lemburg wrote: Marc-Andre Lemburg m...@egenix.com added the comment: Nick Coghlan wrote: Nick Coghlan ncogh...@gmail.com added the comment: At the very least, failing to regenerate importlib.h shouldn't be a fatal build

[issue14657] Avoid two importlib copies

2012-04-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: This would also mean that changes to importlib._bootstrap would actually take effect for user code almost immediately, *without* rebuilding Python, as the frozen

[issue14605] Make import machinery explicit

2012-04-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Brett Cannon wrote: I am not exposing SourcelessFileLoader because importlib publicly tries to discourage the shipping of .pyc files w/o their corresponding source files. Otherwise all objects as used by importlib for performing imports

[issue14657] Avoid two importlib copies

2012-04-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: test method. Another option is we hide the source as _importlib or something to allow direct importation w/o any tricks under a protected name. Using the freeze everything approach you make things easier for the implementation, since you

[issue14605] Make import machinery explicit

2012-04-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Brett Cannon wrote: That initial comment is out-of-date. If you look at the commit I made I documented importlib.machinery._SourcelessFileLoader. I am continuing the discouragement of using bytecode files as an obfuscation technique

[issue14657] Avoid two importlib copies

2012-04-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Brett Cannon wrote: Brett Cannon br...@python.org added the comment: I don't quite follow what you are suggesting, MAL. Are you saying to freeze importlib.__init__ and importlib._bootstrap and somehow have importlib.__init__ choose

[issue14657] Avoid two importlib copies

2012-04-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Brett Cannon wrote: Brett Cannon br...@python.org added the comment: So basically if you are running in a checkout, grab the source file and compile it manually since its location is essentially hard-coded and thus you don't need

[issue14657] Avoid two importlib copies

2012-04-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Brett Cannon wrote: Modules/getpath.c seems to be where the C code does it when getting paths for sys.path. So it would be possible to use that same algorithm to set some sys attribute (e.g. in_checkout or something) much like

[issue14657] Avoid two importlib copies

2012-04-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Antoine Pitrou wrote: Adding more cruft to getpath.c or similar routines is just going to slow down startup time even more... The code is already there. Code to detect whether you're running off a checkout vs. a normal installation

[issue14605] Make import machinery explicit

2012-04-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Brett Cannon wrote: I documented it explicitly so people can use it if they so choose (e.g. look at sys._getframe()). If you want to change this that's fine, but I am personally not going to put the effort in to rename the class

[issue14605] Make import machinery explicit

2012-04-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: R. David Murray wrote: R. David Murray rdmur...@bitdance.com added the comment: Hmm. Some at least of the buildbots have failed to build after that patch: ./python ./Python/freeze_importlib.py \ ./Lib/importlib/_bootstrap.py

[issue14605] Make import machinery explicit

2012-04-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Marc-Andre Lemburg wrote: Looking further I found this line in the Makefile: # Importlib Python/importlib.h: $(srcdir)/Lib/importlib/_bootstrap.py $(srcdir

[issue14423] Getting the starting date of iso week from a week number and a year.

2012-04-22 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Mark Dickinson wrote: By the way, I don't think the algorithm used in the current patch is correct. For 'date.from_iso_week(2009, 1)' I get 2009/1/1, which was a Thursday. The documentation seems to indicate that a Monday should
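The point about ISO weeks starting on a Monday can be checked with a small sketch. The helper name `iso_week_start` is hypothetical and is not the function from the patch under discussion; it relies only on the ISO 8601 rule that January 4th always falls in week 1.

```python
from datetime import date, timedelta

def iso_week_start(iso_year, iso_week):
    """Return the Monday that starts the given ISO 8601 week (hypothetical helper)."""
    # January 4th always lies in ISO week 1 of its ISO year.
    jan4 = date(iso_year, 1, 4)
    # Step back to the Monday of that week (isoweekday(): Monday == 1).
    week1_monday = jan4 - timedelta(days=jan4.isoweekday() - 1)
    return week1_monday + timedelta(weeks=iso_week - 1)

start = iso_week_start(2009, 1)
print(start, start.isoweekday())
```

For 2009, week 1, this gives 2008-12-29, a Monday in the previous calendar year, rather than Thursday 2009-01-01.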

[issue13994] incomplete revert in 2.7 Distutils left two copies of customize_compiler

2012-04-20 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Ned Deily wrote: Ned Deily n...@acm.org added the comment: That's unfortunate. But the documented location for customize_compiler is and, AFAIK, had always been in distutils.sysconfig. It was an inadvertent consequence of the bad

[issue13994] incomplete revert in 2.7 Distutils left two copies of customize_compiler

2012-04-20 Thread Marc-Andre Lemburg
Changes by Marc-Andre Lemburg m...@egenix.com: resolution: fixed -> (unset)

[issue13994] incomplete revert in 2.7 Distutils left two copies of customize_compiler

2012-04-20 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Ned Deily wrote: And to recap the history here, there was a change in direction for Distutils during the 2.7 development cycle, as decided at the 2010 language summit, in particular to revert feature changes in Distutils for 2.7 to its

[issue13994] incomplete revert in 2.7 Distutils left two copies of customize_compiler

2012-04-20 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Marc-Andre Lemburg wrote: Ned Deily n...@acm.org added the comment: That's unfortunate. But the documented location for customize_compiler is and, AFAIK, had always been in distutils.sysconfig. It was an inadvertent consequence

[issue13994] incomplete revert in 2.7 Distutils left two copies of customize_compiler

2012-04-20 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: ... think it is not unlikely that you *are* the only ones affected by it. With "in the wild" I'm referring to the function being released in the ccompiler not only in alpha releases but also in the beta releases, the 2.7, 2.7.1 and 2.7.2 release

[issue14619] Enhanced variable substitution for databases

2012-04-19 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Raymond, the variable substitution is normally done by the database and not the Python database modules, so you'd have to ask the database maintainers for assistance. The qmark ('?') parameter style is part of the ODBC standard, so it's
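As an illustration of the qmark parameter style mentioned here, the following sketch uses the standard library's sqlite3 module, which is one DB-API module that uses qmark. The table and values are made up for the example.

```python
import sqlite3

# sqlite3 declares its DB-API parameter style as "qmark".
print(sqlite3.paramstyle)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")

# Values are passed separately from the SQL text; the database binds them
# to the '?' placeholders instead of Python formatting them into the string.
conn.execute("INSERT INTO users VALUES (?, ?)", (1, "alice"))
row = conn.execute("SELECT name FROM users WHERE id = ?", (1,)).fetchone()
print(row)
conn.close()
```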

[issue14428] Implementation of the PEP 418

2012-04-19 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: STINNER Victor victor.stin...@gmail.com added the comment: Please leave the pybench default timers unchanged in case the new APIs are not available. Ok, done in the new patch: perf_counter_process_time-2.patch

[issue13994] incomplete revert in 2.7 Distutils left two copies of customize_compiler

2012-04-19 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: The patch broke egenix-mx-base, since it relies on the customize_compiler() being available in distutils.ccompiler: https://www.egenix.com/mailman-archives/egenix-users/2012-April/114838.html If you make such changes to dot releases

[issue13994] incomplete revert in 2.7 Distutils left two copies of customize_compiler

2012-04-19 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Here's the quote from mxSetup.py: # distutils changed a lot in Python 2.7 due to many # distutils.sysconfig APIs having been moved to the new # (top-level) sysconfig module. from sysconfig import

[issue13994] incomplete revert in 2.7 Distutils left two copies of customize_compiler

2012-04-19 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Éric Araujo wrote: Sorry for not thinking about this. I’ll be more careful. No need to be sorry; these things can happen. What I don't understand is this line in the news section: Complete the revert back to only having one

[issue14428] Implementation of the PEP 418

2012-04-18 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Please leave the pybench default timers unchanged in case the new APIs are not available. The perf_counter_process_time.patch currently changes them, even though the new APIs are not available on older Python releases, thus breaking pybench
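One way to honour this request is to pick the new clocks only when they exist. This is a minimal sketch, not pybench's actual code; the names `wall_timer` and `cpu_timer` are hypothetical, and falling back to `time.time` is an assumption made for the example (pybench's own defaults are platform dependent).

```python
import time

# Prefer the PEP 418 clocks when the running Python provides them,
# otherwise fall back to a timer that older releases already have.
wall_timer = getattr(time, "perf_counter", time.time)
cpu_timer = getattr(time, "process_time", None) or getattr(time, "clock", time.time)

start = wall_timer()
total = sum(range(100000))  # some work to measure
elapsed = wall_timer() - start
print(elapsed)
```

Because the lookup happens through `getattr`, the same file keeps running on releases that predate `time.perf_counter()` and `time.process_time()`.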

[issue14428] Implementation of the PEP 418

2012-04-13 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: STINNER Victor victor.stin...@gmail.com added the comment: perf_counter_process_time.patch: replace time.clock if windows else time.time with time.perf_counter, and getrusage/clock with time.process_time

[issue14423] Getting the starting date of iso week from a week number and a year.

2012-04-09 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky alexander.belopol...@gmail.com added the comment: Before you invest in a C version, let's discuss whether this feature is desirable. The proposed function implements a very simple

[issue14423] Getting the starting date of iso week from a week number and a year.

2012-04-09 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: Alexander Belopolsky alexander.belopol...@gmail.com added the comment: On Mon, Apr 9, 2012 at 6:20 PM, Marc-Andre Lemburg rep...@bugs.python.org wrote: Which is wrong, since the start of the first ISO week

[issue14428] Implementation of the PEP 418

2012-04-03 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Hi Victor, I think you need to reconsider the time.steady() name you're using in the PEP. For practical purposes, it's better to call it time.monotonic() and only make the function available if the OS provides a monotonic clock

[issue14428] Implementation of the PEP 418

2012-04-03 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: STINNER Victor victor.stin...@gmail.com added the comment: I think you need to reconsider the time.steady() name you're using in the PEP. For practical purposes, it's better to call it time.monotonic() I

[issue13608] remove born-deprecated PyUnicode_AsUnicodeAndSize

2012-03-27 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: STINNER Victor victor.stin...@gmail.com added the comment: The Py_UNICODE* type is deprecated but since Python 3.3, Py_UNICODE=wchar_t and wchar_t* is a common type on Windows. PyUnicode_AsUnicodeAndSize

[issue14397] Use GetTickCount/GetTickCount64 instead of QueryPerformanceCounter for monotonic clock

2012-03-25 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Yury Selivanov wrote: Yury Selivanov yseliva...@gmail.com added the comment: A monotonic clock is not suitable for measuring durations, as it may still jump forward. A steady clock will not. Well, Victor's implementation of 'steady

[issue14309] Deprecate time.clock()

2012-03-19 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: STINNER Victor victor.stin...@gmail.com added the comment: There's no other single function providing the same functionality time.clock() is not portable: it is a different clock depending on the OS. To write

[issue14309] Deprecate time.clock()

2012-03-16 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: STINNER Victor victor.stin...@gmail.com added the comment: time.clock() has been in use for ages in many many scripts. We don't want to carelessly break all those. I don't want to remove the function, just mark

[issue14309] Deprecate time.clock()

2012-03-15 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: New submission from STINNER Victor victor.stin...@gmail.com: Python 3.3 has 3 functions to get time: - time.clock() - time.steady() - time.time() Antoine Pitrou suggested deprecating time.clock

[issue7652] Merge C version of decimal into py3k.

2012-03-07 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Does the C version have a C API importable as capsule ? If not, could you add one and a decimal.h to go with it ? This makes integration in 3rd party modules a lot easier. Thanks, -- Marc-Andre Lemburg eGenix.com

[issue13703] Hash collision security issue

2012-02-21 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Gregory P. Smith wrote: Gregory P. Smith g...@krypto.org added the comment: Question: Should sys.flags.hash_randomization be True (1) when PYTHONHASHSEED=0? It is now. The flag should probably be removed - simply because the env var

[issue13703] Hash collision security issue

2012-02-21 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: STINNER Victor victor.stin...@gmail.com added the comment: Question: Should sys.flags.hash_randomization be True (1) when PYTHONHASHSEED=0? It is now. Saying yes working as intended is fine by me

[issue13703] Hash collision security issue

2012-02-13 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Dave Malcolm wrote: [new patch] Please change how the env vars work as discussed earlier on this ticket. Quick summary: We only need one env var for the randomization logic: PYTHONHASHSEED. If not set, 0 is used as seed. If set
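The effect of a fixed PYTHONHASHSEED can be observed from the outside. The sketch below reflects the behaviour of released CPython 3.3 and later, which may differ in detail from the proposal quoted above; the helper name `str_hash_with_seed` is hypothetical.

```python
import os
import subprocess
import sys

def str_hash_with_seed(seed):
    """Start a fresh interpreter with PYTHONHASHSEED set and return hash('abc')."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "print(hash('abc'))"], env=env
    )
    return int(out.decode().strip())

# The same seed gives the same string hash in separate processes.
first = str_hash_with_seed(42)
second = str_hash_with_seed(42)
print(first == second)
```

A fixed seed is what makes test runs that depend on hash order reproducible.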

[issue13703] Hash collision security issue

2012-02-08 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Dave Malcolm wrote: If anyone is aware of an attack via numeric hashing that's actually possible, please let me know (privately). I believe only specific apps could be affected, and I'm not aware of any such specific apps. I'm not sure

[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: STINNER Victor victor.stin...@haypocalc.com added the comment: In a security fix release, we shouldn't change the linkage procedures, so I recommend that the LoadLibrary dance remains. So the overhead

[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Antoine Pitrou wrote: The simple collision counting approach leaves a gaping hole open, as demonstrated by Frank. Could you elaborate on this ? Note that I've updated the collision counting patch to cover both possible attack cases I

[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Jim Jewett wrote: Jim Jewett jimjjew...@gmail.com added the comment: On Mon, Feb 6, 2012 at 8:12 AM, Marc-Andre Lemburg rep...@bugs.python.org wrote: Marc-Andre Lemburg m...@egenix.com added the comment: Antoine Pitrou wrote

[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Jim Jewett wrote: BTW: If you set the limit N to e.g. 100 (which is reasonable given Victor's and my tests), Agreed. Frankly, I think 5 would be more than reasonable so long as there is a fallback. the time it takes to process one

[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Dave Malcolm wrote: So the overhead in startup time is not an issue? It is an issue. Not only in terms of startup time, but also ... because randomization by default makes Python behave in non-deterministic ways - which is not what

[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Marc-Andre Lemburg wrote: Dave Malcolm wrote: The release managers have pronounced: http://mail.python.org/pipermail/python-dev/2012-January/115892.html Quoting that email: 1. Simple hash randomization is the way to go. We think this has

[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: Right, but that doesn't contradict what I wrote about adding env vars to fix a seed and optionally enable using a random seed, or adding collision counting

[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Gregory P. Smith wrote: Gregory P. Smith g...@krypto.org added the comment: The release managers have pronounced: http://mail.python.org/pipermail/python-dev/2012-January/115892.html Quoting that email: 1. Simple hash randomization

[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alex Gaynor wrote: Can't randomization just be applied to integers as well? A simple seed xor'ed with the hash won't work, since the attacks I posted will continue to work (just colliding on a different hash value). Using a more elaborate

[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alex Gaynor wrote: There's no need to cover any container types, because if their constituent types are securely hashable then they will be as well. And of course if the constituent types are unsecure then they're directly vulnerable. I

[issue13703] Hash collision security issue

2012-01-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Dave Malcolm wrote: Dave Malcolm dmalc...@redhat.com added the comment: On Fri, 2012-01-06 at 12:52 +, Marc-Andre Lemburg wrote: Marc-Andre Lemburg m...@egenix.com added the comment: Demo patch implementing the collision limit

[issue13703] Hash collision security issue

2012-01-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alex Gaynor wrote: I'm able to put N pieces of data into the database on successive requests, but then *rendering* that data puts it in a dictionary, which renders that page unviewable by anyone. I think you're asking a bit much here

[issue13703] Hash collision security issue

2012-01-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Here's a version of the collision counting patch that takes both hash and slot collisions into account. I've also added a test script which demonstrates both types of collisions using integer objects (since it's trivial to calculate

[issue13703] Hash collision security issue

2012-01-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: I've also added a test script which demonstrates both types of collisions using integer objects (since it's trivial to calculate their hashes). I forgot to mention: the test script is for 64-bit platforms. It's easy to adapt it to 32-bit

[issue13703] Hash collision security issue

2012-01-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: To see the collision counting, enable the DEBUG_DICT_COLLISIONS macro variable. Running (part of (*)) the test suite with debugging enabled on a 64-bit machine shows that slot collisions are much more frequent than hash collisions, which

[issue13703] Hash collision security issue

2012-01-20 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Charles-François Natali wrote: Anyway, I still think that the hash randomization is the right way to go, simply because it does solve the problem, whereas the collision counting doesn't: Martin made a very good point on python-dev

[issue13703] Hash collision security issue

2012-01-19 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: ... So I expect something similar in applications: no change in the applications, but a lot of hacks/tricks in tests. Tests usually check output of an application given a certain input. If those fail

[issue13703] Hash collision security issue

2012-01-19 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: Please note, that you'd have to extend the randomization to all other Python data types as well in order to reach the same level of security as the collision

[issue13703] Hash collision security issue

2012-01-19 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: I tried the collision counting with a low number of collisions: ... no false positives with a limit of 50 collisions ... Thanks for running those tests. Looks like a limit lower than 1000 would already do just fine

[issue13703] Hash collision security issue

2012-01-19 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: [Reposting, since roundup removed part of the Python output] M.-A. Lemburg wrote: Note that the integer attack also applies to other number types in Python: (hash(3), hash(3.0), hash(3+0j)) evaluates to (3, 3, 3). See Tim's post I referenced
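The observation about numeric types is easy to reproduce. The following sketch only restates it in runnable form: numbers that compare equal must hash equally, so an int, a float and a complex with the same value land on the same dictionary key.

```python
values = (3, 3.0, 3 + 0j)
hashes = tuple(hash(v) for v in values)
print(hashes)

# Equal numbers must hash equally, so all three are one dictionary key.
d = {3: "int"}
d[3.0] = "float"
d[3 + 0j] = "complex"
print(len(d), d[3])
```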

[issue13703] Hash collision security issue

2012-01-19 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Frank Sievertsen wrote: Frank Sievertsen pyt...@sievertsen.de added the comment: The suffix only introduces a constant change in all hash values output, so even if you don't know the suffix, you can still generate data sets

[issue13703] Hash collision security issue

2012-01-18 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: Patch version 7: - Make PyOS_URandom() private (renamed to _PyOS_URandom) - os.urandom() releases the GIL for I/O operation for its implementation reading /dev/urandom - move _Py_unicode_hash_secret_t

[issue13703] Hash collision security issue

2012-01-16 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Eric Snow wrote: Eric Snow ericsnowcurren...@gmail.com added the comment: The vulnerability is known since 2003 (Usenix 2003): read "Denial of Service via Algorithmic Complexity Attacks" by Scott A. Crosby and Dan S. Wallach. Crosby

[issue13703] Hash collision security issue

2012-01-12 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Frank Sievertsen wrote: I don't want my software to stop working because someone managed to enter 1000 bad strings into it. Think of a software that handles names of customers or filenames. We don't want it to break completely just

[issue13703] Hash collision security issue

2012-01-11 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: Patch version 5 fixes test_unicode for 64-bit system. Victor, I don't think the randomization idea is going anywhere. The code has many issues: * it is exceedingly complex * the method would need to be implemented

[issue13703] Hash collision security issue

2012-01-11 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: STINNER Victor victor.stin...@haypocalc.com added the comment: * it is exceedingly complex Which part exactly? For hash(str), it just add two extra XOR. I'm not talking specifically about your patch

[issue13703] Hash collision security issue

2012-01-11 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Mark Shannon wrote: Mark Shannon m...@hotpy.org added the comment: * the method would need to be implemented for all hashable Python types It was already discussed, and it was said that only hash(str) needs to be modified. Really

[issue13703] Hash collision security issue

2012-01-11 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: OTOH, the collision counting patch is very simple, doesn't have the performance issues and provides real protection against the attack. I don't know about real

[issue13703] Hash collision security issue

2012-01-11 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Mark Dickinson wrote: Mark Dickinson dicki...@gmail.com added the comment: [Antoine] Also, how about false positives? Having legitimate programs break because of legitimate data would be a disaster. This worries me, too. [MAL

[issue13703] Hash collision security issue

2012-01-11 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Antoine Pitrou wrote: Antoine Pitrou pit...@free.fr added the comment: On my slow dev machine 1000 collisions run in around 22ms: python2.7 -m timeit -n 100 "dict((x*(2**64 - 1), 1) for x in xrange(1, 1000))" 100 loops, best of 3
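The timing above targets Python 2.7 on a 64-bit build. A rough Python 3 analogue can be sketched as follows, assuming CPython's integer hashing, where an int hashes to its value reduced modulo `sys.hash_info.modulus`; the key construction differs from the 2.7 command for that reason.

```python
import sys

# CPython 3 reduces int hashes modulo this prime (2**61 - 1 on typical
# 64-bit builds), so every multiple of it hashes to 0.
modulus = sys.hash_info.modulus

keys = [x * modulus for x in range(1, 1000)]
distinct_hashes = {hash(k) for k in keys}
print(len(keys), distinct_hashes)

# All 999 distinct keys share one hash value, so every insert has to probe
# past the colliding keys already stored in the dictionary.
d = dict((k, 1) for k in keys)
print(len(d))
```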

[issue13703] Hash collision security issue

2012-01-09 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Marc-Andre Lemburg wrote: Marc-Andre Lemburg m...@egenix.com added the comment: Christian Heimes wrote: Marc-Andre: Have you profiled your suggestion? I'm interested in the speed implications. My gut feeling is that your idea could

[issue13703] Hash collision security issue

2012-01-08 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Tim Peters wrote: Tim Peters tim.pet...@gmail.com added the comment: [Marc-Andre] BTW: I wonder how long it's going to take before someone figures out that our merge sort based list.sort() is vulnerable as well... its worst-case

[issue13703] Hash collision security issue

2012-01-08 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Christian Heimes wrote: Marc-Andre: Have you profiled your suggestion? I'm interested in the speed implications. My gut feeling is that your idea could be slower, since you have added more instructions to a tight loop, that is execute

[issue13703] Hash collision security issue

2012-01-07 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Paul McMillan wrote: I'll upload a patch that demonstrates the collisions counting strategy to show that detecting the problem is easy. Whether just raising an exception is a good idea, is another issue. I'm in cautious agreement

[issue13703] Hash collision security issue

2012-01-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Before continuing down the road of adding randomness to hash functions, please have a good read of the existing dictionary implementation: Major subtleties ahead: Most hash schemes depend on having a good hash function, in the sense

[issue13703] Hash collision security issue

2012-01-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Demo patch implementing the collision limit idea for Python 2.7. -- Added file: http://bugs.python.org/file24151/hash-attack.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org
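The collision limit idea can be illustrated with a toy table. This is only a sketch under stated assumptions, not the contents of hash-attack.patch: the class name LimitedProbeTable, the linear probing scheme, and the size and max_collisions parameters are all hypothetical, and CPython's real dict uses a different probing sequence. The error message is borrowed from the traceback quoted elsewhere in this thread.

```python
import sys


class LimitedProbeTable:
    """Toy open-addressing table (hypothetical, not CPython's dict).

    An insert gives up with KeyError once a single probe sequence has
    hit more than max_collisions occupied slots.
    """

    def __init__(self, size=2048, max_collisions=1000):
        self.slots = [None] * size
        self.max_collisions = max_collisions

    def insert(self, key, value):
        size = len(self.slots)
        index = hash(key) % size
        collisions = 0  # local to this call, never carried between calls
        while True:
            entry = self.slots[index]
            if entry is None or entry[0] == key:
                self.slots[index] = (key, value)
                return
            collisions += 1
            if collisions > self.max_collisions:
                raise KeyError("too many hash collisions")
            index = (index + 1) % size  # linear probing, for simplicity


# Assumption: CPython 3.2+, where multiples of sys.hash_info.modulus
# all hash to the same value, so every key below lands in one chain.
table = LimitedProbeTable(size=64, max_collisions=10)
try:
    for x in range(1, 64):
        table.insert(x * sys.hash_info.modulus, x)
except KeyError as exc:
    print("insert aborted:", exc)
```

The counter is a plain local variable, so it resets on every insert; well-distributed keys never come near the limit, while a crafted chain of colliding keys trips it quickly.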

[issue13703] Hash collision security issue

2012-01-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: The hash-attack.patch solves the problem for the integer case I posted earlier on and doesn't cause any problems with the test suite. Traceback (most recent call last): File "<stdin>", line 1, in <module> KeyError: 'too many hash collisions

[issue13703] Hash collision security issue

2012-01-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Stupid email interface again... here's the full text: The hash-attack.patch solves the problem for the integer case I posted earlier on and doesn't cause any problems with the test suite. d = dict((x*(2**64 - 1), hash(x*(2**64 - 1))) for x

[issue13703] Hash collision security issue

2012-01-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: STINNER Victor wrote: STINNER Victor victor.stin...@haypocalc.com added the comment: hash-attack.patch does never decrement the collision counter. Why should it ? It's only used as local variable in the lookup function. Note

[issue13703] Hash collision security issue

2012-01-06 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Here's an example of hash-attack.patch finding an on-purpose programming error (hashing all objects to the same value): http://stackoverflow.com/questions/4865325/counting-collisions-in-a-python-dictionary (see the second example on the page

[issue13703] Hash collision security issue

2012-01-05 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Paul McMillan wrote: This is not something that can be fixed by limiting the size of POST/GET. Parsing documents (even offline) can generate these problems. I can create books that calibre (a Python-based ebook format shifting tool

[issue13707] Clarify hash() constancy period

2012-01-04 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Terry J. Reedy wrote: Terry J. Reedy tjre...@udel.edu added the comment: Martin, I do not understand. The default hash is based on id (as is default equality comparison), not value. Are you OK with hash values changing if the 'value

[issue13703] Hash collision security issue

2012-01-04 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Some comments: 1. The security implications in all this are being somewhat overemphasized. There are many ways you can do a DoS attack on web servers. It's the responsibility of the used web frameworks and servers to deal with the possible

[issue13703] Hash collision security issue

2012-01-04 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Marc-Andre Lemburg wrote: 3. Changing the way strings are hashed doesn't solve the problem. Hash values of other types can easily be guessed as well, e.g. take integers which use a trivial hash function. Here's an example for integers
