Hi all,

Thanks to the tips, I measured on Mac OS X a 17% slowdown versus Python 2.5 (32-bit), after manually taking the best times. Measuring on the command line would instead give a 57% slowdown, because of the lack of warmup. Note, however, that pyexpat is not actually involved here for PyPy: in v1.4 it is still implemented through ctypes (in lib_pypy/pyexpat.py), not in RPython under pypy/rlib/.
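For reference, this is roughly what I mean by "manually taking the best times" - a minimal sketch, where parse_once is a placeholder for the actual XML-parsing benchmark body, not code from the benchmark itself:

    import time

    def best_per_iteration(parse_once, iterations=1000, repeats=5):
        # Repeat the timed loop several times and keep the best total:
        # the early repeats act as JIT warmup, and taking the minimum
        # discards tracing/compilation overhead and background load.
        best = None
        for _ in xrange(repeats):
            start = time.time()
            for _ in xrange(iterations):
                parse_once()
            elapsed = time.time() - start
            if best is None or elapsed < best:
                best = elapsed
        return best / iterations  # seconds per iteration

Keeping only the best total is what removes the warmup cost that inflates the command-line numbers.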
Python 2.7 may well be faster, which might explain some of the extra difference with Stefan's results.

It looks like the two bugs should be easy to fix:
- a file leak in the tested XML module, indeed;
- an IOException on module opening being converted to "file not found" - at least in Java, file-not-found is a specific exception which can be distinguished from generic I/O errors.

On Mon, Nov 29, 2010 at 22:29, Piotr Skamruk <piotr.skam...@gmail.com> wrote:
> simpler would be to set ulimit -n to 65536 (probably in
> /etc/security/limits.conf)

Thanks, I needed both this and the GC tips, since during a test run of 10^4 iterations I can't call the GC and still get meaningful results. [I'm on Mac OS X though, so ulimit -S -n 10240 is the best one can do; otherwise "Invalid argument", i.e. EINVAL, results.]

Additionally, I just discovered that the ImportError on "import linecache" looks file-handle-related as well, because changing the ulimit changes the iteration count that triggers the error, so it is likely an effect of the same bug. Still, the original error message should be preserved, and this should be easy to fix.

In these conditions, my best results after warming up are:

0.358 ms  PyPy-JIT-32bit (see below for JIT logs)
0.305 ms  CPython-2.5-32bit
0.269 ms  CPython-2.6-64bit
0.553 ms  PyPy-64bit-noJIT, rev 79307, 21 Nov 2010

which means a 17% slowdown on comparable setups (PyPy-JIT-32bit vs CPython-2.5-32bit), rather than a 2x slowdown; measuring with timeit on the command line would instead give a 57% slowdown. All this is on a very small input file, the one I attached before, over a total of 1000 iterations, on a Core 2 Duo 2.6GHz. I don't report the average because:
a) it is difficult to get something significant anyway (I don't want to code confidence intervals, and automated tools wouldn't call the GC appropriately);
b) I expect the deviation to be due more to unrelated load on my laptop (around 12-18% CPU) than to the actual spread of the runtime.

I set PYPYLOG='jit-summary:-' before the PyPy-JIT run and got the output below - I hope somebody can check from it whether the JIT is working successfully.

[f2dd1fbaa1c2] {jit-summary
Tracing:        25      0.163456
Backend:        23      0.017392
Running asm:    191214
Blackhole:      2012
TOTAL:          502.543032
ops:            68338
recorded ops:   32764
calls:          1759
guards:         18005
opt ops:        2757
opt guards:     696
forcings:       111
abort: trace too long:  2
abort: compiling:       0
abort: vable escape:    0
nvirtuals:      6693
nvholes:        1059
nvreused:       3979
Total # of loops:       18
Total # of bridges:     6
Freed # of loops:       0
Freed # of bridges:     0
[f2dd1fc141a8] jit-summary}

Best regards.

> 2010/11/29 Amaury Forgeot d'Arc <amaur...@gmail.com>:
>> 2010/11/29 Paolo Giarrusso <p.giarru...@gmail.com>
>>>
>>> Inspection of the pypy process confirms a leak of file handles to the
>>> XML files. Whether it is GC not being invoked, a missing destructor,
>>> or simply because the code should release file handles, I dunno. Is
>>> there a way to trigger explicit GC to work around such issues?
>>
>> As usual:
>>     import gc
>>     gc.collect()
>> Calling gc.collect() is indeed a good idea if the code does not explicitly
>> close the files.

-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/
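P.S. For anyone reproducing this, here is a minimal sketch of the combined workaround: raising the soft file-descriptor limit from inside the process, and collecting garbage after the timed loop rather than inside it. parse_once is again a placeholder, and the 10240 cap is just the highest value Mac OS X accepted in my tests:

    import gc
    import resource

    # Raise the soft RLIMIT_NOFILE up to 10240; on Mac OS X, higher
    # values failed with EINVAL for me (equivalent to ulimit -S -n 10240).
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (10240, hard))

    def timed_run(parse_once, n=10000):
        # With the file-handle leak, descriptors pile up during the loop;
        # collecting only after the timed run frees them without
        # distorting the measured times.
        for _ in xrange(n):
            parse_once()
        gc.collect()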