On 22.12.20 16:34, PIERRE AUGIER wrote:
> Here, it is really about what can be done with PyPy, nowadays and in future.

Hi Pierre,

A few somewhat random comments from me.

First, you shouldn't run the two implementations that you are comparing (Point3D and Point4D in this case) within the same process, since they can influence each other. If I run them in the same process I get this:

    Point3D: 11.426 ms
    Point4D: 21.572 ms

In separate processes the latter speeds up:

    Point4D: 13.136 ms

(but it doesn't become faster than Point3D, because we don't have any real SIMD support in the JIT.)

Next, some information about how to look at the generated code with PyPy. What I do is look at the JIT IR (which is very close to machine code, but one abstraction level above it). You get it like this:

    PYPYLOG=jit-log-opt,jit-summary,jit-backend-counts:out pypy3 microbench_pypy4.py

This produces a file called 'out' with different sections. I usually start by looking at the bottom, which shows how often each trace is entered. This way, you can find the hottest trace:

    [26f0c8566379] {jit-backend-counts
    ...
    TargetToken(140179837690368):43692970
    TargetToken(140179837690448):74923530
    ...
    [26f0c8567905] jit-backend-counts}

Now I search for the address of the hottest trace to find its IR. The IR shows the traced Python bytecodes interspersed with IR instructions (it takes a bit of time to get used to reading it, but it's not super hard).

Looking through that, my opinion is that the trace looks quite good. There are many small inefficiencies (a bit too much pointer chasing, a bit too much type checking everywhere, a few allocations that aren't necessary), but no single missed optimization that could immediately give a 5x speedup.

That also matches how I would expect a shootout between Julia and PyPy to end up: PyPy is much faster than CPython for algorithmic pure Python code (~150x on my laptop! that's really good :-)). But it can't really beat a "serious" ahead-of-time compiler for a statically typed language that specifically targets numerical code.
That is for several reasons, the most important ones being that

1) PyPy has a lot less time to produce code, given that it does so at runtime
2) PyPy has to support the full dynamically typed language Python, where really random things can be done at runtime, and PyPy must still always observe the Python semantics.

That said, I can understand that 5x slower is still a somewhat disappointing result, and I suspect that given enough effort we could maybe get it down to around 3x slower.

Cheers,

Carl Friedrich

_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev
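
The step of finding the hottest trace in the jit-backend-counts section of the log could be scripted roughly as follows. This is only a sketch and not part of PyPy's tooling: the function name hottest_traces and the SAMPLE string are made up for illustration, and the parser assumes the section contains lines shaped exactly like the "TargetToken(...):count" lines quoted in the mail. Any other line shapes in that section are simply skipped.

```python
import re

# Matches lines such as "TargetToken(140179837690368):43692970".
# Assumption: only this line shape (as quoted in the mail) is of interest.
_COUNT_LINE = re.compile(r"^(TargetToken\(\d+\)):(\d+)$")

# Sample taken from the log excerpt in the mail (the "..." lines are elided there too).
SAMPLE = """\
[26f0c8566379] {jit-backend-counts
...
TargetToken(140179837690368):43692970
TargetToken(140179837690448):74923530
...
[26f0c8567905] jit-backend-counts}
"""


def hottest_traces(log_text, top=3):
    """Return up to `top` (token, count) pairs, highest count first,
    read from the jit-backend-counts section of a PYPYLOG output."""
    counts = {}
    in_section = False
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        if line.endswith("{jit-backend-counts"):
            in_section = True            # section opens
        elif line.endswith("jit-backend-counts}"):
            in_section = False           # section closes
        elif in_section:
            match = _COUNT_LINE.match(line)
            if match:                    # skip "..." and unknown line shapes
                counts[match.group(1)] = int(match.group(2))
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


if __name__ == "__main__":
    for token, count in hottest_traces(SAMPLE):
        print(f"{token}: {count}")
```

For a real run, one would read the 'out' file into a string and pass it in place of SAMPLE. The token that comes out on top is the string to search for in the jit-log-opt section of the same file.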