Hi Paolo,

Paolo Giarrusso wrote:
> after the completion of our student project, I have enough experience
> to say something more.
> We wrote in C an interpreter for a Python subset and we could make it
> much faster than the Python interpreter (50%-60% faster). That was due
> to the usage of indirect threading, tagged pointers and unboxed
> integers, and a real GC (a copying one, which is enough for the
> current benchmarks we have - a generational GC would be more realistic
> but we didn't manage to do it).
Interesting, but it sounds like you are comparing apples to oranges. What
sort of subset of Python are you implementing, i.e. which things don't
work? It has been shown time and time again that implementing only a
subset of Python makes it possible to get interesting speedups compared
to CPython. Then, as more and more features are implemented, the
difference gets smaller and smaller. This was true for a number of Python
implementations (e.g. IronPython). I think that to get really meaningful
comparisons it would be good to modify an existing Python implementation
and compare that. Yes, I know this can be a lot of work.

On your actual techniques I don't have a strong opinion. I am rather sure
that the copying GC helped performance; it definitely did for PyPy.
Tagged pointers make PyPy slower, but then, we tag integers with 1, not
with 0. This could be changed and wouldn't even be too much work.

About better implementations of the bytecode dispatch I am unsure. Note,
however, that a while ago we did measurements to see how large the
bytecode dispatch overhead is. I don't recall the exact number, but I
think it was below 10%. That means that even if you somehow managed to
reduce that overhead to nothing, you would still only get a 10%
performance win.

Cheers,

Carl Friedrich

_______________________________________________
[email protected]
http://codespeak.net/mailman/listinfo/pypy-dev
