On 02/26/2010 01:30 PM, Stefan Behnel wrote:

> Carl Friedrich Bolz, 26.02.2010 11:25:
>> http://buytaert.net/files/oopsla07-georges.pdf
>
> It's sad that the paper doesn't try to understand *why* others use
> different ways to benchmark.
I guess those others should write their own papers (or blog posts or
whatever) :-). If you know any well-written ones, I would be very
interested.

> They even admit at the end that their statistical approach is only
> really interesting when the differences are small enough, not
> mentioning at that point that the system must be complex enough also,
> such as the Sun JVM. However, if the differences are small and the
> benchmarked system is complex, it's best to question the benchmark in
> the first place, rather than the statistics that lead to its results.
> [...]

In my opinion there are probably not many non-complex systems around
nowadays, at least if we are talking about typical "desktop" systems.
There is a lot of noise at the CPU level too, from caches and
out-of-order execution, to say nothing of the OS. And while PyPy is not
quite as complex as a JVM, it is certainly moving in that direction. So
even if your benchmark itself is a simple piece of Python code, the
whole system that you invoke is still complex.

Cheers,

Carl Friedrich
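P.S. To make concrete what the paper's statistical approach looks like
in practice, here is a minimal sketch (the bench, confidence_interval
and workload names are made up for illustration): run the benchmark
many times after a warm-up phase and report a confidence interval for
the mean, rather than a single number.

    import math
    import time

    def bench(f, warmup=5, runs=30):
        # Discard warm-up iterations (JIT compilation, cache warming),
        # then collect steady-state timings.
        for _ in range(warmup):
            f()
        times = []
        for _ in range(runs):
            t0 = time.time()
            f()
            times.append(time.time() - t0)
        return times

    def confidence_interval(times, z=1.96):
        # Approximate 95% confidence interval for the mean runtime,
        # assuming enough runs for the normal approximation to hold.
        n = len(times)
        mean = sum(times) / n
        var = sum((t - mean) ** 2 for t in times) / (n - 1)
        err = z * math.sqrt(var / n)
        return mean - err, mean + err

    def workload():  # stand-in for a real benchmark
        return sum(i * i for i in range(100000))

    lo, hi = confidence_interval(bench(workload))
    print("mean runtime in [%.6f, %.6f] seconds" % (lo, hi))

If you benchmark two interpreters this way and their intervals overlap,
the honest conclusion is "no measurable difference" -- which is exactly
the situation where questioning the benchmark itself becomes more
productive than refining the statistics.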
