On Mon, 4 Jul 2016 22:51:11 +0200
Victor Stinner <[email protected]> wrote:
> 2016-07-04 19:49 GMT+02:00 Antoine Pitrou <[email protected]>:
> >> Median +- Std dev: 256 ms +- 3 ms -> 262 ms +- 4 ms: 1.03x slower
> >
> > That doesn't sound like a terrific idea. Why do you think the median
> > gives a more interesting figure here?
>
> When the distribution is uniform, mean and median are the same. In my
> experience with Python benchmarks, the curve is usually skewed: the
> right tail is much longer.
>
> When the system noise is high, the skewness is much larger. In this
> case, the median looks "more correct".
It "looks" more correct? Let's say your Python implementation has a
flaw: it is almost always fast, but every 10 runs it becomes 3x slower.
Taking the mean will reflect the occasional slowness; taking the median
will completely hide it.

Then of course, since you have several processes and several runs per
process, you could try something more convoluted, such as
mean-of-medians or mean-of-mins or...

However, if you're concerned about system noise, there may be other
ways to avoid it. For example, measure both CPU time and wall time,
and if CPU time < 0.9 * wall time (for example), discard the number
and take another measurement.

(this assumes all benchmarks are CPU-bound - which they should be here
- and single-threaded - which they *probably* are, except in a
hypothetical parallelizing Python implementation ;-))

Regards

Antoine.

_______________________________________________
Speed mailing list
[email protected]
https://mail.python.org/mailman/listinfo/speed
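To illustrate the mean-vs-median point above with concrete numbers (the
timings here are made up for the sake of the example):

```python
import statistics

# Hypothetical timings (seconds): a benchmark that is usually fast,
# but every 10th run is 3x slower (say, a periodic GC pause).
timings = [0.25] * 9 + [0.75]

mean = statistics.mean(timings)      # reflects the occasional slowness
median = statistics.median(timings)  # hides it completely

print(f"mean:   {mean:.3f} s")    # -> mean:   0.300 s
print(f"median: {median:.3f} s")  # -> median: 0.250 s
```

The mean is 20% higher than the typical run, flagging the problem; the
median is indistinguishable from a flawless implementation.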
