On Mon, 4 Jul 2016 22:51:11 +0200
Victor Stinner <[email protected]> wrote:
> 2016-07-04 19:49 GMT+02:00 Antoine Pitrou <[email protected]>:
> >> Median +- Std dev: 256 ms +- 3 ms -> 262 ms +- 4 ms: 1.03x slower
> >
> > That doesn't sound like a terrific idea. Why do you think the median
> > gives a more interesting figure here?
>
> When the distribution is uniform, mean and median are the same. In my
> experience with Python benchmarks, the curve is usually skewed: the
> right tail is much longer.
>
> When the system noise is high, the skewness is much larger. In this
> case, the median looks "more correct".
It "looks" more correct? Let's say your Python implementation has a
flaw: it is almost always fast, but every 10 runs it becomes 3x slower.
Taking the mean will reflect the occasional slowness; taking the median
will completely hide it.

Then of course, since you have several processes and several runs per
process, you could try something more convoluted, such as
mean-of-medians or mean-of-mins or...

However, if you're concerned about system noise, there may be other
ways to avoid it. For example, measure both CPU time and wall time,
and if CPU time < 0.9 * wall time (for example), discard the number
and take another measurement.

(this assumes all benchmarks are CPU-bound - which they should be here
- and single-threaded - which they *probably* are, except in a
hypothetical parallelizing Python implementation ;-))

Regards

Antoine.

_______________________________________________
Speed mailing list
[email protected]
https://mail.python.org/mailman/listinfo/speed
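To illustrate the mean-vs-median point above with concrete numbers (the
timings here are made up for the sake of the example):

```python
import statistics

# Hypothetical timings (seconds): a benchmark that is usually fast,
# but every 10th run is 3x slower (say, a periodic GC pause).
timings = [0.25] * 9 + [0.75]

mean = statistics.mean(timings)      # reflects the occasional slowness
median = statistics.median(timings)  # hides it completely

print(f"mean:   {mean:.3f} s")    # -> mean:   0.300 s
print(f"median: {median:.3f} s")  # -> median: 0.250 s
```

The mean is 20% higher than the typical run, flagging the problem; the
median is indistinguishable from a flawless implementation.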
