Steven D'Aprano added the comment:
> * Display the average, rather than the minimum, of the timings *and*
> display the standard deviation. It should help a little bit to get
> more reproducible results.
I'm still not convinced that the average is the right statistic to use
here. I cannot comment about Victor's perf project, but for timeit, it
seems to me that Tim's original warning that the mean is not useful is
correct.
Fundamentally, the problem with taking an average is that the timing
errors are all one-sided. If the unknown "true" or "real" time taken by
a piece of code is T, then the random error ε is always positive:
we're measuring T + ε, not T ± ε.
If the errors are evenly divided into positive and negative, then on
average the mean() or median() of the measurements will tend to cancel
the errors, and you get a good estimate of T. But if the errors are all
one-sided, then they don't cancel and you are actually estimating T plus
some unknown, average error. In that case, min() is the estimate which
is closest to T.
Unless you know that average error is tiny compared to T, I don't think
the average is very useful. Since these are typically micro-benchmarks,
the error is often quite large relative to the unknown T.
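To make the one-sided argument concrete, here is a rough simulation. It is
only a sketch: the value of T, the noise scale, and the choice of an
exponential distribution for the one-sided error are my own assumptions for
illustration, not measurements of real timing noise.

```python
import random
import statistics

random.seed(12345)  # fixed seed so the sketch is repeatable

T = 1.0        # hypothetical "true" time of the snippet (arbitrary units)
SCALE = 0.2    # hypothetical typical size of the timing error

# One-sided noise: every measurement is T plus a non-negative error.
one_sided = [T + random.expovariate(1 / SCALE) for _ in range(1000)]

# Two-sided noise, for comparison: errors fall on both sides of T.
two_sided = [T + random.gauss(0, SCALE) for _ in range(1000)]

for name, data in (("one-sided", one_sided), ("two-sided", two_sided)):
    print("%s: min=%.3f mean=%.3f median=%.3f" % (
        name, min(data), statistics.mean(data), statistics.median(data)))
```

With one-sided noise the mean and median both sit above T by roughly the
average error, while min() lands very close to T. With two-sided noise the
mean is a good estimate and min() is not, which is exactly why the shape of
the error matters when picking the statistic.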
> * Change the default repeat from 3 to 5 to have a better distribution
> of timings. It makes the timeit CLI 66% slower (ex: 1 second instead
> of 600 ms). That's the price of stable benchmarks :-)
I nearly always run with repeat=5, so I agree with this.
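For reference, this is roughly what I mean by running with repeat=5. The
statement being timed and the loop count are arbitrary placeholders.

```python
import timeit

NUMBER = 10000  # loops per timing (arbitrary choice for this sketch)

# repeat=5 returns a list of five total times, one per run of NUMBER loops.
timings = timeit.repeat("sum(range(100))", repeat=5, number=NUMBER)

best = min(timings)
print("best of %d: %.3g usec per loop" % (len(timings), best / NUMBER * 1e6))
```

The command line equivalent is the existing -r option, for example
python -m timeit -r 5 "sum(range(100))".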
> * Don't disable the garbage collector anymore! Disabling the GC is not
> fair: real applications use it.
But that's just adding noise: you're not timing the code snippet, you're
timing the code snippet plus the garbage collector.
I disagree with this change, although I would accept it if there were an
optional flag to control the gc.
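Note that anyone who wants the collector included can already opt in through
the setup statement, which is the approach the timeit documentation
describes. A small sketch, with a placeholder statement that allocates a lot
of container objects:

```python
import timeit

stmt = "[[] for _ in range(1000)]"  # placeholder: allocates many lists

# Default behaviour: timeit disables the garbage collector while timing.
gc_off = min(timeit.repeat(stmt, repeat=5, number=1000))

# Re-enabling the collector in setup includes its work in the measurement.
gc_on = min(timeit.repeat(stmt, setup="import gc; gc.enable()",
                          repeat=5, number=1000))

print("gc disabled: %.3g s   gc enabled: %.3g s" % (gc_off, gc_on))
```

How much the two numbers differ depends on the snippet and the machine, so I
make no claim about the size of the gap here.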
> * autorange: start with 1 loop instead of 10 for slow benchmarks like
That seems reasonable.
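A sketch of why starting at 1 helps for slow statements. The helper below is
hypothetical (it is not the timeit implementation), and both the power-of-10
growth and the 0.2 second threshold are assumptions made for illustration.

```python
import timeit


def autorange_from_one(timer, min_time=0.2):
    """Hypothetical helper: try 1, 10, 100, ... loops until one run
    takes at least min_time seconds; return (loops, elapsed)."""
    number = 1
    while True:
        elapsed = timer.timeit(number)
        if elapsed >= min_time:
            return number, elapsed
        number *= 10


# A deliberately slow statement, chosen only for this example.
slow = timeit.Timer("time.sleep(0.25)", setup="import time")
loops, elapsed = autorange_from_one(slow)
print("%d loop(s), %.2f s" % (loops, elapsed))
```

Here a single loop already exceeds the threshold, so the search stops after
about a quarter of a second. Starting at 10 loops would have cost about 2.5
seconds before any result was available.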
> * Display large number of loops as power of 10 for readability, ex:
> "10^6" instead of "1000000". Also accept "10^6" syntax for the --num
Shouldn't we use 10**6 or 1e6 rather than bitwise XOR? :-)
This is aimed at Python programmers. We expect ** to mean
exponentiation, not ^.
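To spell out the difference for anyone reading along:

```python
xor_result = 10 ^ 6      # bitwise XOR: 0b1010 ^ 0b0110 == 0b1100
power_result = 10 ** 6   # integer exponentiation
float_result = 1e6       # float literal

print(xor_result)    # 12
print(power_result)  # 1000000
print(float_result)  # 1000000.0
```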
> * Add support for "ns" unit: nanoseconds (10^-9 second)
title: Enhance the timeit module: display average +- std dev instead of minimum -> Enhance the timeit module
Python tracker <rep...@bugs.python.org>
Python-bugs-list mailing list