2017-03-14 8:14 GMT+01:00 Serhiy Storchaka <storch...@gmail.com>:
> Std dev is well understood for the distribution close to normal. But when
> the distribution is too skewed or multimodal (as in your quick example)
> common assumptions (that 2/3 of samples are in the range of the std dev, 95%
> of samples are in the range of two std devs, 99% of samples are in the range
> of three std devs) are no longer valid.

The Python timeit module only displays the minimum. In perf, I chose
to also display the standard deviation to give an idea of the
stability of the benchmark.

For example, "10 ms +- 1 ms" is quite stable, whereas "10 ms +- 15 ms"
does not seem reliable at all.

For a roughly normal distribution, median +- MAD covers 50% of
samples, whereas mean +- std dev covers about 68% of samples. If I
only look at the percentage, I prefer std dev because it gives a
better estimate of the stability of the benchmark.
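
To make the percentages concrete, here is a rough sketch that measures
the coverage of both intervals on synthetic, normally distributed
samples. It is not how perf computes anything: the helper names
coverage() and mad() are made up for this example, and the samples are
generated with a fixed seed rather than taken from a real benchmark.

```python
import random
import statistics


def coverage(samples, center, spread):
    """Return the fraction of samples in [center - spread, center + spread]."""
    inside = sum(1 for x in samples if abs(x - center) <= spread)
    return inside / len(samples)


def mad(samples):
    """Median absolute deviation: median of |x - median(samples)|."""
    med = statistics.median(samples)
    return statistics.median([abs(x - med) for x in samples])


# Hypothetical data: 10,000 timings drawn from a normal distribution
# centered on 10 ms with a standard deviation of 1 ms.
rng = random.Random(42)
samples = [rng.gauss(10.0, 1.0) for _ in range(10_000)]

mean = statistics.mean(samples)
stdev = statistics.stdev(samples)
median = statistics.median(samples)
mad_value = mad(samples)

std_cov = coverage(samples, mean, stdev)
mad_cov = coverage(samples, median, mad_value)

print(f"mean +- std dev: {mean:.2f} ms +- {stdev:.2f} ms "
      f"covers {std_cov:.1%} of samples")
print(f"median +- MAD:   {median:.2f} ms +- {mad_value:.2f} ms "
      f"covers {mad_cov:.1%} of samples")
```

The MAD interval covers about half of the samples by construction,
whatever the distribution is. The std dev figure depends on the
samples being close to normal, which is the caveat raised in the
quoted message.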

Victor
_______________________________________________
Speed mailing list
Speed@python.org
https://mail.python.org/mailman/listinfo/speed
