2017-03-14 8:14 GMT+01:00 Serhiy Storchaka <storch...@gmail.com>:
> Std dev is well understood for the distribution close to normal. But when
> the distribution is too skewed or multimodal (as in your quick example)
> common assumptions (that 2/3 of samples are in the range of the std dev, 95%
> of samples are in the range of two std devs, 99% of samples are in the range
> of three std devs) are no longer valid.

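The coverage figures quoted above can be checked empirically. What follows is only a minimal sketch using the standard library, not how perf computes anything: the sample sizes, the seed, the choice of a log-normal as the "skewed" case, and the helper names (coverage, mad, summarize) are all assumptions made for illustration. It measures which fraction of samples falls within mean +- k std dev and within median +- MAD, for a near-normal sample and for a skewed one.

```python
import random
import statistics


def coverage(samples, center, spread, k=1):
    """Fraction of samples inside [center - k*spread, center + k*spread]."""
    low, high = center - k * spread, center + k * spread
    return sum(low <= x <= high for x in samples) / len(samples)


def mad(samples):
    """Unscaled median absolute deviation around the median."""
    med = statistics.median(samples)
    return statistics.median(abs(x - med) for x in samples)


def summarize(samples):
    """Coverage of mean +- 1/2 std dev and of median +- MAD."""
    mean = statistics.mean(samples)
    std = statistics.stdev(samples)
    med = statistics.median(samples)
    return {
        "std1": coverage(samples, mean, std, 1),
        "std2": coverage(samples, mean, std, 2),
        "mad": coverage(samples, med, mad(samples)),
    }


# Hypothetical data: fixed seed so that runs are reproducible.
rng = random.Random(12345)
n = 20_000
normal = [rng.gauss(10.0, 1.0) for _ in range(n)]          # close to normal
skewed = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]  # long right tail

normal_stats = summarize(normal)
skewed_stats = summarize(skewed)

for name, stats in (("normal", normal_stats), ("skewed", skewed_stats)):
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

For the near-normal sample, the one std dev coverage should land near the usual two thirds. For the log-normal sample it should come out well above that, which illustrates why the rule of thumb stops holding for skewed data, while median +- MAD stays at about 50% in both cases by construction.
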
The Python timeit module only displays the minimum. In perf, I chose to
also display the standard deviation to give an idea of the stability of
the benchmark. For example, "10 ms +- 1 ms" is quite stable, whereas
"10 ms +- 15 ms" does not seem reliable at all.

By definition, median +- MAD contains 50% of the samples, whereas
mean +- std dev contains roughly two thirds of the samples (about 68%)
when the distribution is close to normal. If I only look at these
percentages, I prefer std dev because it gives a better estimate of the
stability of the benchmark.

Victor