Re: [Speed] Median +- MAD or Mean +- std dev?

Victor Stinner Wed, 15 Mar 2017 09:34:04 -0700

2017-03-13 21:38 GMT+01:00 Antoine Pitrou <[email protected]>:
>> If the goal is to get reproducible results, Median +- MAD seems better.
>
> Getting reproducible results is only half of the goal. Getting
> meaningful (i.e. informative) results is the other half.


If the system is tuned for benchmarks (run "python3 -m perf system
tune"), you get almost no outliers on CPU-bound functions. In this
case, mean and median are close to each other, and so are stdev and MAD.

The problem arises when people don't tune their system before running
benchmarks, which is likely the most common case. In that case, the
distribution is never normal :-) It's always skewed (positive skew: a
long tail of slow outliers on the right).
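
To illustrate the effect of positive skew, here is a minimal sketch using
only the standard library. The timings are made-up values (not real
benchmark output): most runs sit near 10 ms and two slow runs form the
right tail. The mean is pulled toward the tail while the median stays
with the bulk of the runs.

```python
import statistics

# Hypothetical timings in milliseconds (invented for illustration):
# eight "normal" runs plus two slow outliers on the right.
timings = [10.0, 10.1, 9.9, 10.2, 10.0, 10.1, 9.8, 10.0, 14.5, 19.0]

mean = statistics.mean(timings)
median = statistics.median(timings)

# With a positive skew, the mean ends up above the median.
print(f"mean   = {mean:.2f} ms")
print(f"median = {median:.2f} ms")
```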

Reproducibility is a very concrete and practical issue for me.


> Additionally, while mean and std dev are generally quite well
> understood, the properties of the median absolute deviation are
> generally little known.

A friend suggested that I display sigma = 1.48 * MAD instead of
displaying MAD directly, to get a value close to the standard
deviation without outliers. I don't know if it makes sense :-)
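
For reference, a rough sketch of that suggestion, not how perf computes
anything. The factor is about 1.4826 for normally distributed data (it is
1 / 0.6745, where 0.6745 is the 75th percentile of the standard normal).
The helper name scaled_mad and the timings are hypothetical, chosen only
for this example.

```python
import statistics

def scaled_mad(values, scale=1.4826):
    """Return scale * MAD, an outlier-resistant estimate of sigma.

    The default scale assumes the bulk of the data is roughly normal.
    """
    med = statistics.median(values)
    deviations = [abs(v - med) for v in values]
    return scale * statistics.median(deviations)

# Hypothetical timings in milliseconds (invented for illustration).
clean = [10.0, 10.1, 9.9, 10.2, 10.0, 10.1, 9.8, 10.0]
noisy = clean + [14.5, 19.0]  # same runs plus two slow outliers

stdev_clean = statistics.stdev(clean)
stdev_noisy = statistics.stdev(noisy)
sigma_noisy = scaled_mad(noisy)

# The outliers inflate stdev a lot; the scaled MAD stays on the same
# order of magnitude as the stdev of the outlier-free runs.
print(f"stdev (clean)      = {stdev_clean:.3f}")
print(f"stdev (noisy)      = {stdev_noisy:.3f}")
print(f"1.4826*MAD (noisy) = {sigma_noisy:.3f}")
```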

Victor
_______________________________________________
Speed mailing list
[email protected]
https://mail.python.org/mailman/listinfo/speed
