On 14.03.17 19:05, Antoine Pitrou wrote:
On Tue, 14 Mar 2017 09:14:45 +0200, Serhiy Storchaka <storch...@gmail.com> wrote:
The median tells you that half of the runs will give results below the
median and the other half will give results above it. This is pretty
informative, and for some applications even more informative than the
mean.
How so? Whether a measurement is below or above the median is a
pointless piece of information in itself, because you don't know by how
much. If a sample is 0.05% below the median, it might just as well be
0.05% above for all I care. If half of the samples are 1% below the
median and half of the samples are 50% above, it's not the same thing
at all as if half of the samples are 50% below and half of the samples
are 1% above. Yet "median +/- MAD" gives the exact same results in
both cases.
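To make the scenario in the paragraph above concrete, here is a minimal sketch using only the standard library. The sample values are invented for illustration (they are not measurements from this thread), and mad() is a hypothetical helper, not part of the statistics module. Both data sets share the same median and the same MAD, while their means differ widely.

```python
import statistics

def mad(samples):
    """Median absolute deviation: median of |x - median(samples)|."""
    center = statistics.median(samples)
    return statistics.median([abs(x - center) for x in samples])

# Invented timings: one set has a long tail above the median,
# the other is its mirror image with a long tail below.
skewed_high = [99, 99, 99, 100, 150, 150, 150]
skewed_low = [50, 50, 50, 100, 101, 101, 101]

summary = {}
for name, data in (("skewed_high", skewed_high), ("skewed_low", skewed_low)):
    summary[name] = (statistics.median(data), mad(data), statistics.mean(data))
    print(name, "median=%s MAD=%s mean=%s" % summary[name])

# skewed_high: median 100, MAD 1, mean 121
# skewed_low:  median 100, MAD 1, mean 79
```

Reporting "100 +/- 1" for both sets hides the direction and the size of the tail, which is the objection being raised.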
"half of the samples are 1% below the median and half of the samples are
50% above" -- this is an unrealistic example. In real examples, samples are
distributed around some point, with some skew and a few outliers. The median
is close to the mean, but less affected by outliers. For benchmarking
purposes the absolute value is not important; what matters is the change
between the measurements of two builds. The median is more stable, and
that means we have less chance of getting a false result.
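
A small sketch of the stability claim, again with invented numbers rather than real benchmark output: a single outlier moves the mean a lot and leaves the median where it was.

```python
import statistics

# Invented timings clustered around 100, standing in for one build.
baseline = [98, 99, 100, 100, 100, 101, 102]
# The same run with one outlier, e.g. a sample disturbed by system noise.
with_outlier = baseline + [500]

mean_shift = statistics.mean(with_outlier) - statistics.mean(baseline)
median_shift = statistics.median(with_outlier) - statistics.median(baseline)

print("mean shift:  ", mean_shift)    # mean goes from 100 to 150
print("median shift:", median_shift)  # median stays at 100
```

If two builds are compared by their means, an outlier like this in one of them could look like a 50% regression; compared by their medians, the two runs look identical.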
_______________________________________________
Speed mailing list
Speed@python.org
https://mail.python.org/mailman/listinfo/speed