New submission from Jake:

In the statistics module documentation, there is a note that states that 

"The mean is strongly affected by outliers and is not a robust estimator for 
central location: the mean is not necessarily a typical example of the data 
points. For more robust, although less efficient, measures of central location, 
see median() and mode()"

https://docs.python.org/3/library/statistics.html

While I appreciate the intention, this is quite misleading.  The implication is 
that the mean, median and mode are different ways to estimate a single "central 
location". In reality they are distinct quantities, although they all relate to 
a similar notion.

The sample mean is an unbiased estimator of the true mean, but it need not be 
an unbiased estimator of the true median or mode. Likewise, the sample median 
and sample mode need not be unbiased estimators of the true mean.
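
A rough sketch of the point, using a skewed distribution where the true mean 
and true median differ. The exponential distribution, the seed and the sample 
size are assumptions chosen only for illustration; they do not come from the 
documentation.

```python
import math
import random
import statistics

# Exponential distribution with rate 1.0 (an illustrative choice):
# its true mean is 1.0 and its true median is ln(2), about 0.693.
true_mean = 1.0
true_median = math.log(2)

rng = random.Random(23522)  # arbitrary fixed seed, for repeatability only
sample = [rng.expovariate(1.0) for _ in range(20000)]

sample_mean = statistics.mean(sample)
sample_median = statistics.median(sample)

# Each sample statistic tracks its own population quantity,
# not one shared "central location".
print("sample mean   vs true mean:  ", sample_mean, true_mean)
print("sample median vs true median:", sample_median, true_median)
```

With a sample this large, the sample mean should land near 1.0 and the sample 
median near ln(2), so neither is a good estimate of the other's target.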

To make this clearer, I would rephrase the note to:

"The mean is strongly affected by outliers and is not necessarily 
representative of the central tendency of the data. For cases with large 
outliers or very low sample size, see median() and mode()"
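
To illustrate the effect of a single large outlier, here is a minimal sketch 
with made-up data (the values are hypothetical and picked only to make the 
arithmetic easy to follow):

```python
import statistics

data = [1, 2, 2, 3, 4]        # hypothetical small sample
with_outlier = data + [100]   # same sample plus one large outlier

# Without the outlier: mean 2.4, median 2, mode 2.
mean_before = statistics.mean(data)
median_before = statistics.median(data)
mode_before = statistics.mode(data)

# With the outlier: the mean jumps to about 18.67,
# the median moves only to 2.5, and the mode stays at 2.
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)
mode_after = statistics.mode(with_outlier)

print(mean_before, median_before, mode_before)
print(mean_after, median_after, mode_after)
```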

Apologies if this is seen as frivolous, but statistics is hard enough to keep 
clear even when the words are used precisely.

----------
assignee: docs@python
components: Documentation
messages: 236612
nosy: Journeyman08, docs@python
priority: normal
severity: normal
status: open
title: Misleading note in Statistics module documentation
type: enhancement
versions: Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23522>
_______________________________________